Senior IT Infrastructure Engineer
Job description
As an IT infrastructure engineer, you will be responsible for establishing infrastructure from the ground up, capacity planning, disaster recovery, and day-to-day operations. You will manage, configure, and monitor our IT infrastructure, including automated backups; ensure the security and availability of resources; and work closely with engineering and operations teams to provide a robust, scalable IT infrastructure that supports our AI and robotics development workflows.
- Infrastructure Architecture & Operations: Design, implement, and maintain on-premises IT infrastructure (compute, storage, networking). Perform capacity planning and develop and execute backup and disaster recovery strategies. Maintain comprehensive infrastructure documentation.
- Physical Data Center & Cloud Infrastructure: Manage and monitor on-premises IT facilities (servers, cooling, power) and hardware. Design and provision storage and compute/GPU infrastructure for high-performance workloads (ML/AI).
- Enterprise Networking: Design and implement WAN/LAN/WiFi network topology with proper segmentation and security controls (firewalls, IDS/IPS). Configure and manage enterprise networking equipment (switches, routers, load balancers).
- System Administration & Support: Deploy and manage Linux server infrastructure. Configure and deploy employee workstations (Linux, macOS, Windows) and manage IT equipment procurement. Provide technical troubleshooting and support. Manage user accounts with SSO.
- Vendor Management: Establish and manage relationships with technology vendors, negotiate contracts, and coordinate with service providers (ISPs, colocation).
Requirements
Do you have experience in macOS?
- Proven track record in building or transforming infrastructure.
- Deep expertise in enterprise networking (WAN/LAN, VLANs, routing, switching, firewalls, VPNs).
- Strong hands-on experience with server hardware assembly, configuration, and maintenance.
- Expert knowledge of storage (RAID, SAN/NAS) and backup/recovery solutions.
- Experience with Linux server administration and troubleshooting.
- Solid understanding of data center operations (power, cooling, security).
- Hands-on experience provisioning and managing GPU infrastructure.
- Scripting skills (Python, Bash) for automation.
- Experience with Infrastructure-as-Code (Terraform, Ansible).
- Strong problem-solving and troubleshooting skills for complex hardware and network issues.
- Excellent documentation and communication skills.
- Self-motivated and able to work independently in a fast-paced environment.