Infrastructure Engineer - Compute
Job description
Infrastructure Engineers at Nscale sit within the Operational Engineering team, which is responsible for the design, implementation, operation, and continuous improvement of the infrastructure stack that underpins all internal and customer-facing services. This includes all components below the hypervisor, with a strong focus on OpenStack, storage systems, Proxmox, and critical supporting services such as DNS, DHCP, and infrastructure automation.
This team ensures high levels of availability, scalability, automation, and security for the infrastructure layers they own.
This team acts as a 3rd/4th line escalation point for support organisations and provides subject matter expertise to pre-sales and other groups within the organisation.
What You'll Be Doing (Responsibilities)
- Designing, implementing, and operating scalable and resilient infrastructure platforms, with a strong focus on OpenStack, Proxmox, Ceph, and supporting critical services such as DNS, DHCP, and configuration management.
- Continuously improving automation for provisioning, monitoring, patching, and recovery using infrastructure-as-code and configuration management tools.
- Collaborating with internal teams to ensure infrastructure solutions meet performance, availability, and security requirements.
- Acting as a 3rd/4th line escalation point for complex infrastructure issues, and working closely with support teams to resolve problems and identify root causes.
- Contributing to infrastructure roadmap planning, including capacity management, performance tuning, and introducing new technologies.
- Supporting pre-sales and solution design efforts by providing technical expertise on infrastructure capabilities and best practices.
- Ensuring all infrastructure platforms adhere to compliance, security, and operational standards.
- Participating in on-call rotations and incident response activities for critical infrastructure services.
Requirements
- Expert-level experience with Linux systems administration.
- Strong experience of designing and building automation of both physical and virtual infrastructure using tools like Ansible.
- 5+ years of experience scripting in Python or Bash.
- Strong experience of deploying, managing, upgrading and operating large OpenStack clusters.
- Strong experience of deploying, managing, and automating Proxmox.
- Extensive experience troubleshooting Linux and the services running on it.
- Understanding of datacenter operational best practices for power, cooling, and high-density compute.
- Ability to collaborate across global engineering and operations teams.
Nice to have:
- Knowledge of Ironic
- Knowledge of Neutron/OVN/OVS
Benefits & conditions
- Highly competitive package (base + equity) with reviews every 12 months.
- Join the fastest-growing tech startup: this is your chance to push boundaries, collaborate with brilliant minds, and make your mark on cutting-edge AI.
- Expect a dynamic progression plan tailored to your ambitions. Grow by trying new things, leading, challenging the status quo, and owning your impact, always with our full support.
- Human-First Flexibility: We treat you as a human first. Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.
Join our thriving remote-first team. Geography is no barrier to impact or connection. We build seamless virtual collaboration, empowering you wherever you work.