Senior Infrastructure Engineer

NexGen Cloud
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Shift work
Languages
English
Experience level
Intermediate

Job location

Remote

Tech stack

API
Artificial Intelligence
Intelligent Platform Management Interface
Bash
Border Gateway Protocol
Ubuntu (Operating System)
Cloud Computing
Configuration Management Databases
Nvidia CUDA
Continuous Integration
Data Centers
Dynamic Host Configuration Protocol
Linux
DNS
Infrastructure as a Service (IaaS)
Virtual Private Networks (VPN)
Python
Kernel-Based Virtual Machine
Machine Learning
Network Architecture
Network Monitoring
Routing
OpenStack
Cloud Services
Ansible
Prometheus
TCP/IP
Visual Effects
Virtual Local Area Networks
Virtualization Technology
Data Logging
Computer Aided Engineering (CAE)
Scripting (Bash/Python/Go/Ruby)
System Availability
Grafana
Firewalls (Computer Science)
Containerization
Kubernetes
Low Latency
Bare Metal
Terraform
Cisco networks
Microservices

Job description

Overview

NexGen Cloud is a rapidly growing IaaS company focused on providing innovative cloud solutions and infrastructure services. Our GPU cloud infrastructure solutions accelerate development in industries such as Artificial Intelligence & Machine Learning, VFX & Rendering, Data Science & IoT, and Computer Aided Engineering & MDO. We are dedicated to helping our clients navigate the complexities of the digital world and achieve success through cutting-edge, scalable, secure, and affordable solutions. At the company's heart stands a group of talented, experienced, and motivated individuals who want to make a positive change and a lasting impact on the tech world.

Position Summary

As a Senior Infrastructure Engineer, you will help design, deploy, and operate the systems that power our global GPU cloud. You'll bring deep expertise in Linux, networking, and automation to ensure our fleet is secure, scalable, and fast. This is a hands-on role ideal for engineers who love building and optimizing performance-critical infrastructure and who want to have a major impact at a rapidly scaling company.

Responsibilities

Core Infrastructure
Provision and manage Linux systems (Ubuntu-based) supporting GPU servers and backend services.
Maintain system availability, conduct root cause analysis, and implement failover strategies.

Networking
Design and manage high-speed, low-latency network infrastructure across data center environments.
Configure firewalls, BGP, VLANs, VXLANs, and VPNs to support secure and scalable multi-tenant networking.
Resolve network-related incidents impacting workloads or customer environments.

Automation & Scaling
Build infrastructure-as-code with tools like Ansible for repeatable, scalable deployments.
Automate GPU driver installs, system bootstrapping, and fleet-wide patching.
Develop CI/CD workflows for infrastructure updates and configuration validation.

Cloud & Virtualization
Support containerized workloads via Kubernetes or custom orchestration systems.
Work with both bare-metal and virtualized GPU platforms using KVM or OpenStack-based environments.
Integrate with public cloud APIs or hybrid infrastructure as needed.

Monitoring & Security
Deploy and manage monitoring stacks (e.g., Prometheus, Grafana, ELK) to track system health and capacity.
Implement hardening practices, access controls, and audit trails for infrastructure components.
Support incident response and security investigations related to infrastructure.

Qualifications and Skills

3-5 years of experience in Linux systems administration or infrastructure engineering.
Strong networking knowledge: routing, switching, TCP/IP, DNS, DHCP, VLANs, BGP, VPN.
Proficiency with scripting languages (Bash, Python) and automation tools (Ansible, Terraform).
Hands-on experience with virtualization, containerization, and systems troubleshooting.
Familiarity with monitoring and logging systems in a production environment.
Strong focus on keeping good documentation.
Knowledge of InfiniBand (IB), BlueField NICs, UFM, Open vSwitch, software-defined networking, or similar.

Nice to have

Prior experience at a GPU cloud provider, HPC environment, or similar high-performance setting.
Exposure to NVIDIA GPU technologies and tooling (e.g., NVIDIA GPU Operator, CUDA Toolkit, DCGM).
Experience with software-defined networking (SDN, OVS / OVN) and overlay networks (VXLAN, Calico).
Experience with networking products from Arista, Cisco, MikroTik, and NVIDIA / Mellanox.
Familiarity with OpenStack private cloud environments.
Familiarity with CMDB tools like NetBox.
Experience working with Internet registries (RIPE, ARIN).
Knowledge of server provisioning via PXE / iPXE and out-of-band management tools (IPMI, Redfish).

What We Offer

Competitive salary.
100% home office, working US hours with some flexibility (remote, based in Spain / UK / Bulgaria / Albania / Portugal / Poland).
Full-time permanent contract.
Opportunity to work with a diverse team of talented professionals who are passionate about technology and innovation.
A collaborative and supportive work environment that encourages professional growth and development.
Exposure to cutting-edge technologies and the opportunity to make a significant impact on the future of cloud computing.
Possibility to participate in international events.

We encourage applications from candidates of all backgrounds and experiences. Our commitment to diversity and inclusion drives our success as a company and reflects our dedication to fostering a diverse and innovative workforce.

Join the NexGen Cloud team, where innovation, collaboration, and growth are at the heart of everything we do. If you are a passionate, talented, and motivated individual looking to make a difference, apply now.
