AI Infrastructure Engineer
Job description
- Design, build, and maintain a scalable platform for serving LLM workloads in production
- Deploy and manage containerised workloads on Kubernetes, including GPU-based infrastructure
- Implement and optimise model serving solutions (e.g. vLLM, Triton, TGI)
- Set up monitoring and observability using tools such as Prometheus and Grafana
- Build and improve CI/CD pipelines and automate infrastructure using Python and Infrastructure as Code
Requirements
You are a platform or DevOps engineer with strong experience running complex systems in production, ideally with exposure to AI/ML infrastructure and large-scale environments. You understand how to operate workloads reliably at scale and are comfortable working with modern tooling across cloud, Kubernetes, and automation. Your focus is infrastructure and platform engineering rather than data science, with a strong emphasis on reliability, performance, and operational excellence.
- Strong experience with Kubernetes in production and solid Linux systems knowledge
- Hands-on experience with GPU infrastructure (e.g. NVIDIA A100/H100) and LLM/ML model serving
- Experience with CI/CD tools (Azure DevOps, GitLab CI, Jenkins) and Python scripting
- Familiarity with monitoring tools (Prometheus, Grafana) and infrastructure automation (Terraform, Ansible)
- Experience in regulated environments or cost optimisation for high-performance workloads is a plus
- Demonstrated hands-on experience with large language models and GPU-based inference at scale is essential