Senior MLOps Engineer - LLM Infrastructure and Performance Engineering

Mystery Project

Municipality of San Sebastian, Spain

2 months ago

Role details

Contract type

Temporary contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Municipality of San Sebastian, Spain

Tech stack

Application Layers

Systems Engineering

C++

Cloud Computing

Program Optimization

Nvidia CUDA

Computer Programming

DevOps

Distributed Computing Environment

Distributed Systems

Memory Management

General-Purpose Computing on Graphics Processing Units

Monitoring of Systems

Python

Machine Learning

Open Source Technology

Performance Tuning

Workflow Management Systems

High Performance Computing

PyTorch

Large Language Models

Backend

Containerization

Kubernetes

Slurm

Machine Learning Operations

Job description

We are partnering with a fast-growing deep-tech company working at the intersection of large-scale AI systems, high-performance computing, and next-generation model optimization.

The team focuses on building production-grade infrastructure for advanced AI models, solving real-world challenges where performance, scalability, and efficiency are critical.

This is a highly technical environment, ideal for engineers who want to operate close to the limits of modern GPU systems and large-scale ML workloads.

Role mission

You will take ownership of the infrastructure layer powering large-scale LLM training and inference, ensuring models are not only state-of-the-art but also efficient, scalable, and production-ready.

This role sits at the intersection of systems engineering and machine learning, with a strong focus on performance optimization and distributed systems.

What you'll be doing

Design and scale distributed training pipelines for large language models.

Optimize GPU utilization, memory usage, and training efficiency.

Build and improve high-throughput inference systems for LLM serving.

Implement advanced techniques to reduce latency and maximize throughput.

Orchestrate workloads across cloud and on-premise environments.

Define best practices for the model lifecycle (training, deployment, monitoring).

Perform deep performance analysis across the full stack (from low-level GPU to application layer).

Drive engineering standards, mentor team members, and contribute to technical decisions.

Requirements

5+ years of experience in MLOps, DevOps, or backend/system engineering.

Proven experience working with LLM infrastructure or large-scale ML systems.

Strong expertise in the PyTorch ecosystem and GPU computing (CUDA, distributed training).

Experience with modern LLM tooling (training or inference).

Solid background in distributed systems and performance optimization.

Strong programming skills in Python (C++/Rust is a plus).

Experience deploying systems in cloud or hybrid environments.

Fluent English.

Strong plus if you have

Experience with high-performance inference systems

Experience with model optimization (quantization, distillation, compression)

Experience with HPC environments or large-scale clusters

Familiarity with orchestration tools (Ray, SLURM, etc.)

Experience with Kubernetes and containerized workloads

Contributions to open-source ML infrastructure

Experience with observability and monitoring systems

Key competencies

Systems thinking and performance mindset

Strong problem-solving ability in complex environments

Ownership and autonomy

Ability to operate in high-performance teams

Curiosity for cutting-edge AI infrastructure

Benefits & conditions

Work on LLM infrastructure at scale, not toy problems

Direct impact on real-world AI systems used in production

Highly technical environment with top-tier engineers

Ownership of critical systems and architecture decisions

Fast-paced, high-growth deep-tech setting

Relocation & lifestyle

The company will fully support you with all administrative, legal, and logistical processes to ensure a smooth relocation to San Sebastián.

Initial housing support will be provided, including temporary accommodation and assistance with your home search.

A unique opportunity to settle in one of Spain's most desirable cities to live in, offering an exceptional quality of life, with beaches, world-class gastronomy, surf, and a vibrant cultural scene.
