Staff / Principal Machine Learning Engineer, Serving
Requirements
A year ago, reliably working agentic systems and sub-second multimodal inference at scale barely existed. Nobody has a decade of experience here. So we're not screening for a resume template; we're looking for strong people from varied backgrounds who learn fast, thrive in ambiguity, and can show us what they've built, broken, and understood.
Experience We Find Useful
You don't need all of this. But you need enough to make a case.
- Inference Optimization. Deep understanding of modern serving frameworks such as vLLM or TRT-LLM and the techniques they implement.
- Model Acceleration. Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.
- High-Performance Systems. Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs.
- Distributed Systems & Scaling. Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.
- Public work. Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.
- Full-cycle ownership. You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production.
- Background. PhD in CS, Physics, or Math, or equivalent practical experience building backend or ML systems.
Who Thrives Here
- You don't need a roadmap to start walking; you're comfortable picking a direction and building the map as you go.
- You believe engineering isn't finished until it's shipped and stable. You have a bias for impact over purely theoretical optimizations.
- You don't just ship code; you obsess over the why. You're the first to question an architecture if you think there's a better way to solve the core latency or throughput problem.
- You aren't satisfied with "the PM said so." You thrive on deep context and want to understand the fundamental logic behind every decision we make.
Benefits & conditions
The base salary range for this full-time position is £140,000 - £200,000. In addition to base pay, total compensation includes equity and benefits. Within the range, individual pay is determined by work location, level, and additional factors, including competencies, experience, and business needs. The base pay range is subject to change and may be modified in the future.