Senior Machine Learning Engineer

Strativ Group
Atherton, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Atherton, United States of America

Tech stack

Artificial Intelligence
Nvidia CUDA
Distributed Computing Environment
Machine Learning
AI Infrastructure
PyTorch
Kubernetes
Low Latency
Slurm
Machine Learning Operations
TensorRT

Requirements

The strongest profiles will have experience across ML systems, AI infrastructure, model serving, distributed training or inference infrastructure.

TPU experience is highly valuable, but strong GPU systems experience is also relevant.

Useful backgrounds include work with Kubernetes, Ray, Slurm, PyTorch, JAX, CUDA, Triton, vLLM, SGLang, TensorRT, Bazel or low-latency serving systems.

The company is especially interested in exceptional early-career engineers, including standout new grads, PhD candidates or engineers with a few years of industry experience at a top AI lab, infrastructure company or high-performance startup.

You could be a fit if

You have built or optimised production ML systems.

You care about latency, throughput, reliability and cost.

You are comfortable working close to the metal, across infrastructure and model execution.

You want a high-intensity founding environment where your work directly shapes the company.

You prefer building over politics and want to work with a small team of highly technical founders.

Benefits & conditions

Compensation: highly competitive, with cash compensation potentially reaching $500k to $600k+ for exceptional candidates

About the company

We are working with an early-stage AI systems company in Palo Alto building infrastructure for the next generation of agentic AI workloads. The company is developing a platform that combines high-performance model serving with self-improving AI systems. Their near-term product focuses on TPU and GPU serving infrastructure, while the longer-term vision is an agent cloud where models, tools and infrastructure improve together over time.

This is a founding engineering role for someone who wants to work directly with the founders on hard infrastructure problems. You will help build the systems layer for agentic workloads, with a focus on inference performance, serving reliability, cost reduction and high-velocity deployment.

What you'll work on

You will help design and build model-serving infrastructure for demanding AI workloads. You may work across distributed inference, scheduling, runtime optimisation, deployment pipelines, observability and infrastructure for self-improving agents. You will be expected to operate with high ownership, make technical decisions quickly and ship production systems in a small team environment.
