Senior Machine Learning Engineer

Strativ Group

Atherton, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Atherton, United States of America

Tech stack

Artificial Intelligence

Nvidia CUDA

Distributed Computing Environment

Machine Learning

AI Infrastructure

PyTorch

Kubernetes

Low Latency

Slurm

Machine Learning Operations

TensorRT

Requirements

The strongest profiles will have experience in one or more of ML systems, AI infrastructure, model serving, and distributed training or inference infrastructure.

TPU experience is highly valuable, but strong GPU systems experience is also relevant.

Useful backgrounds include work with Kubernetes, Ray, Slurm, PyTorch, JAX, CUDA, Triton, vLLM, SGLang, TensorRT, Bazel or low-latency serving systems.

The company is especially interested in exceptional early-career engineers, including standout new grads, PhD candidates or engineers with a few years of industry experience at a top AI lab, infrastructure company or high-performance startup.

You could be a fit if

You have built or optimised production ML systems.

You care about latency, throughput, reliability and cost.

You are comfortable working close to the metal, across infrastructure and model execution.

You want a high-intensity founding environment where your work directly shapes the company.

You prefer building over politics and want to work with a small team of highly technical founders.

Benefits & conditions

Compensation: highly competitive, with cash compensation potentially reaching $500k to $600k or more for exceptional candidates.

About the company

We are working with an early-stage AI systems company in Palo Alto building infrastructure for the next generation of agentic AI workloads. The company is developing a platform that combines high-performance model serving with self-improving AI systems. Their near-term product focuses on TPU and GPU serving infrastructure, while the longer-term vision is an agent cloud where models, tools and infrastructure improve together over time.

This is a founding engineering role for someone who wants to work directly with the founders on hard infrastructure problems. You will help build the systems layer for agentic workloads, with a focus on inference performance, serving reliability, cost reduction and high-velocity deployment.

What you'll work on

You will help design and build model-serving infrastructure for demanding AI workloads. You may work across distributed inference, scheduling, runtime optimisation, deployment pipelines, observability and infrastructure for self-improving agents. You will be expected to operate with high ownership, make technical decisions quickly and ship production systems in a small team environment.
