Engineering Manager (ML Platform)

Zoox
Foster City, United States of America
19 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$317K

Job location

Remote
Foster City, United States of America

Tech stack

Cloud Computing
Machine Learning
Software Architecture
Large Language Models
Deep Learning
Machine Learning Operations
TensorRT
Hardware Infrastructure

Job description

  • Our growing Software Infrastructure engineering leadership team is looking for a Senior Engineering Manager, ML Platform
  • The centralized ML Platform team at Zoox enables all our Autonomy and Data Science teams to develop and deploy models across our robotaxi and cloud infrastructure, and works on cutting-edge training and inference optimization techniques
  • We are working on many interesting challenges to enable rapid experimentation and scale our multi-modal Foundation models and RL infrastructure, and ensure these models run efficiently on our vehicles, meeting our latency targets
  • You will work across all ML teams within Zoox (Perception, Prediction, Planner, Simulation, Collision Avoidance, and our Advanced Hardware Engineering group) and have the opportunity to significantly push the boundaries of how ML is practiced within Zoox
  • We build and operate the base layer of ML tools, deep learning frameworks, and inference libraries used by our applied research teams for in- and off-vehicle ML use cases
  • You will lead a team of strong software engineers and managers and act as a force multiplier for our internal customers
  • This team has many growth opportunities as we expand our robotaxi deployments and venture into new ML domains
  • Vision: Develop and execute a strategic vision for our ML training platform, ensuring scalability, reliability, and performance to support large-scale Foundation and RL models
  • Technical acumen: Lead the design, implementation, and operation of a robust and efficient ML training platform to enable the training, experimentation, validation, and monitoring of ML models
  • Hiring: Attract, hire, and inspire a diverse, world-class engineering team, fostering a culture of innovation, collaboration, and excellence
  • Partnership: Collaborate closely with cross-functional teams, including ML researchers, software engineers, data engineers, and hardware engineers, to define requirements and align on architectural decisions
  • Mentorship: Enable the engineers in the team to grow their careers by providing the right opportunities along with clear and timely feedback

Requirements

  • Experience with training frameworks such as PyTorch or JAX, using GPUs for distributed model training
  • 10+ years of relevant experience, including 4+ years managing other managers and engineers
  • Experience with GPU-accelerated inference using TensorRT, Ray Serve, or similar frameworks
  • Experience building user-friendly ML infrastructure that enables large-scale model training and high-throughput, low-latency serving use cases

Benefits & conditions

  • Paid parental leave
  • Affinity groups and sports clubs
  • Work from home opportunities
  • Health insurance
  • Our crew's health and happiness are our first priority. We offer comprehensive health and mental health support, a wellbeing program, and unlimited and flexible paid time away
  • We invest in our crew, and their families, for the long term. That includes generous family planning support, caregiver support, and strong cash compensation with great equity upside
  • We look after our crew when they're in the office too. Our famous food program is a great example, featuring a daily changing menu of local and sustainable dishes
  • There's a busy calendar of social events at Zoox, with more sports teams than you can count. And, of course, playing with robots is an important part of the job description
