Staff Machine Learning Services Engineer
Job description
You will design and implement APIs and services that integrate Adobe's proprietary models and third-party models. As a hands-on technical leader, you'll define the architectural direction, uphold high engineering standards, mentor peers, and drive the productization of a rapidly growing GenAI portfolio. You'll be joining a committed, multi-skilled team working on the foundation for some of Adobe's most groundbreaking creative tools.
What you'll do
- Design, build, and maintain scalable, production-grade GenAI services in the cloud
- Lead architectural design and guide the development of optimized ML pipelines for cloud-based inference and integration
- Bridge research and product teams throughout the lifecycle, from prototyping to deployment
- Optimize model performance, GPU utilization, and service orchestration at scale
- Work with PMs, TPMs, and engineering leads to build and implement the GenAI roadmap
- Define technical direction, conduct design reviews, and establish guidelines for reliability, scalability, and maintainability
- Own and improve CI/CD systems and monitoring pipelines for ML services
- Mentor engineers across ML and backend disciplines, encouraging a culture of technical excellence
- Provide tier-1 production support, ensuring service SLAs and customer satisfaction
Requirements
- MS or PhD in Computer Science or a related field, or equivalent industry experience
- 8+ years of experience architecting cloud services and infrastructure for large-scale use, reliability, and performance
- 3+ years of experience with GenAI workloads, including fine-tuning and inference at scale
- Strong foundation in Transformer architectures, diffusion models, CLIP, VAEs, and encoder/decoder models
- Deep understanding of model serving, orchestration, and GPU resource management in distributed environments
- Knowledge of model optimization techniques (quantization, pruning, distillation, etc.)
- Expert-level coding and debugging skills in Python; familiarity with JavaScript/TypeScript is a plus
- Hands-on experience with Kubernetes, Docker, and MLOps platforms (e.g., MLflow, KServe, Triton)
- Familiarity with CUDA, PyTorch AOTInductor, and frameworks such as PyTorch, TensorFlow, or ONNX Runtime
- Proven track record of leading complex, high-stakes technical initiatives across teams
- Strong problem-solving skills and a mindset for developing well-tested, production-quality software
- Excellent communication and cross-functional collaboration skills
Benefits & conditions
Our compensation reflects the cost of labor across several U.S. geographic markets, and we pay differently based on those defined markets. The U.S. pay range for this position is $172,500 - $306,625 annually. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience. Your recruiter can share more about the specific salary range for the job location during the hiring process.
In California, the pay range for this position is $211,800 - $306,625.
In Washington, the pay range for this position is $201,000 - $291,150.
At Adobe, starting salaries for sales roles are expressed as total target compensation (TTC = base + commission), and short-term incentives take the form of sales commission plans. Starting salaries for non-sales roles are expressed as base salary, and short-term incentives take the form of the Annual Incentive Plan (AIP).
In addition, certain roles may be eligible for long-term incentives in the form of a new hire equity award.