Senior Applied Scientist - Machine Learning Systems Engineer - Photoshop
Job description
- Inference & Serving Optimization: Design and optimize high-throughput, low-latency inference systems. Optimize model architectures to improve deployment and runtime efficiency using techniques such as distillation, pruning, quantization, and Mixture-of-Experts (MoE). Implement advanced serving strategies including batching, caching, quantization (FP8/INT8), and distributed inference strategies covering data, tensor, pipeline, expert, and hybrid parallelism, while balancing computation and communication efficiency.
- Kernel Development & System Acceleration: Write and maintain high-performance GPU kernels using Triton or CUDA to accelerate custom model layers and critical workloads. Improve GPU utilization through kernel fusion, asynchronous pipelines, and optimized scheduling strategies.
- Performance Profiling & System Optimization: Conduct deep performance analysis using tools such as PyTorch Profiler and NVIDIA Nsight to identify bottlenecks in compute, memory, and communication, and optimize end-to-end system performance across inference workloads.
- Distributed Systems & Infrastructure Collaboration: Partner with infrastructure teams to design scalable and reliable distributed serving systems across heterogeneous hardware environments (e.g., A100, H100, B200, CPU). Contribute to resource scheduling, GPU pooling, and elastic workload management.
- Cost-Aware ML Engineering: Establish and track efficiency metrics such as cost per million inferences. Build benchmarking frameworks and dashboards to guide trade-offs among quality, latency, and compute cost, enabling data-driven system and product decisions.
- Technical Leadership & Best Practices: Serve as a trusted technical advisor to research and product teams on efficiency trade-offs. Define best practices for scalable and cost-efficient ML development and mentor other engineers on performance-oriented systems design.
Requirements
Photoshop ART is seeking a Senior Machine Learning (ML) Systems & Efficiency Engineer to join our R&D team focused on delivering practical, production-ready improvements in inference performance, latency, and cost efficiency across image editing applications. This role sits at the intersection of model architecture, systems, inference runtimes, and services, with a clear mandate: deliver high-quality ML systems at substantially lower cost and higher efficiency. Individuals in this role are expected to have deep expertise in areas such as Artificial Intelligence (AI), ML systems, and computer vision. Strong preference will be given to candidates with experience in distributed inference, multimodal model profiling, and performance optimization. You will work closely with research, product, and infrastructure teams to influence model design decisions, improve GPU utilization, and build scalable, cost-aware ML systems deployed in production.
This is a hands-on, high-leverage role where a single engineer can drive outsized impact, potentially saving millions of dollars in compute costs. The ideal candidate will have a strong interest in developing practical innovations that advance Adobe products.
- Education: Master's or PhD in Computer Science, Electrical Engineering, or a related field, with a focus on machine learning systems, distributed systems, or high-performance computing.
- Distributed Inference & Serving Expertise: Hands-on experience implementing and scaling large-scale inference or serving workloads using distributed frameworks and runtime systems (e.g., Triton, vLLM, SGLang, xDiT, or similar). Experience applying inference compilation and optimization tools (e.g., TensorRT, ONNX Runtime, AOTI), including techniques such as operator fusion and graph-level optimization, with a strong understanding of system-level performance trade-offs.
- GPU & Performance Engineering Skills: Strong understanding of GPU architecture (e.g., memory hierarchy, compute throughput, communication bandwidth) and practical experience diagnosing performance bottlenecks across compute, memory, and I/O subsystems.
- Programming & Systems Development: Proficiency in Python and C++, with experience building high-performance or distributed systems. Familiarity with CUDA or Triton for performance-critical workloads is highly desirable.
- Data-Driven Engineering Mindset: Demonstrated ability to make engineering decisions based on rigorous measurement and benchmarking, focusing on improving system efficiency, scalability, and reliability in production environments.
Preferred Experience
- ML Frameworks & Tooling: Experience contributing to or maintaining performance- or efficiency-focused libraries or systems. Hands-on experience with open-source serving frameworks (e.g., vLLM, SGLang, xDiT, or similar); inference compilation tools (e.g., TensorRT, Triton, AOTI, or equivalent), including operator fusion and graph-level optimization; and GPU profiling and performance analysis tools (e.g., PyTorch Profiler, NVIDIA Nsight, CUDA tooling).
- Distributed Systems & Communication: Exposure to low-level communication libraries such as NCCL and a practical understanding of collective operations (e.g., AllReduce, AllGather) in large-scale distributed serving environments.
- Containerization & Cluster Operations: Familiarity with containerized workflows (Docker, Kubernetes) and job scheduling in headless Linux environments, including experience operating production ML workloads on shared GPU clusters.
- Model Architectures: Working knowledge of model architectures such as Transformers, multimodal models, Mixture-of-Experts (MoE), or Diffusion Transformers (DiT).
Benefits & conditions
Our compensation reflects the cost of labor across several U.S. geographic markets, and we pay differently based on the defined markets. The U.S. pay range for this position is $164,000 - $313,300 annually. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience. Your recruiter can share more about the specific salary range for the job location during the hiring process. In California, the pay range is $216,400 - $313,300. In Washington, the pay range is $204,800 - $296,600.
Equal Employment Opportunity Statement
Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other protected characteristic.
State-Specific Notices
California: Adobe will consider qualified applicants with arrest or conviction records for employment in accordance with state and local laws and fair chance ordinances.