Senior ML Engineer, ML Systems and Infrastructure
Job description
The work we do at Autodesk touches nearly every person on the planet. By creating software tools for making buildings, machines, and even the latest movies, we influence and empower some of the most creative people in the world to solve problems that matter.
Autodesk is seeking a Senior ML Engineer, ML Systems and Infrastructure to design and scale the systems that enable machine learning across research and product development. You will help build the infrastructure behind large-scale data pipelines, distributed training systems, evaluation frameworks, and production ML workflows that support foundation models and ML-powered product features.
This role is ideal for an engineer who is deeply interested in scalable systems and production-grade ML infrastructure. You will operate independently across multiple parts of the stack and help define strong engineering practices for reliability, performance, and maintainability.
- Design and build scalable systems for ML training, evaluation, deployment, and monitoring
- Develop and improve data pipelines that process large-scale structured and semi-structured technical datasets
- Optimize distributed workflows for performance, reliability, resource utilization, and cost efficiency
- Build platform capabilities such as experiment tracking, model versioning, checkpointing, reproducibility, and observability
- Contribute to model deployment, inference services, and production monitoring workflows
- Improve data quality, lineage, provenance, and operational transparency across ML pipelines
- Contribute to architecture and design discussions across the team
- Identify and resolve bottlenecks in data, compute, orchestration, and observability layers
- Mentor engineers through code reviews, design guidance, and knowledge sharing
- Collaborate closely with researchers, product engineers, and platform partners to turn ML workflows into robust engineering systems
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent industry experience
- At least 3 to 4 years of industry experience building and operating production software, ML systems, distributed infrastructure, or large-scale data pipelines
- Strong experience in software engineering, distributed systems, backend systems, or ML infrastructure
- Strong proficiency in Python and experience delivering production-quality systems
- Experience designing and operating scalable data or compute pipelines
- Experience with cloud platforms such as AWS, Azure, or GCP
- Familiarity with containers, CI/CD, observability, and release quality practices
- Ability to independently drive technical execution on complex work with limited oversight

- Experience building data pipelines for large-scale structured and semi-structured technical datasets
- Experience with data lineage, provenance, governance, and responsible data usage in ML systems
- Experience with distributed data processing and orchestration systems such as Ray, Airflow, Spark, or similar platforms
- Experience with model deployment, inference services, monitoring, and observability for production ML systems
- Experience building ML-ready representations for geometry, graph, hierarchical, or multimodal data
- Experience with distributed ML frameworks such as PyTorch, Lightning, DeepSpeed, FSDP, Megatron, or similar
- Familiarity with AEC workflows, design data, BIM/CAD formats, or Autodesk products

- Thinks like a systems engineer and executes like a strong software developer
- Can balance short-term delivery with long-term platform health
- Brings strong technical judgment and ownership
- Improves team effectiveness through mentoring and engineering rigor
- Enjoys solving scaling, performance, and reliability challenges