AI Infrastructure Engineer

BMW AG
München, Germany
2 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English

Job location

München, Germany

Tech stack

API
Amazon Web Services (AWS)
Systems Engineering
Azure
Computer Clusters
Nvidia CUDA
Computer Programming
Continuous Integration
Distributed Computing Environment
InfiniBand
Python
Open Source Technology
AI Infrastructure
Graphics Processing Unit (GPU)
PyTorch
Grafana
AI Platforms
Information Technology
Slurm
Machine Learning Operations
Hardware Infrastructure
Data Pipelines

Job description

  • You will architect cloud and hardware solutions to scale AI workloads across GPUs and accelerators, optimizing storage and networking for maximum throughput and cost control.
  • Furthermore, you will engineer and operate end-to-end AI systems, focusing on fine-tuning and scalable serving of modern AI models.
  • You will support the MLOps layer by building tools for reliable deployment, real-time monitoring, and the continuous improvement of models throughout their lifecycle.
  • You will design and build scalable data pipelines and "flywheels" that ensure high-quality data availability and enable efficient feedback loops for continuous learning.
  • Additionally, you will design robust AI services and lead their integration into production-ready platforms, ensuring stability and performance.

Benefits

  • Challenging projects with which we shape the mobility of tomorrow together.
  • Wide range of personal and professional development opportunities.
  • Attractive, fair and performance-related remuneration.
  • High level of job security.
  • Annual special payments such as vacation pay, Christmas bonus, and profit sharing.
  • Flexible working hours including six weeks annual leave and overtime compensation.
  • Discounted BMW & MINI conditions.
  • Many other benefits at bmw.jobs/benefits

Requirements

  • Bachelor's or Master's degree in Computer Science, Systems Engineering, or equivalent practical experience.
  • Hands-on experience with public cloud providers (AWS and Azure) and with managing high-performance GPU clusters (e.g. configuring NVIDIA drivers, CUDA versions, interconnects such as InfiniBand, and communication libraries such as NCCL).
  • Strong programming skills in Python, with a focus on building infrastructure-as-code environments, APIs, CI/CD pipelines and observability tools.
  • Proficiency with ML frameworks, in particular operational knowledge of PyTorch and how to optimize it for distributed training (using tools like Ray and Slurm).
  • Open source contributions or relevant certifications are a plus.

About the company

Everything starts with passion at the BMW Group. It turns a profession into a vocation. It drives us to keep reinventing mobility and to bring innovative ideas onto the roads. Enthusiasm for joint projects turns a team into a strong unit where every opinion is valued. It is only when expertise, highly professional processes and enjoyment of work are united that we can shape the future together. Whatever your heart's desire, in the BMW Group you will find a wide range of departments and disciplines across the world where you can share your professional passion with us.

We are shaping the future of domain-specific AI systems at the BMW Group by designing, training and operating new foundation models. Our team sets standards for safe and scalable AI in engineering and production.
