Senior ML Accelerator Engineer - GPU
Job description
- Designing and implementing custom operators when vendor libraries hit their limits
- Integrating those kernels deep into our ML runtime stack
- Debugging and tuning GPU performance across the AV software stack, often on hardware-in-the-loop
We partner closely with AI Solutions, AI Compilers, AI Architecture, and AI Tooling to ensure models deploy efficiently to the car while consistently meeting strict latency, throughput, and reliability targets. If you enjoy pushing GPUs to their limits and seeing your work directly impact how autonomous vehicles perceive and act in the world, this is the team for you.
What you'll be doing (Responsibilities)
- Design, implement, benchmark, and iterate on CUDA-based kernels and custom operators to squeeze every last drop of performance out of on-vehicle inference workloads.
- Build and improve tooling and infrastructure that make it easier to profile, debug, and validate CUDA kernels and accelerator-backend code across the AV stack.
- Partner with AI Solutions, Compilers, and Architecture to translate model and system requirements into concrete kernel roadmaps, priorities, and project plans.
- Collaborate with cross-functional teams (compiler, performance tooling, runtime, deployment solutions) to deliver reusable, reliable, high-performance libraries into production.
- Maintain high technology standards, methodologies, processes, and guidelines for GPU kernel development and performance engineering through code review.
- Manage relationships with internal customers to ensure our kernels and libraries meet real-world needs.
Requirements
- At least 3 years of relevant industry experience, or equivalent experience
- BS, MS, or PhD in CS or a related technical field
- Excellent GPU programming skills in CUDA, with a thorough understanding of parallel programming patterns and GPU architecture.
- Hands-on experience benchmarking, profiling, debugging, and optimizing accelerator libraries and kernels to extract optimal performance using the Nsight suite of tools or similar.
- Strong background in software architecture, library design, and design patterns.
- Strong C++ programming skills and comfort working in large codebases.
- Solid background in system performance, high-performance computing, and/or architecture-aware optimizations.
- Strong communication skills and the ability to work collaboratively within a team
- Excellent analytical and problem-solving skills
What Will Give You A Competitive Edge (Preferred Qualifications)
- Experience with tensor core programming, CUTLASS, and/or CuTe
- Experience with ML model architectures, in particular transformer-based ones
- Experience with low-latency or real-time systems
- Experience with lower levels of an accelerator software stack (i.e., drivers, runtimes, and compilers)
Benefits & conditions
Compensation: The compensation information is a good faith estimate only. It is based on what a successful applicant might be paid in accordance with applicable state laws. The compensation may not be representative for positions located outside of New York, Colorado, California, or Washington.
- The salary range for this role is $128,700 to $261,300. The actual base salary a successful candidate will be offered within this range will vary based on factors relevant to the position.
- Bonus Potential: An incentive pay program offers payouts based on company performance, job level, and individual performance.
- Benefits: GM offers a variety of health and wellbeing benefit programs. Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts and more.