Senior AI Frameworks Engineer
NVIDIA Ltd.
Santa Clara, United States of America
yesterday
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: $306K
Job location: Santa Clara, United States of America
Tech stack
API
Artificial Intelligence
C++
Nvidia CUDA
Python
Graphics Processing Unit (GPU)
Parallel Computation
Information Technology
Requirements
- MS or PhD degree in Computer Science, Electrical Engineering, or related field (or equivalent experience).
- At least 3 years of relevant experience.
- Strong proficiency in Python and C++, specifically regarding the design of Python extensions and foreign function interfaces (FFI).
- Experience in library or framework development, with a focus on creating intuitive APIs for complex technical systems.
- Deep understanding of the Python ecosystem's delivery stack, including building, testing, and distributing high-performance compiled extensions.
Ways to stand out from the crowd:
- Active maintainer status or significant contributions to high-performance open-source libraries, AI frameworks or compiler projects (LLVM/MLIR).
- Understanding of compiler foundations, such as intermediate representations (IR), lowering passes, or AST manipulation.
- Experience with GPU architecture and parallel programming models (CUDA).
Benefits & conditions
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.
About the company
We are now looking for a Senior AI Frameworks Engineer (C++/Python)! NVIDIA's high-performance computing platforms are powering the AI revolution across many applications and industries. Within our software stack, CUTLASS stands out as a popular open-source ecosystem dedicated to high-performance math primitives. Since 2017, it has provided the community with C++ template abstractions to implement custom GEMM and related computations efficiently on NVIDIA GPUs.
We are building the next frontier of this ecosystem: Pythonic CUTLASS (CUTLASS DSL). This initiative aims to bring "speed-of-light" performance and powerful abstractions of our stack directly into the Python environment. Join the CUTLASS team and help bridge the gap between low-level hardware primitives and high-level developer productivity. If you are passionate about building elegant, high-performance DSLs and want to empower the next generation of AI researchers and engineers with better tools, apply today!
What you'll be doing:
As a core contributor to the CUTLASS project, you will use your expertise in systems programming and API design to create a world-class developer experience for GPU programming and kernel delivery.
* Design APIs that prioritize user productivity, providing a "native" feel for developers accustomed to modern scientific computing and deep learning frameworks.
* Develop robust compilation infrastructure, including AST transformations and JIT-friendly execution, to lower Pythonic descriptions into high-performance GPU machine code.
* Optimize developer experience by creating debugging tools, profiler integrations, and validation methodologies that make writing and using kernels easy.
* Build production-grade delivery infrastructure for the open-source community, managing everything from package distribution (wheels, conda) to user-facing documentation and testing.
NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're a creative and autonomous engineer, we want to hear from you!