Senior AI-Native Systems Software Engineer, TensorRT

NVIDIA Ltd.
Santa Clara, United States of America
8 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$242K

Job location

Santa Clara, United States of America

Tech stack

Artificial Intelligence
C++
Profiling
Nvidia CUDA
Computer Engineering
Python
Performance Tuning
Software Architecture
Rapid Prototyping Process
Software Engineering
Large Language Models
Multi-Agent Systems
Deep Learning
Generative AI
Build Management
Information Technology
TensorRT
Stable Diffusion
Virtual Agents
C++14
Software Performance

Job description

Are you passionate about redefining how software is built in the age of Generative AI? Join NVIDIA's TensorRT team to help lead a first-of-its-kind, AI-native initiative designed to make TensorRT the default entry point for out-of-framework inference globally. We are moving beyond traditional development cycles with a new framework built from the ground up to leverage swarms of AI agents to produce high-performance, high-quality, modern C++ software at an unprecedented scale.

If you are a systems-thinking C++ engineer who wants to help scale out an agentic development framework, stay on top of state-of-the-art deep learning breakthroughs, and improve users' experience with lightning-fast model onboarding, we want to hear from you!

What you'll be doing:

  • Architecting an AI-native framework: Help design and build a codebase and architecture that scales beyond human capacity, supporting large numbers of AI agents working in parallel to generate, test, and validate production-grade software.
  • Scaling through agentic workflows: Improve the ratio of compute-to-software output by adopting and building AI-native tools, multi-agent orchestrators, and codebase harnesses that keep humans focused on the highest-value work.
  • Rapid prototyping with SOTA models: Act as a technical scout, identifying industry and academic breakthroughs (e.g., new attention mechanisms, KV cache strategies) and dispatching AI agent swarms to prototype and integrate these capabilities into our framework.
  • Delivering a great user experience: Ensure a seamless, high-performance path to production for the latest model families (LLMs, diffusion, audio, vision, and multi-modal models).
  • Extreme performance optimization: Work at the intersection of Python orchestration and C++ engine-level optimizations to achieve major latency and throughput gains for critical customer use cases.

Requirements

  • BS, MS, or PhD in Computer Science, Computer Engineering, AI, or equivalent experience.
  • 4+ years of relevant software development experience.
  • Strong modern C++ skills: Proficiency with C++11/14/17 (or newer) and the STL, with an emphasis on clean, maintainable, performant code.
  • Deep learning familiarity: Experience with modern inference frameworks and an understanding of the architectural nuances of LLMs, Diffusion, and multi-modal models.
  • Systems thinking: Interest in how software architecture must evolve to support automated, agent-driven development and indefinitely scaling codebases.
  • End-to-end product sense: Ability to translate high-level customer needs into concrete technical requirements and user-centric solutions.
  • Pragmatic execution: Demonstrated ability to go from customer requests to production-quality software on tight timelines.
  • Collaborative mindset: Excellent communication skills and comfort working across internal organizations and with customers.

Ways to stand out from the crowd:

  • Agentic framework experience: Hands-on work with AI agent orchestrators or multi-agent coding frameworks, or experience building custom agentic coding harnesses for production software.
  • CUDA & kernel expertise: Experience with CUDA programming or exposure to kernel generation / autotuning efforts.
  • High-velocity prototyping: A track record of rapidly turning state-of-the-art papers into working prototypes in days, not weeks.
  • Performance profiling skills: Expertise in software performance analysis, profiling, and optimization (CPU and/or GPU), including using tooling to drive measurable wins.

Benefits & conditions

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, autonomous, and love a challenge, come join our team and help us build the future of high-performance AI inference technology!

#LI-Hybrid

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.
