Senior Technical Program Manager, DGX Cloud...

NVIDIA Ltd.
Santa Clara, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$200K - $322K

Job location

Santa Clara, United States of America

Tech stack

Artificial Intelligence
Data analysis
Cloud Computing
Cloud Engineering
Continuous Integration
Distributed Systems
Infrastructure as a Service (IaaS)
Integrated Development Environments
Machine Learning
Software Engineering
Software Systems
System Software
Systems Integration
AI Infrastructure
Cloud Platform System
Deep Learning
Core API
Functional Dependencies
Containerization
Kubernetes
Infrastructure Automation Frameworks
Information Technology
Atlassian Tools
Serverless Computing
Microservices

Job description

The DGX Cloud team is looking for a Senior Technical Program Manager (TPM) to guide complex, cross-functional projects that support NVIDIA's next-generation AI infrastructure. This position involves leading software-related initiatives across cloud platforms, infrastructure services, and distributed systems. The role focuses heavily on cloud-native software delivery, Kubernetes-based platforms, and large-scale AI workloads.

You will be responsible for managing high-impact engineering programs within a dynamic, fast-paced roadmap, aligning priorities across teams, and ensuring timely, high-quality delivery. This role requires strong technical skills, a proactive approach, and the ability to operate effectively across multiple levels of the organization. We are specifically looking for a software TPM with strong Kubernetes experience who can help drive execution across platform software and cloud infrastructure.

What You'll Be Doing:

  • Lead the complete implementation of DGX Cloud software initiatives, encompassing planning, management, delivery, and operationalization across NVIDIA's cloud infrastructure.

  • Partner with software, infrastructure, product, and platform engineering teams to align on goals, architecture milestones, deliverables, and schedules.

  • Lead initiatives involving Kubernetes-based platforms, cloud-native services, platform APIs, and distributed systems that enable AI training and inference workloads.

  • Define and implement scalable program management processes, tools, and guidelines to ensure high execution velocity and program transparency.

  • Identify cross-functional dependencies, mitigate risks, and drive resolution of complex technical and programmatic issues across the software stack.

  • Establish clear success metrics and reporting mechanisms to track progress and communicate status to senior leadership.

  • Foster a culture of collaboration and continuous improvement across engineering, product, and operations teams.

  • Develop and implement metrics for assessing program efficiency and identifying areas for improvement; collect and analyze data to support planning and data-driven decisions.

  • Report on overall program status, providing insights and recommendations to senior management.

  • Drive organizational alignment and efficiency by coordinating with cross-functional leads and streamlining processes across software development lifecycles and release execution.

Requirements

  • Postgraduate degree in Computer Science or Artificial Intelligence, or equivalent experience.

  • 12+ years of program management experience, including a proven ability to manage global projects across multiple time zones.

  • Solid knowledge of cloud-native software systems, Kubernetes, containerized applications, microservices architectures, and infrastructure-as-a-service (IaaS) platforms.

  • Practical experience working with Kubernetes is required.

  • Proven experience driving large-scale software programs in fast-paced engineering environments.

  • Strong understanding of software engineering guidelines, release procedures, system integration, and platform delivery.

  • Proven experience creatively resolving technical issues and resource conflicts.

  • You should be detail-oriented, with a proven ability to multitask in a dynamic environment with shifting priorities and changing requirements.

  • Direct experience working within a dynamic software development environment is essential.

  • Excellent communication and technical presentation skills.

  • Significant experience with large-scale Agile tools, reporting, and processes relevant to this role is required.

  • Demonstrated skill in leading and moderating successful engagements with engineering, operations, and product teams.

Ways To Stand Out From The Crowd:

  • Strong background in Machine Learning, Deep Learning, and Artificial Intelligence applications.

  • Prior experience leading programs for Kubernetes platforms, cloud-native infrastructure, platform services, or developer platforms.

  • Experience with software release management, service operationalization, and large-scale platform adoption.

  • Familiarity with observability, CI/CD, infrastructure automation, and service reliability practices in cloud environments.

  • Consistent track record of driving process improvements and measuring efficiency.

  • Familiarity with NVIDIA platforms, products, and ecosystem is a plus.

Benefits & conditions

With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the technology industry's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working with us, and our engineering teams are growing fast in some of the most impactful fields of our generation: Deep Learning, Artificial Intelligence, and Autonomous Vehicles. If you're a hardworking individual who enjoys autonomy and shares our passion for technology, we want to hear from you. We are looking for great people like you to help us accelerate the next wave of artificial intelligence.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 200,000 USD - 322,000 USD.

You will also be eligible for equity and benefits (https://www.nvidia.com/en-us/benefits/).
