Solutions Architect, AI Factory Infrastructure DevOps

NVIDIA Ltd.
Santa Clara, United States of America
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$ 242K

Job location

Santa Clara, United States of America

Tech stack

Artificial Intelligence
Computing Platforms
Linux
DevOps
Kernel-Based Virtual Machine
Reliability Engineering
Coupa Supplier Portal
Prometheus
Supercomputing
Virtualization Technology
AI Infrastructure
Grafana
Deep Learning
Kubernetes
Information Technology
Slurm
Docker
VMware

Job description

  • Help architect and scale high-performance, distributed AI infrastructure on-prem or in the cloud, built with the latest NVIDIA GPU supercomputers for new and existing customers.
  • Be a technical specialist on GPU and networking products, directly supporting sales account managers to secure design wins.
  • Actively establish and nurture technical relationships with engineers, management, and architects at key customer accounts.
  • Identify customer architectures and key product requirements in the CSP/OEM AI market to efficiently implement NVIDIA's solutions.
  • Provide on-site support to solve hardware and software problems, with a focus on deep learning inference.
  • Lead the product through its entire lifecycle, from design-in to end-of-life, ensuring detailed execution and customer satisfaction.
  • Actively maintain the NVIDIA side of infrastructure components and collect findings at the customer site.
  • Offer technical and sales training to direct sales teams and channel partners.
  • The expected travel requirement is approximately 25-30%.

Requirements

  • BS or MS in Engineering, Electrical Engineering, Physics, or Computer Science (or equivalent experience).
  • 5+ years of work-related experience in high-tech IT companies with experience in NCP, CSP, site reliability, and virtualization technologies (VMware, Linux KVM).
  • 4+ years of working experience with Kubernetes, Slurm, Docker, etc.
  • Proficiency with AI tools (Claude, Codex, Perplexity, etc.), Redfish, Grafana, and Prometheus.
  • Remarkable talent for effectively handling multiple initiatives and priorities.
  • Strong time-management and social skills for coordinating complex projects.
  • Excellent written and oral communication skills in English, with the ability to collaborate effectively with both management and engineering teams.

Ways to stand out from the crowd:

  • Hands-on experience and a strong ability to solve problems within customer infrastructure.
  • Experience with Kubernetes (K8s) as an infrastructure-orchestration platform, including NVIDIA Mission Control.
  • Practical knowledge of NVIDIA systems technology, such as DGX, GB200, and HGX systems, is a huge plus.
  • Experience working with OEMs in industrial, military, and ruggedized computing spaces.

Benefits & conditions

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

About the company

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We seek a Solutions Architect to join our focused and hardworking AI Factory infrastructure deployment team. NVIDIA is at the forefront of the AI computing revolution, building innovative deep learning solutions that reshape industries worldwide. In this role, you will be instrumental in introducing our advanced GPU products to deployments across data centers and edge computing. If you enjoy system building and have a demonstrable track record of technical customer interactions, this is a prime opportunity to make a meaningful contribution.
