Director, Rack Scale Software Architecture

NVIDIA Ltd.
Santa Clara, United States of America
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$ 320K

Job location

Santa Clara, United States of America

Tech stack

Artificial Intelligence
Computer Engineering
Computer Graphics
Data Centers
Microprocessors
Ethernet
Firmware
Field-Programmable Gate Array (FPGA)
InfiniBand
Node.js
Software Architecture
Cloud Services
Software Requirements Analysis
Systems Architecture
System Software
Graphics Processing Unit (GPU)
Computer Network Technologies
Information Technology

Job description

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA has a rapidly expanding ecosystem of data center platform and node designs, from single-node HGX/DGX systems all the way up to large multi-node NVLink domain rack architectures. These designs have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses, each bringing together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're searching for a highly technical, motivated manager to lead the team responsible for rack-scale system software architecture, spanning firmware, kernel drivers, operating systems, networking, fabrics, and the associated user-mode drivers and manageability software. You will work with component leads internally and engage with industry-leading hyperscalers and cloud service providers on taking these products to market.

What you'll be doing:

  • Drive the end-to-end software architecture for NVIDIA's rack-scale products.

  • Maintain a deep understanding of the product portfolio and roadmap; translate forward-looking plans into clear, formal software requirements that anchor execution across the organization.

  • Ensure high-quality, reliable software, serving as a trusted architectural partner to teams requiring guidance or oversight.

  • Work directly with major customers to understand their requirements and align their roadmaps with NVIDIA's roadmap.

  • Present the team's vision to senior NVIDIA and external leaders, using strong communication skills.

Requirements

  • BS or MS degree in Computer Engineering, Computer Science, or a related field, or equivalent experience.

  • 15+ years of overall experience in system architecture and design, with 8+ years of proven management experience.

  • Deep experience in designing architecture for scalable and performant server systems, particularly at the SW/HW interface.

  • Proven leadership skills and strong ownership of past projects involving a large-scale, sophisticated code base.

  • Previous experience working with complex system software for accelerators such as GPUs, DPUs, or FPGAs.

  • Strong managerial, problem-solving, and critical-thinking skills.

  • Comfortable operating in highly matrixed organizations while holding a leadership position.

  • Known for strong interpersonal, verbal, and written communication skills.

Ways to stand out from the crowd:

  • Knowledge of large-scale cloud and cluster-level deployment and management systems. Experience designing robust, resilient, and performant scale-up fabrics.

  • Demonstrated track record of leading data center products across the entire lifecycle, spanning inception, pre-silicon development, post-silicon bring-up, manufacturing, and deployment.

  • Strong understanding of networking technologies and protocols (e.g., Ethernet, InfiniBand). Familiarity with CXL, UCIe, and other chip-to-chip (C2C) technology architectures. Knowledge of storage and networking technologies.

We are widely considered to be one of the technology world's most desirable employers, and as a result, some of the most forward-thinking and hardworking people in the world work for us. If you're clever, creative, and driven, we'd love to have you join the team.

Benefits & conditions

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 320,000 USD - 488,750 USD.
