AI Factory Deployment Engineer

NVIDIA Ltd.
Santa Clara, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$201K

Job location

Santa Clara, United States of America

Tech stack

PHP
Artificial Intelligence
Automation of Tests
Communications Protocols
Computer Networks
Computer Graphics
Data Centers
Data Integration
Data Center Infrastructure Management (DCIM)
Document Management Systems
Supervisory Control and Data Acquisition (SCADA)
Python
Machine Learning
Modbus
Message Queuing Telemetry Transport (MQTT)
Package Development Process
Standard SQL
OPC Unified Architecture
Software Engineering
Transmission Control Protocol (TCP)
Data Strategy
Information Technology

Job description

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for 30 years. It's an outstanding legacy of innovation that's motivated by extraordinary technology and outstanding people. Today, we're tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, encouraging environment where everyone is inspired to do their best work. Come join our team and see how you can make a lasting impact on the world.

NVIDIA's AI Factories (e.g. data centers) host ground-breaking products ranging from high-performance computing to machine learning applications for autonomous vehicles and healthcare. At the heart of our AI Factory is the ability to engineer mechanical and electrical designs closely coupled with NVIDIA's industry-leading GPU and DSX codesigns. We are seeking an AI Factory Controls and Monitoring Engineer to support control system deployments.

What you'll be doing:

  • Collaborate with product owners and technical leads to identify and collect requirements for our next-generation data centers.
  • Support the global design standards for the data center controls and monitoring (DCCM) system, collaborating with internal teams to develop an execution strategy and life cycle management plan.
  • Adapt control system reference designs and standards to AI Factory deployments.
  • Serve as a key collaborator for control system technical evaluation, from site selection due diligence through site turnover to operations, including contractor selection, bid package development, MEP and control system composition review, RFI response, submittal and as-built reviews, and commissioning support.
  • Provide technical support to DC operations controls engineers.
  • Support IT-to-OT data integration, enabling digital twins, agentic AI onboarding, coordinated leak detection, and other applications.
  • Support standardization in controls engineering quality approval, process control, product evaluation, vendor proposals, product reliability evaluation, automated testing and software *
  • Collaborate with cross-functional teams to modify control settings and alarm thresholds to manage the data center space.

Requirements

  • Excellent interpersonal and leadership skills: success depends on building rapport and credibility with multiple stakeholders across the organization
  • BS in Engineering, CS or equivalent experience
  • 8+ years of experience with control system design, development and management on industrial or mission critical systems
  • Working knowledge of mechanical, electrical, life safety, and IT Networking systems associated with critical environments
  • Understanding of the OPC UA and Modbus (TCP and RTU) protocols and how to integrate systems using them.
  • Troubleshooting and problem-solving skills, with experience driving root cause analysis on complex projects under pressure
  • Experience with equipment commissioning, testing, or related activities
  • Experience with startup and configuration of Programmable Logic Controllers (PLCs) and SCADA workstations.
  • Strong understanding of Sequence of Operations (SOO) for mechanical system control. Ability to create and iterate on SOOs.

Ways to stand out from the crowd:

  • Experience with MQTT communication protocol, higher level data strategies, and integration to IT systems
  • Strong understanding of data center commissioning including Level 1 through Integrated Systems Testing
  • Strong understanding of document control and change control processes
  • Working knowledge of and experience with Data Center Infrastructure Management (DCIM), EPMS, Ignition SCADA software development and deployment, and programming languages such as Python, PHP, and SQL
  • Working knowledge of data center power and cooling solutions, including advanced systems such as liquid cooling

Benefits & conditions

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 253,000 USD for Level 4, and 184,000 USD - 287,500 USD for Level 5.

About the company

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative and autonomous, we want to hear from you!
