AI/HPC Data Center Lab Engineer

Advanced Micro Devices, Inc.
Austin, United States of America
6 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English

Job location

Austin, United States of America

Tech stack

Microsoft Windows
Artificial Intelligence
Automation of Tests
Intelligent Platform Management Interface
Bash
BIOS
C++
Computer Programming
Computer Engineering
Continuous Integration
Data Centers
Software Debugging
Linux
Video Cards
Perl
Firmware
Issue Tracking Systems
Python
LabView
Networking Basics
PCI Express
Software Architecture
Ansible
Software Engineering
Tcl (Programming Language)
Test Execution Engine
Scripting (Bash/Python/Go/Ruby)
Computer Equipment
Information Technology
Jenkins

Job description

The Data Center Platform Engineering Group (DPEG) organization is looking for skilled individuals who can contribute to the bring-up, support, and debug of complex system/SoC problems. Individuals will be part of a growing lab team and will be required to perform hands-on experiments related to issue management as well as board-level rework. The position involves a wide range of activities, including deploying pre-production platforms and test stations to enable silicon bring-up and validation, and contributing to the triage, debug, and rework required to resolve complicated system-level issues.

THE PERSON:

The ideal candidate is a team player with strong communication skills who is able to work in a dynamic environment. Must be a self-starter capable of working with minimal supervision and driving tasks to completion.

  • Set up hardware to facilitate remote/local test execution and user-defined workloads in the DPEG validation lab
  • Own initial troubleshooting and debug of a wide variety of system, firmware, or software issues encountered while maintaining the integrity of a large number of development systems.
  • Reproduce issues and validate fixes identified by DCGPU platform leads and chief engineers.
  • Provide logs and statistics that will help in further debug of issues.
  • Integrate automated testing into a CI/CD environment (e.g. Jenkins, Ansible)
  • Work within a managed ticketing system and communicate clearly on steps/activities

Requirements

  • Proven test bench setup experience with expertise in embedded systems
  • Able to read and interpret board schematics
  • Software programming and scripting experience (Python, Bash, C/C++) on Windows and Linux operating systems
  • PC/server environment H/W and S/W setup and administration.
  • Basic networking skills
  • Comfortable working in different operating system environments, including Windows and Linux
  • Excellent soldering skills
  • Experience with HW and SW fault detection and management
  • Experience with power supply monitoring and sequencing
  • Proficient in the fundamentals of power electronics with special emphasis on multiphase power converters
  • Demonstrated ability to work with oscilloscopes, multimeters, current probes, electronic loads, and protocol analyzers, and to reason intuitively to explain and correct unexpected results
  • Proficient at documenting experimental results in a structured manner for ease of reference
  • Knowledge of computer hardware (CPU/APU, graphics cards, memory, bus logic, and display technologies) and software architecture (driver/BIOS)
  • Ability to set up hardware and build computer systems
  • Logic-based decision making in triage/debug
  • Test automation expertise using Perl/Tcl/Python/LabView
  • Bonus skills: Familiarity with low-speed industry-standard protocols such as I2C, Redfish, IPMI, SPI, I3C, SVI2, and SVI3, and with high-speed industry-standard protocols such as PCIe Gen 5

ACADEMIC CREDENTIALS:

  • Bachelor's degree in Electrical or Computer Engineering preferred

About the company

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

AMD may use Artificial Intelligence to help screen, assess, or select applicants for this position. AMD's "Responsible AI Policy" is available here.

This posting is for an existing vacancy.
