Senior HPC Performance Engineer

NVIDIA Ltd.
Remote, United States of America
13 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$ 288K

Job location

Remote, United States of America

Tech stack

Assembly Language
C++
Nvidia CUDA
Computer Programming
Software Debugging
Fortran
OpenMP
Software Engineering
Parallel Computation
Information Technology

Requirements

  • BS/MS or equivalent experience in Computer Science or related engineering field.

  • 8+ Years of programming experience.

  • Solid understanding of Fortran/C/C++, as well as programming techniques, especially for parallel architectures and preferably for compilers.

  • Experience with OpenACC, OpenMP, MPI, and CUDA.

  • Strong skills in performance analysis and tuning, as well as a broad understanding of parallel applications development tools and runtime environments.

  • Strong mathematical fundamentals, including linear algebra and numerical methods.

  • Understanding of performance considerations, tradeoffs, and impact.

  • Expert interpersonal skills, a logical approach to problem solving, and good time management and task prioritization skills. Excellent written and verbal communication skills, along with the ability to work in a dynamic, product-oriented team.

Ways to stand out from the crowd:

  • You have a deep understanding of machine architectures and micro-architectures.

  • Experience with debugging and porting as well as assembly language programming is a significant advantage.

  • Experience in leading and/or managing projects is a plus.

Benefits & conditions

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD.

About the company

As a member of our team in NVIDIA's NVHPC compilers & tools group, you will analyze and run High Performance Computing (HPC) applications on HPC servers and systems to gain insight into the performance characteristics of these applications. The applications you'll work with range from small synthetic benchmarks that use a single core to full applications that utilize all of the resources on distributed-memory systems with heterogeneous compute nodes including CPUs, GPUs and many-core processors. In this role you will analyze these applications and identify optimization opportunities for compiler development teams and application engineering teams.
