AI Performance Engineer - GPU

Advanced Micro Devices

Amsterdam, Netherlands

27 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Amsterdam, Netherlands

Tech stack

Artificial Intelligence

Assembly Language

C++

Code Generation

Profiling

NVIDIA CUDA

Computer Programming

Software Debugging

Machine Learning

OpenCL

Performance Tuning

Software Engineering

System Programming

Graphics Processing Unit (GPU)

PyTorch

GPU Programming

Kubernetes

Information Technology

Slurm

Docker

Job description

AI Performance Engineers focus on pushing machine learning workloads to peak hardware efficiency, with emphasis on low-level optimization and kernel performance. As an AI Performance Engineer, you will:

Analyze and explore recent ML models and workloads, understand their compute, memory, and instruction-level behavior, and optimize them on AMD GPUs for both inference and training
Design, implement, and tune GPU kernels at a low level, including C++, intrinsics, and hand-written GPU assembly
Perform deep profiling and bottleneck analysis across compute, memory, and execution pipelines
Optimize instruction scheduling, memory access patterns, and occupancy to achieve the final 5% of performance uplift
Work closely with hardware, compiler, and software teams to drive performance improvements across the full stack
The position is part of an AI and GPU performance optimization workstream at AMD
Collaborate with AI developers, compiler engineers, and hardware architects to understand performance limits and opportunities
Work with multiple teams located locally in Finland and the UK, as well as internationally
Communicate performance bottlenecks, solutions, and optimization strategies clearly across teams
Benchmark, analyze, and optimize performance of key machine learning workloads on single- and multi-GPU AMD systems
Design, implement, and tune high-performance GPU kernels for tensor operations such as matrix multiplication, attention, and convolutions
Apply instruction-level and memory-level optimizations to achieve measurable performance improvements
Deliver high-quality, well-documented, production-ready performance-critical code
Low-level performance optimization
GPU programming and hardware architecture
Instruction scheduling, memory hierarchies, and execution pipelines
Extracting the maximum achievable performance from hardware under real-world constraints
Passionate about getting the best out of the hardware and motivated by the challenge of delivering that extra 5% performance uplift

Requirements

Strong understanding of GPU architectures and low-level optimization techniques, including memory hierarchy, instruction scheduling, and performance tradeoffs
GPU software development using HIP, CUDA, or OpenCL
Strong C++ skills for performance-critical systems programming
Experience with profiling, debugging, benchmarking, and performance analysis tools
Experience in HPC or performance-critical systems
Familiarity with modern ML frameworks (e.g., PyTorch, MIOpen)
Familiarity with tile programming and related frameworks (Triton, Cutlass, etc.)
Strong written and spoken English
Experience with Docker, Singularity, Slurm, or Kubernetes is a plus
BSc, MSc, PhD, or equivalent experience in Computer Science, Engineering, Physics, or a related technical field

How to stand out from the crowd:

Experience with GPU assembly programming and/or compiler backends
Experience with CPU assembly (x86 or Arm) and microarchitectural optimization
Background in compilers, code generation, or instruction-level optimization

About the company

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.