HPC Engineer

CommonAI C.I.C.
Cambridge, United Kingdom
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Cambridge, United Kingdom

Tech stack

Microsoft Excel
Artificial Intelligence
Data analysis
Profiling
Nvidia CUDA
Python
NumPy
Open Source Technology
Prometheus
Graphics Processing Unit (GPU)
PyTorch
Large Language Models
Grafana
Deep Learning
Caching
Pandas
Information Technology

Job description

We are seeking a Performance Engineer to join our rapidly growing team. In this role, you will work with AI researchers and software engineers to build up a detailed understanding of how their applications are performing. You will instrument and collect granular metrics from inference and training jobs and use that information to develop sophisticated mathematical models that predict how software optimisations and architectural or hardware changes will impact system performance.

Your work will directly influence hardware purchasing decisions and architectural optimisations, both in-house and for our members, ensuring teams can run AI workloads efficiently and cost-effectively.

Requirements

This role requires a degree in computer science, mathematics or an adjacent field. You should also be able to demonstrate:

  • Experience building insightful mathematical models and performance calculators (in Excel/Google Sheets or Python) to forecast system behaviour.
  • Experience optimising code running on GPUs and/or other accelerators (e.g. with CUDA).
  • Solid understanding of computer architecture fundamentals and how LLMs and Deep Learning models execute on that hardware (inference vs. training, matrix multiplication, KV-caching, etc.).
  • Proficiency with profiling tools (NVIDIA Nsight, PyTorch Profiler) and monitoring stacks (Prometheus, Grafana).
  • Ability to work in Python for data analysis (Pandas, NumPy) and scripting.

The following are also highly valued:

  • Post-graduate degrees and research experience in relevant fields (please list your publications).
  • Deep understanding of inference serving frameworks (e.g. vLLM).
  • Background in statistical analysis.
  • Contributions to open source and/or research projects.

Benefits & conditions

  • A collaborative and supportive work environment
  • The opportunity to have a high impact in a growing organisation
  • Competitive salary package and pension
  • Professional development opportunities
  • Networking opportunities with influential people from across the tech sector and academia
  • A vibrant office environment located a few minutes' walk away from Cambridge train station

About the company

CommonAI CIC is a non-profit membership organisation, founded on a belief in collaborative engineering for the safe and responsible development of foundational AI technologies. It is a place where AI startups, enterprises large and small, public sector bodies and academia can share resources and knowledge to co-develop and grow businesses quickly.
