Senior CPU Performance Engineer
Job description
You will work on silicon platforms and real devices, analysing performance across complex and often non-deterministic environments. The role requires a solid foundation in CPU fundamentals, along with a strong interest in understanding how system-level behaviour maps to underlying microarchitecture.
Working alongside experienced architects and performance engineers, you will contribute to investigations, build supporting evidence, and help guide CPU design decisions.
This is not a CPU design or microarchitecture role. Instead, you will influence CPU architecture by identifying bottlenecks, validating hypotheses (where possible) using pre-silicon platforms (e.g. FPGA, emulation, performance models), and driving data-backed recommendations to CPU teams.
- Analyse performance on real devices using real-world workloads
- Support investigation of system-level performance issues, identifying CPU-related factors
- Operate effectively in low-observability environments (e.g. no waveforms, partial counters, noisy systems)
CPU-Focused Analysis
- Use PMU counters and profiling tools to understand CPU behaviour
- Contribute to identifying performance bottlenecks (e.g. memory, front-end/back-end, execution)
- Build understanding of how microarchitectural features impact real-world performance
Pre-Silicon Support
- Assist with experiments on emulation or FPGA platforms to validate hypotheses
- Reproduce silicon-observed behaviours in controlled environments to isolate CPU effects
- Where applicable, compare silicon vs model vs emulation to build confidence in findings
Influencing CPU Architecture
- Contribute analysis and data to support discussions with CPU architecture teams
- Highlight gaps between real-world workloads and existing design assumptions
- Document findings and communicate insights clearly to stakeholders
Methodology & Tooling
- Use and progressively improve performance analysis workflows and tools
- Contribute to automation and data analysis where relevant
Requirements
- Strong understanding of CPU microarchitecture fundamentals (pipelines, caches, speculation, memory hierarchy)
- Experience in performance analysis (profiling, benchmarking, debugging)
- Familiarity with PMU-based profiling tools or equivalent performance counters
- Basic programming or scripting skills (Python, C/C++, or similar)
- Strong analytical and problem-solving skills
Preferred Experience
- Exposure to CPU performance analysis or benchmarking
- Familiarity with low-level programming (C / Assembly)
- Basic understanding of RTL or hardware design concepts
- Exposure to FPGA or emulation platforms
- Experience working on real devices (mobile, embedded, or client systems)
- Awareness of system-level performance interactions (OS, scheduling, memory)
- Familiarity with Arm architecture
- Interested in understanding how real-world workloads interact with CPU architecture
- Comfortable working with noisy, incomplete, or ambiguous data
- Eager to learn from and collaborate with experienced engineers
- Clear communicator, able to present analysis with appropriate guidance