Algorithm Software Architecture
Job description
We are seeking a Senior Algorithm Software Architect to lead the design and delivery of GPU-accelerated, high-performance computing software. You will set architectural direction, coach engineers, and partner with product and domain experts to deliver scalable, reliable systems for large-scale compute and data workflows.
- Own the end-to-end software architecture for HPC/GPU platforms (services, libraries, data pipelines, deployment)
- Lead technical strategy: define candidate architectures, maintain decision records, run design reviews, and drive decisions with clear trade-offs among performance, reliability, and maintainability
- Design and implement GPU kernels and frameworks (e.g., CUDA, OpenCL, NCCL), optimizing for throughput, latency, and memory use
- Guide parallel and distributed computing patterns (MPI, multi-GPU scaling, heterogeneous compute)
- Establish performance engineering practices: profiling, benchmarking, and performance regression gates (Nsight Systems/Compute, nvprof)
- Collaborate across functions; convert requirements into clear technical plans, roadmaps, and measurable outcomes
- Uphold engineering excellence: coding standards, code reviews, test strategies, observability, and security considerations
- Mentor engineers; provide technical leadership on design, delivery, and career growth.
- Communicate architecture, risks, and status to executives and stakeholders with clarity and candor.
Requirements
- 10+ years in software engineering; 5+ years in software architecture for HPC or large-scale systems
- Expert in C++ (17/20) and one scripting language (Python preferred)
- GPU programming expertise (CUDA, OpenCL); strong knowledge of GPU memory hierarchies, streams, occupancy
- Hands-on with parallel/distributed stacks (MPI, NCCL, gRPC) and Linux performance tooling
- Experience with cluster orchestration (Slurm, Kubernetes), CI/CD, and containerization (Docker)
- Track record of technical leadership and exceptional communication with cross-functional teams.
Preferred / Nice-to-Have
- Multi-node, multi-GPU scaling; mixed precision; numerical methods and algorithms.
- Experience with H200/H100/A100/L40S-class accelerators and modern profiling workflows.
Doctorate (Academic) Degree and related work experience of 3 years; Master's Level Degree and related work experience of 6 years; Bachelor's Level Degree and related work experience of 8 years
Benefits & conditions
Base Pay Range: $159,500.00 - $271,200.00 Annually
Primary Location: USA-CA-Milpitas-KLA
KLA's total rewards package for employees may also include participation in performance incentive programs and eligibility for additional benefits including but not limited to: medical, dental, vision, life, and other voluntary benefits, 401(K) including company matching, employee stock purchase program (ESPP), student debt assistance, tuition reimbursement program, development and career growth opportunities and programs, financial planning benefits, wellness benefits including an employee assistance program (EAP), paid time off and paid company holidays, and family care and bonding leave.
Interns are eligible for some of the benefits listed. Our pay ranges are determined by role, level, and location. The range displayed reflects the pay for this position in the primary location identified in this posting. Actual pay depends on several factors, including state minimum wage rates, location, job-related skills, experience, and relevant education level or training. We are committed to complying with all applicable federal and state minimum wage requirements. If applicable, your recruiter can share more about the specific pay range for your preferred location during the hiring process.