Senior Software Engineer, Infrastructure
Job description
Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
Anthropic is seeking talented and experienced Infrastructure Engineers to join our team and support the development, scaling, and maintenance of our cutting-edge AI systems. By joining our Infrastructure team, you will have the opportunity to work on groundbreaking AI technologies and contribute to the development of frontier models, supporting Anthropic's mission to create safe and reliable AI systems that benefit humanity.
Anthropic's Infrastructure organization is the engine that powers our mission to develop AI systems that are safe, beneficial, and understandable. Every breakthrough in AI safety research and every interaction users have with Claude depends on the systems we build and operate: massive clusters for training, production infrastructure serving millions of users reliably, and developer platforms that help engineers move fast without breaking things.
This isn't typical infrastructure work. We're building at the frontier of what's possible, solving, with a high degree of security, novel scaling challenges that few organizations face, all in service of ensuring transformative AI benefits humanity. If you're energized by your technical work directly enabling some of the most important research happening today, Infrastructure at Anthropic is the best place to make a real difference.
We have multiple teams that are currently hiring. Team placement occurs after the interview process, taking into account your interests and experience alongside organizational needs. This flexible approach allows us to match talented engineers with the infrastructure teams where they'll have the greatest impact and growth potential.
- Lead the build-out of industry-leading AI clusters (thousands to hundreds of thousands of machines), partnering closely with cloud service providers on cluster build-out and required features
- Consult with different stakeholders to deeply understand infrastructure, data and compute needs, identifying potential solutions to support frontier research and product development
- Set technical strategy and oversee development of high-scale, reliable infrastructure systems
- Mentor top technical talent
- Design processes (e.g. postmortem review, incident response, on-call rotations) that help the team operate effectively and never fail the same way twice
Requirements
- Have 6+ years of relevant industry experience, 1+ year leading large scale, complex projects or teams as an engineer or tech lead
- Are obsessed with distributed systems at scale, infrastructure reliability, scalability, security, and continuous improvement
- Strong proficiency in at least one programming language (e.g., Python, Rust, Go, Java)
- Strong problem-solving skills and ability to work independently
- Have a passion for supporting internal partners like research to understand their needs
- Have excellent communication skills to build consensus with stakeholders, both internally and externally
- Possess deep knowledge of modern cloud infrastructure including Kubernetes, Infrastructure as Code, AWS, and GCP
- Security and privacy best practice expertise
- Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL
- Low-level systems experience, for example Linux kernel tuning and eBPF
- Technical expertise: quickly understanding systems design tradeoffs and keeping track of rapidly evolving software systems
Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience.
Benefits & conditions
The expected base compensation for this position is below. Our total compensation package for full-time employees includes equity and benefits, and may include incentive compensation.
Annual Salary:
£240,000 - £325,000 GBP