Software Development Engineer, AWS Resilience, Health Guardian

Amazon.com, Inc.
Seattle, United States of America
yesterday

Role details

Contract type
Internship / Graduate position
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$194K

Job location

Seattle, United States of America

Tech stack

Amazon Web Services (AWS)
Code Review
Computer Programming
Software Design Patterns
Distributed Systems
Software Engineering
Information Technology
Data Analytics
Build Process
Software Coding
Software Version Control

Job description

The HealthGuardian team is looking for a software engineer who is excited about building automated detection and mitigation systems that protect AWS infrastructure at scale. We detect subtle failures that evade traditional health checks and automatically remove affected resources from service before customers are impacted. Our systems run across every AWS region, and we're scaling coverage from hundreds of services to thousands. This is a hands-on position where you will design and deliver significant software components, drive cross-team technical alignment, and mentor other engineers. You need to be a strong software developer with a track record of delivering, but also excel in communication, technical leadership, and customer focus. You'll leverage generative AI tools as part of your daily workflow to accelerate design, development, and validation. This is an opportunity to join a small, high-impact team solving hard reliability problems and help shape both the technology and the direction of automated failure protection across AWS.

Key job responsibilities

Our engineers collaborate across diverse teams, projects, and environments to have a firsthand impact on AWS reliability. You'll bring a passion for distributed systems, safety engineering, and data-driven detection. You'll also:

  • Design and deliver systems that span multiple AWS teams and organizational boundaries.
  • Build detection algorithms and experimentation frameworks that validate changes at scale.
  • Architect safety mechanisms - circuit breakers, throttling, validation - that let automation scale without unintended customer impact.
  • Own ambiguous problems end-to-end, from design through operations.
  • Mentor other engineers and lead technical design reviews.
  • Use AI-assisted development tools to prototype, test, and validate faster.

About the team

We are a small team with outsized impact on AWS reliability. We operate what we build, and every engineer has direct visibility into how their code performs during real infrastructure events. We solve complex distributed systems challenges to ensure automated protection works reliably even during the failures it's designed to detect. We value operational rigor, building systems that are safe by default, and solving hard problems with simple designs.

Requirements

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience in design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • 2+ years of programming experience with at least one programming language
  • 2+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Experience in mentoring, leading, or managing more junior engineers

Benefits & conditions

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.

USA, WA, Seattle - 143,700.00 - 194,400.00 USD annually

About the company

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we're the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we're looking for talented people who want to help. You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You'll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you'll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
