Software Engineer II - Site Reliability Engineering

Electronic Arts Inc
Austin, United States of America
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Tech stack

Java
Amazon Web Services (AWS)
Cloud Computing
Fault Tolerance
Network Layer
Data Logging
System Availability
Delivery Pipeline
Spring Boot
Reliability of Systems
Kubernetes
Docker
Microservices

Job description

The IT Player Experience Engineering team builds and operates platforms that support millions of players worldwide. As a Software Engineer II - SRE, you will focus on improving the reliability, scalability, and operational excellence of Java-based, microservices-driven systems that power player experiences. This role is critical to delivering FY26 goals by embedding SRE best practices across design, development, and operations.

  • Drive SRE initiatives to improve system availability, performance, and resilience across Java microservices
  • Define and track SLOs, SLIs, and error budgets for critical services
  • Lead incident response, root cause analysis (RCA), and postmortems to prevent recurrence
  • Automate operational tasks to reduce toil and improve system reliability

Observability

  • Design and implement monitoring, alerting, and logging strategies using industry-standard tools
  • Build end-to-end observability with metrics, distributed tracing, and logs for microservices
  • Tune alerts to reduce noise and ensure actionable signal during incidents

Engineering & Platform Enablement

  • Collaborate with development teams to build reliability into Java/Spring Boot services from design through production
  • Review service architecture for scalability, fault tolerance, and operability
  • Improve CI/CD pipelines with reliability, testing, and deployment safety checks
  • Support cloud-native deployments on AWS and containerized platforms (Docker/Kubernetes)

Best Practices & Enablement

  • Champion SRE best practices including automation, capacity planning, and resiliency testing
  • Contribute to runbooks, operational documentation, and knowledge sharing
  • Partner with engineers, product managers, and leadership to balance feature velocity with system reliability

Requirements

Core Skills

  • Strong experience with Java, Spring Boot, and microservices architectures
  • Hands-on experience with monitoring, alerting, logging, and distributed tracing
  • Experience supporting production systems with high availability and scale requirements

Cloud & Infrastructure

  • Experience with AWS services and cloud-native architectures
  • Familiarity with Docker, Kubernetes, and CI/CD pipelines

Reliability Mindset

  • Experience with incident management, on-call rotations, and post-incident analysis
  • Strong troubleshooting skills across application, infrastructure, and network layers

Collaboration

  • Ability to work closely with application engineers to influence design for reliability
  • Clear communication skills to explain operational risks and trade-offs

Benefits & conditions

We're proud to have an extensive portfolio of games and experiences, locations around the world, and opportunities across EA. We value adaptability, resilience, creativity, and curiosity. From leadership that brings out your potential, to creating space for learning and experimenting, we empower you to do great work and pursue opportunities for growth.

We adopt a holistic approach to our benefits programs, emphasizing physical, emotional, financial, career, and community wellness to support a balanced life. Our packages are tailored to meet local needs and may include healthcare coverage, mental well-being support, retirement savings, paid time off, family leaves, complimentary games, and more. We nurture environments where our teams can always bring their best to what they do.

About the company

Electronic Arts creates next-level entertainment experiences that inspire players and fans around the world. Here, everyone is part of the story. Part of a community that connects across the globe. A place where creativity thrives, new perspectives are invited, and ideas matter. A team where everyone makes play happen.
