Staff System Reliability Engineer
Job description
Your passion for engineering systems is visible to your team, to stakeholders, and in everything you and your team deliver. You take engineering film production seriously, and that will help you make decisions that drive value every day. You are:
- A master problem solver. You take deep pride in your work and view each new problem as an opportunity for success; you approach challenges creatively, but methodically.
- Comfortable with ambiguity. You face change with a cool head and persevere even if you don't have all the details; you are comfortable moving between projects and facing uncertainty because risk and change motivate you to evolve and innovate.
- Technically astute. You rapidly learn new skills, grasping complicated technical information that often leaves others lost. Whether it's product, industry, or hard tech knowledge, you absorb it quickly and are ready for more.
- Goal oriented. You develop and actualize vision, mission, and strategy into a cohesive outcome through capable team management, tactical execution, and unrelenting focus on daily progress; you connect business and technology strategies to the software you deliver and can easily articulate the value created.
What You'll Do
- Build and operate high-quality production systems
- Design systems to enable rapid development, high availability, and clear observability
- Maintain and improve the reliability of services and infrastructure
- Troubleshoot and resolve performance and reliability issues across the stack, including cloud resources
- Collaborate with engineers to ensure services are designed to be cloud-native, scalable, and easily operated
- Create self-healing infrastructure-as-code and automate everything
- Ensure security best practices are at the forefront of all your technical solutions
Requirements
We're looking for passionate engineers who love learning new technologies at a rapid pace. You should be intimately familiar with large-scale data center infrastructure as well as cloud environments, and as comfortable in the shell as you are in an IDE. We operate as lean, agile, self-sufficient teams. We value a cloud-first approach where we develop infrastructure-as-code. Automation should always do the work. You will be part of the team that provides cutting-edge filmmaking systems in the public cloud, focused on automation and infrastructure-as-code for all studios under the Disney umbrella.
- Minimum of 7 years of related work experience
- Strong written and verbal communication skills
- Comfortable working with public cloud service providers (e.g. AWS, Google, Azure)
- Strong knowledge of system management languages (e.g. Terraform, Ansible)
- Operating systems and systems management (e.g. Amazon Linux, Windows)
- Multiple scripting languages in your toolbox (e.g. Python, Go, Ruby, or Swift)
- Authentication tool-sets such as Active Directory, LDAP, Ping Identity
- Experience working with observability tools for optimal performance and 24/7 reliability (e.g. Datadog, New Relic, Grafana)
- Data center, network, and application architectures
- Systems Security (e.g. key management, encryption, vulnerability management)
- Containerization and Container PaaS offerings (e.g. Docker, Rancher, Kubernetes)
- Thorough knowledge of continuous integration tools (e.g. Jenkins, GitLab CI/CD)
Preferred Qualifications:
- Familiarity with agentic coding tools such as Claude Code, Cursor, or GitHub Copilot
- Experience working in media production environments
- Passion for exploring new technologies and constantly learning
- Exceptional analytical and problem-solving skills
- Virtual hosting technologies (e.g. VMware, KVM)
- Smart and self-driven, with a keen focus on exceptional delivery of innovative solutions
Required Education:
- Bachelor's degree in Computer Science, Information Systems, Software, Electrical or Electronics Engineering, or comparable field of study, and/or equivalent work experience