Senior Site Reliability Engineer
Job description
The Senior Site Reliability Engineer role is focused on enhancing the reliability, scalability, and operational excellence of the platform. This position involves designing and implementing systems for improved observability and incident management, leading significant projects, and collaborating across various engineering teams to build robust platforms and services. The role is critical in establishing standards and driving reliability goals to ensure the platform meets high operational standards. Additionally, this position includes mentoring junior engineers and fostering continuous innovation to maintain and improve the organization's engineering capabilities.
Responsibilities
- Design and implement systems to improve reliability, observability, traceability, and incident management
- Lead projects from discovery to execution, ensuring successful delivery
- Collaborate with AI/ML, Data, Platform, and Product engineering teams to develop advanced platforms and services
- Define and enforce production standards, processes, and tools for operational excellence
- Advocate for and implement SLIs, SLOs, and other reliability metrics across engineering teams
- Mentor junior team members to support technical growth and leadership development
- Drive continuous improvement by introducing creative solutions and challenging existing processes
Requirements
- 5+ years of experience in Production Engineering, SRE, Platform Engineering, DevOps, Backend Engineering, or similar roles
- Proficient coding skills in at least one language such as Golang, Python, Java, or TypeScript
- Experience with cloud-native technologies and Infrastructure-as-Code tools like Kubernetes, Terraform, and AWS
- Proven track record delivering medium to large-scale projects that improve platform reliability and scalability
- Strong understanding of production reliability concepts including SLIs, SLOs, and incident management
- Skilled in designing and maintaining CI/CD pipelines, deployment strategies, and release automation
- Familiarity with AI-assisted development tools such as Claude Code, Codex, or Cursor
- Excellent communication skills for collaborating with technical and non-technical teams
- Experience working in dynamic, reliability-focused production environments preferred
Benefits & conditions
- Pay range and compensation package: The US base salary range for this full-time position is $220,000 - $250,000 annually, plus equity and benefits