Site Reliability Engineer (Enterprise Platform)
Job description
Our client is seeking a Senior Site Reliability Engineer (Azure) to architect and scale a robust infrastructure foundation for a high-growth distributed systems platform. This position is critical for ensuring that the platform operates as a secure, scalable, and production-ready environment capable of supporting complex enterprise use cases and high reliability standards.
The successful candidate will take a lead role in designing infrastructure from first principles, bridging the gap between product requirements and technical execution. This is a high-impact opportunity for a seasoned engineer to build greenfield Azure environments and establish operational excellence across a global ecosystem.
- Infrastructure Design: Architect and deploy secure, scalable Azure infrastructure tailored for production-grade distributed systems
- Automation & IaC: Develop and maintain Terraform-based infrastructure as code to enable repeatable, automated deployments across various environments
- Technical Leadership: Translate ambiguous product and customer requirements into structured technical architecture and actionable execution plans
- Platform Enhancement: Build and optimize platform services, APIs, and integrations to extend core system capabilities
- Cross-Functional Collaboration: Partner with engineering, security, and product teams to deliver enterprise-ready infrastructure solutions
- Operational Excellence: Drive improvements in reliability, observability, and incident response while providing Tier 2 infrastructure support for customer deployments
Requirements
- Proven Track Record: Extensive experience designing and building production-grade systems specifically on the Azure stack
- Problem Solving: Ability to transform high-level requirements into scalable, delivered systems
- Communication: Strong technical communication skills with the ability to interface with both engineering teams and non-technical stakeholders
- Mindset: A high-ownership approach with a strong bias for action and accountability
- Azure Services: Deep knowledge of Azure networking, compute, identity, security, and storage
- Infrastructure as Code: Advanced proficiency with Terraform at production scale
- Programming: Professional experience in Go and/or Python
- Systems Engineering: Background in distributed systems, high-availability architectures, or platform engineering
- CI/CD: Experience with automation tooling for the entire infrastructure lifecycle
- Hands-on experience with Kubernetes and container orchestration
- Familiarity with observability tools such as Prometheus and Grafana
- Experience with workflow/orchestration platforms like Argo or Spacelift
Benefits & conditions
In the first 6-12 months of this role, the following milestones are expected to be achieved:
- Production Readiness: The Azure environment is established as a fully production-ready deployment setting for the platform
- Scalable Deployments: All customer deployments are verified as repeatable, scalable, and secure
- Feature Parity: Azure achieves full feature parity with all other supported cloud environments within the organization's ecosystem
Interview Process
- Recruiter & Technical Screening: Initial HR call followed by an introductory technical interview covering foundational questions
- Hiring Manager Interview: A deeper dive into experience, alignment, and role-specific expectations
- Technical Interview: A comprehensive evaluation of architectural and technical execution skills
- Final Leadership Interview: A concluding session with the VP of Engineering
Our client offers a competitive compensation package designed to reward high-impact contributors, including:
- Equity & Tokens: Participation in the long-term growth of the project
- Performance Bonuses: Annual incentives based on individual and company milestones
- Health & Retirement: Comprehensive health insurance and 401(k) plans (available for US-based employees)