Site Reliability Engineer
Job description
As a Site Reliability Engineer on the Application Edge team at Proton, you will help build, maintain, and scale the critical infrastructure that powers our privacy-focused services. We manage user-facing production infrastructure across both on-premises and cloud environments, ensuring the reliable, secure, and high-performance delivery of Proton's most critical traffic. Our team is at the forefront of private infrastructure technology, tackling global challenges in traffic flow management, load balancing, and application service control at scale.
We embrace SRE principles (automation, reliability, and performance) to develop and operate infrastructure that scales with Proton's growing user base. If you want to thrive in a high-impact role where infrastructure meets privacy, this is an opportunity to contribute to one of the most security-conscious and performance-critical environments in the industry.
What You Will Do:
- Design, build, and operate resilient application and traffic management systems for Proton's most critical production services.
- Develop and maintain automation tools to enhance deployment processes, configuration management, and service reliability.
- Ensure availability and performance of production services through load balancing, network optimization, and failure recovery strategies.
- Improve monitoring, alerting, and incident response to maintain high availability and rapid issue resolution.
- Optimize and scale Proton's hybrid infrastructure, spanning on-premises and cloud-based deployments.
- Collaborate with product and security teams to implement best practices for privacy and data protection at every layer of the stack.
- Document and share knowledge to foster a culture of learning and continuous improvement within the team and beyond.
Requirements:
- Bachelor's degree in Computer Science or a related field, or equivalent practical experience.
- Strong knowledge of networking, including TCP/IP, DNS, and HTTP/HTTPS traffic management.
- Experience deploying, tuning, and maintaining high-performance load balancing solutions (e.g., HAProxy, Envoy, Traefik).
- Familiarity with Kubernetes and container orchestration in production environments.
- Experience with software development and automation in one or more programming languages (e.g., Python, Rust, Go).
- Ability to troubleshoot complex production issues across network, application, and system layers.
- A strong security mindset, with an understanding of secure infrastructure practices.
Bonus Points For:
- Hands-on experience with traffic shaping, rate limiting, and DDoS mitigation strategies.
- Deep expertise in Kubernetes service optimization and management.
- Experience with Infrastructure as Code (Terraform, Ansible, or similar).
- Familiarity with privacy-focused infrastructure challenges and solutions.