Senior Site Reliability Engineer
Job description
As a Senior Site Reliability Engineer (m/f/x) at SysEleven, you design, build, and operate APIs that power the automation and reliability of our as-a-Service products, such as Database as a Service. You use Infrastructure as Code to standardize and scale our platforms, and you continuously improve CI/CD pipelines to ensure secure, resilient, and efficient delivery processes. With GitOps practices and Kubernetes orchestration, you reduce operational complexity and enable stable, predictable deployments that support our customers' critical workloads. You take ownership of reliability end to end, contribute to a culture of continuous improvement, and lead by example in solving complex technical challenges that shape the future of our services.
Your tasks
- Ensure the reliability, availability, and performance of our Database-as-a-Service and Observability-as-a-Service products
- Manage container-based applications in Kubernetes with a strong focus on security and resilience
- Lead incident response, root cause analysis, and sustainable remediation efforts
- Apply GitOps principles using Helm and Argo CD
- Develop API services and tooling in Go to deliver stable SaaS products
- Build and optimize CI/CD pipelines to improve deployment safety and system stability
- Design and manage scalable infrastructure using IaC tools (e.g., Terraform) in cloud environments
Our tech stack
- Go, Python, Bash
- OpenStack, Kubernetes, Cilium, Envoy, Kyverno
- Terraform, Crossplane, Argo CD, GitLab CI
- PostgreSQL, Grafana, Loki, Mimir
Requirements
- Several years of experience operating highly available systems in Linux and Kubernetes environments
- Strong understanding of observability concepts (monitoring, logging, tracing)
- Practical development experience in Go (knowledge of Python or Rust is a plus)
- Experience with Infrastructure-as-Code tools such as Terraform or OpenTofu
- Hands-on experience in incident management and structured root cause analysis
- Familiarity with CI systems, especially GitLab CI
- Strong problem-solving skills and good communication skills in German and English (minimum B2 level)