Site Reliability Engineer (SRE) - Compute (M/F)
Job description
Our growth is driving us to strengthen our Compute team, which builds and standardizes the infrastructure hosting Scaleway products and improves its reliability.
Your mission will be to onboard product teams and automate infrastructure deployment, ensuring continuous improvement and maximum observability of our sovereign cloud platform.
YOUR FUTURE TEAM
We work in a collaborative and international environment where the diversity of Scalers, combined with a spirit of sharing, helps bring new projects to life every day, advancing our ambitions together.
You will be part of a team of 6 people, including your manager. The team is currently in a stabilization phase: it is onboarding multiple products and deploying new AZs (Availability Zones), with a strong focus on uniform practices and operational excellence. We use a mix of Scrum and Kanban, and we rotate product referents to limit "bus factor" risk.
Manager information: Kevin de Poulpiquet
YOUR DAILY ROUTINE
- Build and standardize the infrastructure hosting Scaleway's product catalog.
- Manage the onboarding of various product teams onto the platform.
- Implement and optimize observability stacks (monitoring, alerting).
- Automate infrastructure using GitOps processes and Infrastructure as Code.
- Deploy product stacks across new geographic regions.
- Handle operational maintenance (MCO) and participate in a weekly on-call rotation (approx. 1 week per month, including weekends).
- Improve CI/CD pipelines and technical documentation.
- Collaborate closely with product teams to bridge the gap between development and infrastructure.
- Ensure security compliance, including secret management (e.g., Vault).
- Drive continuous improvement of existing systems and deployment workflows.
- You possess a natural curiosity for deconstructing complex system failures and genuinely enjoy the 'detective work' involved in deep-dive troubleshooting.
- Your strength lies in your cross-functional mindset, allowing you to seamlessly bridge the gap between low-level system administration and modern cloud-native orchestration.
RECRUITMENT PROCESS
- Interview with your future manager, Kevin de Poulpiquet, to discuss the role and your expertise (45 min).
- Technical interview to assess your skills (1h).
- Final visit to our offices to meet your future colleagues.
Requirements
HARD SKILLS
- Senior experience in Systems Administration (ideally 7+ years of experience).
- Mastery of Kubernetes (K8s) and GitOps workflows (ArgoCD, FluxCD).
- Strong automation skills (Ansible, Salt, GitLab).
- Proficiency in observability tools (Grafana, Thanos, Prometheus).
- Knowledge of networking and security.
- Experience with Infrastructure as Code.
- Familiarity with on-call rotations and SRE objectives (SLA/SLO concepts).
SOFT SKILLS
- Active Listening: Understanding the needs of product teams.
- Pragmatism: Finding efficient and realistic solutions.
- Precision: High attention to detail in infrastructure management.
- Open-mindedness: Essential for team synergy and collaborative work.
- Continuous Improvement Mindset: Always looking for ways to optimize.