Staff Platform/DevOps Engineer
Job description
We're looking for a Staff Platform/DevOps Engineer to join forces with our DevOps Lead and push our platform's delivery, automation, and observability across diverse deployment environments to the next level. We build B2B software for industrial customers. Reliability, traceability, and maintainability matter across the full lifecycle.
Activities
You will work closely with engineering to design and operate the pipelines, runtimes, and tooling that connect development to production. You will take ownership of the CI/CD architecture, defining how code moves from commit to production safely and consistently. That includes managing the pipelines, shaping how and when to apply GitOps workflows and principles, and setting clear guardrails for rollback and change control.
You will also lead the observability program: operating and improving our current stack, defining what good telemetry means, and ensuring every service has actionable visibility and meaningful SLOs. As the architecture evolves, you'll help evaluate and introduce modern tools or practices that improve delivery speed, reliability, or traceability, always with a clear understanding of operational constraints and trade-offs.
What you will be doing
- Build and operate CI/CD pipelines using GitHub Actions, Terraform, and Ansible
- Integrate automation, validation, and observability across the full software lifecycle
- Define and apply GitOps standards for repository topology, environment promotion, and secrets and drift handling
- Operate and enhance the runtime stack; choose and implement rollout strategies such as blue/green, canary, and feature flags
- Own Fluentd, Loki, and Grafana, and collaborate with engineering teams to embed logging and metrics: structured logs, meaningful metrics, tuned alerting, and actionable SLOs
- Drive platform evolution by evaluating and adopting modern DevOps tools and practices
- Improve secrets handling, image hygiene/SBOMs, backup and restore procedures, and baseline hardening
- Troubleshoot and resolve issues with well-tested updates, and introduce new tools only when they meet real operational needs
Requirements
- Solid experience with CI/CD tooling and Git-based workflows
- Strong coding skills in Python and Bash (PowerShell is a plus)
- Strong understanding of GitOps concepts
- Good understanding of Kubernetes, containers, and Linux
- Hands-on experience with Fluentd, Loki, and Grafana
- Proven troubleshooting and problem-solving skills
- Communication and collaboration skills
Nice To Haves
- Experience with air-gapped delivery (registry/Helm/OCI mirrors, artifact signing)
- Cloud exposure (especially Azure or GCP) and hybrid networking/VPN/S2S
- Experience with modern or emerging infra/deployment tools (Pulumi, OpenTelemetry, etc.)
- Postgres HA/backup tooling and restore drills
- Exposure to MLOps, data-heavy pipelines, or infra for AI systems
- German skills are a plus