Platform Engineer wanted in Munich
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Intermediate
Compensation: € 150K
Job location
Tech stack
Amazon Web Services (AWS)
Continuous Integration
Data as a Service
DevOps
GitHub
Identity and Access Management
Performance Tuning
Reliability Engineering
Prometheus
Management of Software Versions
Data Logging
Grafana
MTTR
Kubernetes
Terraform
Static Application Security Testing
Dynamic Application Security Testing
Job description
- Design and maintain CI/CD pipelines on GitHub that are fast, repeatable, and developer-friendly (clear feedback loops, safe deploys, strong defaults).
- Define and operate infrastructure using Terraform - with clean modules, sensible standards, and automated validation.
- Improve developer experience through golden paths: templates, self-service environments, paved roads for deployments, and internal tooling that removes friction.
- Drive availability, scalability, and resilience: deployment strategies, rollbacks, capacity planning, DR thinking, and performance tuning.
- Implement pragmatic security-by-default: least privilege IAM, secrets management, secure supply chain, and guardrails that enable speed without compromising safety.
- Establish and refine observability and reliability practices (SLOs/SLIs, monitoring, alerting, postmortems, runbooks) that scale with the team.
- Partner closely with product engineering to reduce operational load and keep delivery velocity high as Zalion grows.

- AWS (core services: compute, networking, IAM, logging/monitoring, managed data services)
- Terraform (modules, workspaces, validation, state management)
- GitHub (Actions, CI/CD workflows, checks, release automation)
- Container orchestration (e.g., ECS/Fargate and/or Kubernetes depending on evolution)
- Observability tooling (metrics, logs, tracing; e.g., Grafana/Prometheus/OpenTelemetry and friends)
- Security tooling (SAST/DAST, dependency scanning, secrets scanning, policy as code)
Requirements
- Strong experience as a Platform / DevOps / Site Reliability Engineer in product teams shipping to production.
- Deep practical knowledge of AWS: networking, IAM, security controls, and designing for failure.
- Hands-on expertise with Terraform: modules, state strategy, DRY patterns, environment separation, and automated reviews.
- Solid CI/CD engineering experience with GitHub: pipeline design, artifact/versioning, deployment safety, and fast feedback loops.
- A strong mindset for reliability and operability: you think in failure modes, automation, and measurable outcomes (SLOs).
- Security awareness and discipline: you build guardrails that make the secure path the easy path.
- A builder mindset: you ship improvements, measure impact (lead time, deploy frequency, MTTR), and iterate.
- Comfort with ambiguity and ownership: you proactively identify platform bottlenecks and fix them without waiting for perfect specs.
- 4+ years of experience in relevant roles (startup/scale-up experience is a plus).
Benefits & conditions
- Build the platform behind agentic AI systems that run in real enterprise environments
- Massive autonomy, zero bureaucracy
- Immediate impact - your work accelerates every engineer and every release
- Modern stack, no legacy constraints
- Competitive salary + meaningful equity
- High-end equipment