Senior DevOps / Infrastructure Engineer

Causa Prima
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Tech stack

API
Cloud Computing
Code Review
Databases
Continuous Integration
Data Stores
DevOps
Fault Tolerance
GitHub
Graph Database
Identity and Access Management
Python
Network Security
Neo4j
OAuth
Ansible
TypeScript
Data Logging
Pulumi
Cloud Platform System
Cloud Monitoring
Large Language Models
Amazon Web Services (AWS)
Kubernetes
Sentry
Kafka
Event Store
Terraform
Docker
PagerDuty

Job description

  • CI/CD - GitHub Actions + Cloud Build, security-aware pipeline design, production approval gates, container image scanning, secret isolation, signed commits.
  • Observability - OpenTelemetry distributed tracing across TypeScript and Python services, Cloud Monitoring, Sentry with PII-stripping hooks, structured logging with sanitization, per-agent behavioural monitoring, tiered alerting.
  • Secret management & rotation - Credential lifecycle for LLM API keys, database credentials, OAuth tokens, and agent signing keys in GCP Secret Manager.
  • Container orchestration - Docker builds, registry management, GKE cluster configuration. Design the path toward Kubernetes-native deployment as we scale.
  • Incident response infrastructure - Per-agent circuit breakers, graceful degradation, tiered alerting (logged → Slack → PagerDuty), forensic tooling via event store replay and traces.
  • Network security - VPC firewall rules, private ingress for all data stores, egress controls, PII Vault on restricted-access infrastructure.
  • Neo4j Aura operations - Monitoring, scaling decisions, and backup verification for the managed graph database.

Requirements

Do you have experience in Terraform? Do you have a Master's degree?

  • 5+ years in DevOps, infrastructure, or SRE roles for production systems.
  • Strong systems design skills - you think in deployment topologies, failure domains, blast radius, and operational security.
  • Production experience with GCP (Cloud Run, GKE, Cloud SQL, IAM, Secret Manager) or equivalent cloud platform with willingness to go deep on GCP.
  • Hands-on experience with Kubernetes in production - cluster management, networking, scaling, security policies.
  • Experience with infrastructure-as-code: Terraform, Pulumi, Ansible, or similar. Ideally more than one.
  • Experience designing CI/CD pipelines with security in mind - secret isolation, approval gates, image scanning, deployment strategies.
  • Experience with observability systems - distributed tracing, structured logging, alerting hierarchies, dashboarding.
  • Security awareness at the infrastructure level - you think about network isolation, least-privilege IAM, and credential hygiene as defaults.
  • Strong code review skills for infrastructure-as-code and deployment configuration.

Nice to have

  • Event streaming infrastructure (Kurrent, Redpanda, Kafka).
  • SOC 2 or GDPR compliance from an infrastructure perspective.
  • Fintech or regulated-environment background.
