Senior DevOps / Platform Engineer
Job description
We build and operate a fully automated Speech Analytics SaaS platform running on Kubernetes across AWS and GCP. Our infrastructure processes ~160,000 hours of audio monthly with a 99%+ uptime SLA, serving enterprise customers with mission-critical analytics needs. Our platform is built on modern, cloud-native technology: Kubernetes, the Argo ecosystem, MongoDB, Elasticsearch, and 100% Terraform-driven infrastructure as code. We auto-scale from dozens to over 1,000 Kubernetes nodes based on demand.
Senior DevOps / Platform Engineer
You'll help design, automate, and operate our cloud-native platform. You'll work across AWS and GCP, manage Kubernetes at scale, implement highly automated CI/CD workflows, and collaborate with engineering teams to ensure reliable delivery of SaaS features and AI-driven products.
What Makes This Role Unique
- Real ownership and autonomy as a key technical decision-maker
- Work directly with leadership on platform strategy
- Hands-on with cutting-edge cloud-native and AI/ML workloads
- Opportunity to lead a major AWS or GCP migration to optimize costs and performance
Responsibilities
- Design, build, and maintain multi-cloud infrastructure on AWS and GCP
- Operate and optimize Kubernetes clusters (GKE, EKS) at scale (up to ~1,000 nodes)
- Lead infrastructure modernization and cloud migration initiatives
- Implement cost-optimization strategies across cloud providers
- Manage Argo Workflows and ArgoCD for GitOps-based deployments
- Build and maintain end-to-end infrastructure as code with Terraform (modularized, reusable, multi-cloud)
- Develop internal automation tooling and scripts in Python, Bash, and Go
- Implement zero-downtime deployment strategies
- Deploy and manage production MongoDB, Elasticsearch, and other core services
- Package and deploy workloads using Helm, Docker, and GitOps pipelines
- Uphold the 99%+ uptime SLA through robust monitoring and incident response
- Support delivery of containerized AI solutions ready for customer environments
- Build comprehensive observability across all platform components
- Implement security best practices and compliance requirements
- Drive post-incident reviews and continuous improvement
Requirements
- 5+ years as a DevOps, SRE, or Platform Engineer in production environments
- Strong hands-on Kubernetes experience (GKE and/or EKS) managing clusters at scale
- Expert-level Terraform and infrastructure-as-code workflows
- Multi-cloud experience with both AWS and GCP
- Proven experience with CI/CD, GitOps, ArgoCD, and Argo Workflows
- Solid Docker and Helm expertise for containerized deployments
- Strong scripting/programming skills in Python and Bash
- Experience running production-grade, scalable, and secure cloud systems
- Comfortable with incident response and on-call responsibilities
Nice to Have
- Programming for tooling development (Python, Bash, Go)
- Experience with observability stacks (Prometheus, Grafana, Elastic, OpenTelemetry)
- Hands-on with AI/ML workloads in containerized environments
- MongoDB and Elasticsearch operations at scale
- Experience with cost-optimization strategies in cloud environments
- Contributions to open-source DevOps/platform projects
- AWS/GCP certifications
Benefits & conditions
- Competitive salary package
- Fully remote work with flexible hours
- 23 days of vacation + Spanish public holidays
- Real ownership - your decisions shape the platform's future
- Work directly with leadership on technical strategy
- Continuous learning with modern cloud-native, DevOps, and AI tooling
- Opportunity to mentor and grow the team as we scale
- Visible impact on products used by enterprise customers
- Engineering-driven culture that values automation and best practices
- Async-first communication (we respect work-life balance)
- Blameless post-mortems and learning from incidents
- Regular team knowledge-sharing sessions and open cooperation