Senior DevOps / Platform Engineer
Strategy Big Data
Ferrol, Spain
Role details
Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Job location
Remote
Ferrol, Spain
Tech stack
Artificial Intelligence
Amazon Web Services (AWS)
Bash
Software as a Service
Cloud Computing
Computer Programming
Continuous Integration
DevOps
Elasticsearch
Python
MongoDB
Open Source Technology
Prometheus
Scripting (Bash/Python/Go/Ruby)
Cloud Platform System
Grafana
Kubernetes
Terraform
Docker
Go
Job description
We're looking for a Senior DevOps / Platform Engineer to help design, automate, and operate our cloud-native platform.
Role
You'll work across AWS and GCP, manage Kubernetes at scale, implement highly automated CI/CD workflows, and collaborate with engineering teams to ensure reliable delivery of SaaS features and AI-driven products.
- Real ownership and autonomy - you'll be a key technical decision-maker
- Work directly with leadership on platform strategy
- Hands-on with cutting-edge cloud-native and AI/ML workloads
- Opportunity to lead a major AWS / GCP migration to optimize costs and performance
Infrastructure & Cloud
- Design, build, and maintain multi-cloud infrastructure on AWS and GCP
- Operate and optimize Kubernetes clusters (GKE, EKS) at scale (up to ~1,000 nodes)
- Lead infrastructure modernization and cloud migration initiatives
- Implement cost optimization strategies across cloud providers
Automation & CI/CD
- Manage Argo Workflows and ArgoCD for GitOps-based deployments
- Build and maintain end-to-end Infrastructure as Code with Terraform (modularized, reusable, multi-cloud)
- Develop internal automation tooling and scripts (Python, Bash, Go)
- Implement zero-downtime deployment strategies
Platform Services
- Deploy and manage production MongoDB, Elasticsearch, and other core services
- Package and deploy workloads using Helm, Docker, and GitOps pipelines
- Ensure 99%+ uptime SLA through robust monitoring and incident response
- Support delivery of containerized AI solutions ready for customer environments
Reliability & Observability
- Build comprehensive observability across all platform components
- Implement security best practices and compliance requirements
- Drive post-incident reviews and continuous improvement
Requirements
Must Have
- 5+ years as a DevOps, SRE, or Platform Engineer in production environments
- Strong hands-on Kubernetes experience (GKE and/or EKS) managing clusters at scale
- Expert-level experience with Terraform and Infrastructure as Code workflows
- Multi-cloud experience with both AWS and GCP
- Proven experience with CI/CD, GitOps, ArgoCD, Argo Workflows
- Solid Docker and Helm expertise for containerized deployments
- Strong scripting/programming skills in Python and Bash
- Experience running production-grade, scalable, and secure cloud systems
- Comfortable with incident response and on-call responsibilities
Nice to Have
- Programming for tooling development (Python, Bash, Go, etc.)
- Experience with observability stacks (Prometheus, Grafana, Elastic, OpenTelemetry)
- Hands-on experience with AI/ML workloads in containerized environments
- MongoDB and Elasticsearch operations at scale
- Experience with cost optimization strategies in cloud environments
- Contributions to open-source DevOps/platform projects
- AWS/GCP certifications
Benefits & conditions
- Competitive salary package
- Fully remote work with flexible hours
- 23 days of vacation + Spanish public holidays
Growth & Impact
- Real ownership - your decisions shape the platform's future
- Work directly with leadership on technical strategy
- Continuous learning with modern cloud-native, DevOps, and AI tooling
- Opportunity to mentor and grow the team as we scale
- Visible impact on products used by enterprise customers
Work Culture
- Engineering-driven culture that values automation and best practices
- Async-first communication (we respect work-life balance)
- Blameless post-mortems and learning from incidents
- Regular team knowledge-sharing sessions and open cooperation
About the company
We build and operate a fully automated Speech Analytics SaaS platform running on Kubernetes across AWS and GCP.
Our infrastructure processes ~160,000 hours of audio monthly with a 99%+ uptime SLA, serving enterprise customers with mission-critical analytics needs.
Our platform is built on modern, cloud-native technology: Kubernetes, the Argo ecosystem, MongoDB, Elasticsearch, and 100% Terraform-driven Infrastructure as Code.
We auto-scale from dozens to over 1,000 Kubernetes nodes based on demand.
Beyond our core SaaS product, we deliver managed solutions (Autopilot and Copilot platforms) and build AI-based services packaged as containerized, Terraform-ready modules for seamless integration into customer cloud environments (AWS, GCP, Azure).
We're a team that values strong engineering practices, an automation-first mindset, and operational excellence.