Senior DevOps / Platform Engineer

Strategy Big Data

Ferrol, Spain

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Remote

Ferrol, Spain

Tech stack

Artificial Intelligence

Amazon Web Services (AWS)

Bash

Software as a Service

Cloud Computing

Computer Programming

Continuous Integration

DevOps

Elasticsearch

Python

MongoDB

Open Source Technology

Prometheus

Scripting (Bash/Python/Go/Ruby)

Cloud Platform System

Grafana

Kubernetes

Terraform

Docker

Job description

We're looking for a Senior DevOps / Platform Engineer to help design, automate, and operate our cloud-native platform.

Role

You'll work across AWS and GCP, manage Kubernetes at scale, implement highly automated CI/CD workflows, and collaborate with engineering teams to ensure reliable delivery of SaaS features and AI-driven products.

Real ownership and autonomy - you'll be a key technical decision-maker
Work directly with leadership on platform strategy
Hands-on with cutting-edge cloud-native and AI/ML workloads
Opportunity to lead a major AWS / GCP migration to optimize costs and performance

Infrastructure & Cloud

Design, build, and maintain multi-cloud infrastructure on AWS and GCP
Operate and optimize Kubernetes clusters (GKE, EKS) at scale (up to ~1,000 nodes)
Lead infrastructure modernization and cloud migration initiatives
Implement cost optimization strategies across cloud providers

Automation & CI/CD

Manage Argo Workflows and ArgoCD for GitOps-based deployments
Build and maintain end-to-end Infrastructure as Code with Terraform (modularized, reusable, multi-cloud)
Develop internal automation tooling and scripts (Python, Bash, Go)
Implement zero-downtime deployment strategies

Platform Services

Deploy and manage production MongoDB, Elasticsearch, and other core services
Package and deploy workloads using Helm, Docker, and GitOps pipelines
Ensure a 99%+ uptime SLA through robust monitoring and incident response
Support delivery of AI containerized solutions ready for customer environments

Reliability & Observability

Build comprehensive observability across all platform components
Implement security best practices and compliance requirements
Drive post-incident reviews and continuous improvement

Requirements

Must Have

5+ years as a DevOps, SRE, or Platform Engineer in production environments
Strong hands-on Kubernetes experience (GKE and/or EKS) managing clusters at scale
Expert-level Terraform and Infrastructure as Code workflows
Multi-cloud experience with both AWS and GCP
Proven experience with CI/CD, GitOps, ArgoCD, Argo Workflows
Solid Docker and Helm expertise for containerized deployments
Strong scripting/programming skills in Python and Bash
Experience running production-grade, scalable, and secure cloud systems
Comfortable with incident response and on-call responsibilities

Nice to Have

Programming for tooling development (Python, Bash, Go, ...)
Experience with observability stacks (Prometheus, Grafana, Elastic, OpenTelemetry)
Hands-on with AI/ML workloads in containerized environments
MongoDB and Elasticsearch operations at scale
Experience with cost optimization strategies in cloud environments
Contributions to open-source DevOps/platform projects
AWS/GCP certifications

Benefits & conditions

Competitive salary package
Fully remote work with flexible hours
23 days of vacation + Spanish public holidays

Growth & Impact

Real ownership - your decisions shape the platform's future
Work directly with leadership on technical strategy
Continuous learning with modern cloud-native, DevOps, and AI tooling
Opportunity to mentor and grow the team as we scale
Visible impact on products used by enterprise customers

Work Culture

Engineering-driven culture that values automation and best practices
Async-first communication (we respect work-life balance)
Blameless post-mortems and learning from incidents
Regular team knowledge-sharing sessions and open cooperation

About the company

We build and operate a fully automated Speech Analytics SaaS platform running on Kubernetes across AWS and GCP. Our infrastructure processes ~160,000 hours of audio monthly with a 99%+ uptime SLA, serving enterprise customers with mission-critical analytics needs.

Our platform is built on modern, cloud-native technology: Kubernetes, the Argo ecosystem, MongoDB, Elasticsearch, and 100% Terraform-driven Infrastructure as Code. We auto-scale from dozens to over 1,000 Kubernetes nodes based on demand.

Beyond our core SaaS product, we deliver managed solutions (Autopilot and Copilot platforms) and build AI-based services packaged as containerized, Terraform-ready modules for seamless integration into customer cloud environments (AWS, GCP, Azure).

We're a team that values strong engineering practices, an automation-first mindset, and operational excellence.