DevOps, SRE Engineer or Platform Engineer

Community Of
Municipality of Madrid, Spain
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Municipality of Madrid, Spain

Tech stack

Kubernetes Security
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Cloud Computing
Computer Networks
Data Security
DevOps
GitHub
Identity and Access Management
Python
Key Management
Machine Learning
Octopus Deploy
Open Source Technology
Role-Based Access Control
Prometheus
Systems Integration
Datadog
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Grafana
Spark
GitLab CI
Kubernetes
Machine Learning Operations
CloudWatch
Terraform
Jenkins

Requirements

observability, while keeping costs in mind.
Ensure standardized cross-studio access and security to enable timely data access and ingestion (AWS and Google Cloud).
Provide teams with separate environments for testing new setups and tools without disrupting day-to-day operations and production workflows.
Track usage for all our deployed applications and identify areas of improvement, making the best use of resources.
Keep up with relevant technologies and best practices continuously emerging in the industry, especially those related to AI productivity tools.

What we are looking for

5+ years in the industry as a DevOps, SRE Engineer or Platform Engineer, ideally in gaming, mobile apps, or other high-scale digital products.
Strong hands-on experience with Kubernetes in production: not just running workloads on it, but operating it.
Cost-aware infrastructure decision-making.
Solid Terraform (or OpenTofu) experience, with a track record of keeping IaC sustainable as it grows.
Proven experience delivering data and AI/ML solutions in production on AWS, plus a working knowledge of GCP or willingness to get up to speed quickly. Bonus if this experience is within the gaming industry.
Comfortable owning CI/CD pipelines with common tools (GitHub Actions, GitLab CI, ArgoCD, Jenkins, or similar).
Hands-on experience with cloud and Kubernetes security fundamentals: IAM/RBAC, secrets management (e.g. Vault, AWS Secrets Manager, External Secrets), network policies, and integrating security checks into CI/CD pipelines.
Strong instincts for observability, monitoring, and alerting: you've built dashboards and alerts that teams actually rely on, and you know the difference between a useful page and noise. Hands-on with tools like Prometheus, Grafana, Datadog, CloudWatch, or similar.
Solid incident response experience.
The current data and AI/ML stack uses open source tools like Airflow, Trino, Spark, and Kubeflow. Familiarity with deploying these tools, as well as tuning them for improved performance, is a bonus.
Understanding of MLOps best practices and common architectures is also a bonus.
Hands-on knowledge of Python and/or other scripting languages.
Experience creating infrastructure for both traditional and modern agentic data-intensive systems is a bonus.
Focus on innovation, coupled with a mindset of continuous learning and curiosity to explore emerging AI technologies.
The successful candidate will have an agile, hands-on approach to prototyping and validation, and the ability to Get Stuff Done in a fast-paced environment.
Excellent communication and collaboration skills, necessary for working effectively with both technical and non-technical teams.
Understanding how to drive results with key business stake

About the company

Group role reporting to the IT team.

The role will focus on AI/ML infrastructure for various group use cases, from training proprietary models to providing efficient environments for running third-party tools. This role will work with all Group central teams (Marketing, Finance, HR, AI, IT).

Role Overview

You will build and maintain the high-performance infrastructure required to train and deploy AI models that impact millions of players in real time, as well as improve the productivity of everyone at Tripledot. You will serve as a bridge between the technical solutions created by our teams and our live game engines, and you will create and maintain infrastructure for internal IT needs.

Within the group AI functions, you'll work with other AI/ML engineers, data engineers, analysts, and product owners. Within the various studios and other central teams, you will interact with data, engineering, and product teams. Within group IT, you'll partner with Engineering, TechOps, and Security to
