Data Engineer

ESML SD Iberia Holding S A U

Municipality of San Vicente del Raspeig / Sant Vicent del Raspeig, Spain

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Municipality of San Vicente del Raspeig / Sant Vicent del Raspeig, Spain

Tech stack

Artificial Intelligence

Airflow

Amazon Web Services (AWS)

Azure

Cloud Computing

Dataspaces

DevOps

GitHub

Python

Machine Learning

Prometheus

Workflow Management Systems

Data Logging

Cloud Platform System

PyTorch

Large Language Models

Containerization

AI Platforms

GitLab CI

scikit-learn

Kubernetes

Bicep

Machine Learning Operations

Cloud Optimization

CloudWatch

Terraform

GPT

Docker

Jenkins

Job description

We are looking for an MLOps / AIOps / LLMOps / AgentOps Engineer to join a multidisciplinary Data & AI team. The main mission of this role is to design, operate, and continuously evolve our AIOps platform, ensuring that our AI products run in a reliable, scalable, and cost-efficient way.

This position is strongly focused on platform, infrastructure, automation, observability, and operations rather than on building ML models or AI products themselves.

You will work with modern cloud technologies (mainly AWS, with some Azure exposure) and collaborate closely with Data Scientists, Data Engineers, and Product teams to bring AI solutions into production and keep them running smoothly.

We are open to candidates with strong expertise in at least one core area (e.g. cloud, DevOps, platform engineering, or ML operations) and solid foundational knowledge in the others, with motivation to grow across the full AI operations stack.

Design, maintain, and evolve the AIOps platform supporting:

Traditional machine learning models in production
LLM-based solutions such as RAG pipelines and AI Agents
Speech Analytics use cases (ASR, conversation analysis, NLP)

Build and operate ML and LLM pipelines with a strong focus on:

Reliability, automation, and observability
Model and LLM quality, performance, and drift monitoring
Cloud cost control and optimization

Implement LLMOps / AgentOps practices, including:

LLM evaluation and observability
Prompt management, traceability, and specialized logging
Agent integration, orchestration, and lifecycle management

Ensure continuous operation of AI products, including:

Alerts, dashboards, SLOs / SLIs
Scalability strategies and basic auto-remediation mechanisms

Manage deployments in cloud environments (AWS / Azure) and container platforms (Docker / Kubernetes)
Collaborate closely with Data Scientists and Data Engineers to productionize robust, scalable AI solutions
Contribute to internal standards, automation, and best practices across the AI and data ecosystem

Requirements

Hands-on experience in MLOps, AIOps, or operating ML systems in production
Solid understanding of LLMOps and AgentOps concepts (RAG, agents, evaluation, monitoring)
Experience working with AWS and/or Azure in production environments
Practical knowledge of containers and Kubernetes (Docker, basic Helm usage, etc.)
Experience with CI/CD pipelines (GitHub Actions, GitLab CI, Azure DevOps, Jenkins, or similar)
Familiarity with observability and monitoring concepts (CloudWatch, OpenTelemetry, Prometheus, etc.)
Experience managing infrastructure as code (Terraform, Bicep, CDK, or similar)
Python experience and familiarity with the ML ecosystem (e.g. scikit-learn, PyTorch), even if not a Data Scientist
Good understanding of the ML / LLM lifecycle, from development to production and monitoring
Fluent English to work in an international environment

Nice to Have (Not Required, but Valuable)

Experience with ML/AI platforms such as SageMaker, Azure ML, MLflow, Kubeflow
Exposure to Speech Analytics technologies (ASR, diarization, conversational NLP)
Experience with cloud cost optimization / FinOps, especially for AI workloads
Experience building or operating AI agents, copilots, or conversational systems
Familiarity with LLM frameworks (LangChain, LlamaIndex, Semantic Kernel, etc.)
Experience with workflow and orchestration tools (Airflow, Argo, Step Functions, Durable Functions)

Professional Skills & Mindset