Machine Learning Engineer

Atos SE

Municipality of Madrid, Spain

4 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Municipality of Madrid, Spain

Tech stack

Artificial Intelligence

Confluence

Jira

Distributed Systems

GitHub

Monitoring of Systems

InfiniBand

Python

Linux System Administration

Machine Learning

Scrum

TensorFlow

Prometheus

Graphics Processing Unit (GPU)

High Performance Computing

PyTorch

System Availability

Model Validation

Reliability of Systems

GitLab

Scikit-learn

Kubernetes

Information Technology

Data Analytics

Machine Learning Operations

Software Version Control

Job description

We are searching for a Machine Learning Engineer to join Bull's innovative R&D team and contribute to the development of AI-driven solutions for infrastructure monitoring, reliability, and cybersecurity in High-Performance Computing (HPC) environments. The role focuses on leveraging large-scale operational telemetry, metrics, and logs to build predictive capabilities that improve system availability, detect anomalies, and support proactive operations. The selected candidate will be responsible not only for model development but also for rigorous validation, operationalization, and integration within a Kubernetes-based platform.

Responsibilities:

Design and develop ML/DL models for predicting hardware failures and detecting software or behavioral anomalies in HPC systems.
Apply advanced analytics techniques such as time-series forecasting, anomaly detection, classification, and predictive maintenance using large-scale monitoring data.
Build and maintain data pipelines and features from infrastructure telemetry and logs.
Perform rigorous model validation to ensure robustness, reliability, and production readiness.
Deploy and operationalize models within a Kubernetes-based environment, including scalable inference services and lifecycle management.
Contribute to AI-driven cybersecurity use cases, such as detecting abnormal behaviors, potential intrusions, or security-related anomalies in infrastructure and system activity.
Work within an Agile/Scrum environment, participating in sprint planning, stand-ups, and retrospectives.
Collaborate with system administrators, support teams, and data engineers to translate operational challenges into data-driven solutions that enhance system reliability and automation.

Requirements

Experience: 3 years.

Master's or PhD in Computer Science, Artificial Intelligence, Data Science, Telecommunications or a related field.

Skills & Competencies:

Strong experience with Machine Learning and Deep Learning frameworks (e.g., TensorFlow, PyTorch, Scikit-learn).
Experience working with time-series data and anomaly detection models.
Proficiency in Python and data science ecosystems.
Experience with Prometheus or similar monitoring/telemetry systems.
Familiarity with containerization and orchestration technologies, especially Kubernetes.
Experience building production-grade ML pipelines.
Experience handling large-scale monitoring and operational datasets.
Understanding of distributed systems and infrastructure monitoring.
Knowledge of HPC environments, GPUs, and high-speed interconnects (e.g., InfiniBand) is highly desirable.
Proficiency with Git-based version control systems (GitHub, GitLab).
Solid experience working in Linux environments.
Good understanding of Scrum methodology and experience with Jira and Confluence.

Benefits & conditions

Flexible Work Schedule: Half-day Fridays and condensed summer working hours to support work-life balance.

Learning and Growth: Opportunities to work with advanced AI technologies in an innovative and supportive R&D environment.

Join us!

Here, your ideas, your curiosity and your technical excellence directly shape the next era of advanced computing - unlocking enterprise value, accelerating scientific progress and driving positive impact for society.

About the company

Bull is a story. One with a century of European innovation and a working environment where experts design powerful, sustainable, and sovereign digital solutions, enabling states and industries to retain full control over their data and their AI. Bull is also thousands of engineers, researchers and passionate tech people shaping the future of high-performance computing, AI, and quantum technologies. Every day, our teams push the boundaries of what is technologically possible - from next-generation HPC architectures to exascale supercomputers - supported by world-class R&D, more than 1,600 patents, and unique end-to-end capabilities spanning hardware design, software engineering, data science and quantum research. We are a people-centric, innovation-driven company, where collaboration spans Europe, the Americas and India. We share a common vision of responsible and sustainable innovation that delivers concrete impact for our customers.