Data Scientist

Bayer SAS

Municipality of Madrid, Spain

8 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Shift work

Languages

English

Experience level

Intermediate

Job location

Municipality of Madrid, Spain

Tech stack

JavaScript

3D Models

API

Business Logic

Artificial Neural Networks

User Authentication

Azure

Cloud Computing

Cloud Engineering

Code Review

Computer Programming

Data Cleansing

Information Engineering

ETL

Database Queries

Software Debugging

GitHub

Python

Key Management

KNIME

Modular Design

Performance Tuning

Software Deployment

Software Engineering

System Testing

Systems Integration

Spark

Pytest

Containerization

Data Lake

PySpark

Git Flow

Kubernetes

Deployment Automation

XGBoost

Machine Learning Operations

REST

Software Version Control

Data Pipelines

Azure Key Vault

Databricks

Job description

About the Role

Are you excited to own forecasting solutions from data extraction to production deployment? We're hiring a Senior Data Scientist for the Machine Learning & Artificial Intelligence unit within Bayer's Enterprise Data & Analytics Platform.

This is not a pure modeling role. You will extract data from source systems, wrangle messy real-world data, build statistical and ML forecast models, engineer production pipelines, evaluate results, present insights to business stakeholders, develop interactive applications, and maintain the entire infrastructure. You'll tackle unstructured problems independently and deliver measurable impact across Finance, Supply Chain, HR, Procurement, and Commercial Operations.

Our international team spans Poland, Germany, Spain, and India. We work with time series forecasting, statistical modeling, data engineering, and interactive analytics on a modern cloud-native stack.

If you thrive on end-to-end ownership, from source systems to stakeholder dashboards, and enjoy solving ambiguous business problems with minimal guidance, we want to hear from you.

Key Responsibilities

End-to-End Solution Ownership
Own the complete lifecycle from understanding business needs through production deployment, monitoring, and maintenance
Build robust ETL/ELT pipelines on Databricks; clean, validate, and transform messy real-world data at scale
Transform loosely defined business questions into structured solutions independently: identify data gaps, propose approaches, and iterate based on feedback

Forecasting & Modeling
Design and deploy production-grade forecasting solutions using statistical models (ARIMA, ETS, BSTS) and ML approaches (XGBoost, LightGBM, neural networks)
Engineer sophisticated features: lag features, rolling statistics, external signals, calendar effects, and domain-specific transformations
Implement forecast reconciliation and hierarchical aggregation for complex business structures
Establish rigorous evaluation frameworks: backtesting, time series cross-validation, accuracy metrics, prediction intervals, and drift monitoring

Software Engineering & Infrastructure
Write production-grade Python and R code with modular architecture, comprehensive testing, error handling, and documentation
Build and maintain sophisticated R Shiny applications with integrated JavaScript components
Orchestrate ML pipelines using Kubeflow for automated training, validation, deployment, experiment tracking, and model versioning
Manage infrastructure as code: Databricks workspaces, Azure resources, CI/CD pipelines (GitHub Actions, Azure DevOps), containerization, and secrets management

Analysis, Debugging & Monitoring
Troubleshoot complex issues across the full stack: data pipeline failures, model degradation, API errors, and integration problems
Implement continuous monitoring: automated data quality checks, feature drift detection, performance tracking, and alerting systems
Conduct root cause analysis of forecast errors, identify data anomalies, validate business logic, and communicate findings clearly

Requirements

Required Qualifications

Technical Foundation
Education & Experience: Master's or PhD with 3+ years delivering end-to-end data science solutions in production
Programming: Strong Python, R, and SQL proficiency
Forecasting Expertise: Time series decomposition, seasonality, trend analysis, ensemble methods, probabilistic forecasting, hierarchical reconciliation
Data Engineering: Databricks/Spark/PySpark, Delta Lake, ETL/ELT design, job orchestration, performance tuning
KNIME: Building analytical workflows, data preprocessing, model pipelines, and system integration

End-to-End Capabilities
MLOps: Kubeflow pipeline orchestration, experiment tracking, model registry, automated deployment
Software Engineering: Git workflows, code reviews, testing frameworks (pytest, testthat), modular design, documentation
Application Development: Build RESTful APIs and R Shiny applications from scratch; handle authentication, deployment, and optimization
Cloud Infrastructure: Azure services (Databricks, Blob Storage, Data Factory, Key Vault, Functions), container orche

About the company

At Bayer we're visionaries, driven to solve the world's toughest challenges and striving for a world where "Health for all, Hunger for none" is no longer a dream, but a real possibility. We're doing it with energy, curiosity and sheer dedication, always learning from unique perspectives of those around us, expanding our thinking, growing our capabilities and redefining 'impossible'. There are so many reasons to join us. If you're hungry to build a varied and meaningful career in a community of brilliant and diverse minds to make a real difference, there's only one choice.
