Data Scientist

Bayer SAS
Municipality of Madrid, Spain
8 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Shift work
Languages
English
Experience level
Intermediate

Job location

Municipality of Madrid, Spain

Tech stack

JavaScript
3D Models
API
Business Logic
Artificial Neural Networks
User Authentication
Azure
Cloud Computing
Cloud Engineering
Code Review
Computer Programming
Data Cleansing
Information Engineering
ETL
Database Queries
Software Debugging
GitHub
R
Python
Key Management
KNIME
Modular Design
Performance Tuning
Software Deployment
Software Engineering
System Testing
Systems Integration
Spark
Pytest
Containerization
Data Lake
PySpark
Git Flow
Kubernetes
Deployment Automation
XGBoost
Machine Learning Operations
REST
Software Version Control
Data Pipelines
Azure Key Vault
Databricks

Job description

models, engineer production pipelines, evaluate results, present insights to business stakeholders, develop interactive applications, and maintain the entire infrastructure. You'll tackle unstructured problems independently and deliver measurable impact across Finance, Supply Chain, HR, Procurement, and Commercial Operations.

Our international team spans Poland, Germany, Spain, and India. We work with time series forecasting, statistical modeling, data engineering, and interactive analytics on a modern cloud-native stack.

If you thrive on end-to-end ownership, from source systems to stakeholder dashboards, and enjoy solving ambiguous business problems with minimal guidance, we want to hear from you.

Key Responsibilities

End-to-End Solution Ownership

Own the complete lifecycle from understanding business needs through production deployment, monitoring, and maintenance
Build robust ETL/ELT pipelines on Databricks; clean, validate, and transform messy real-world data at scale
Transform loosely

Requirements

defined business questions into structured solutions independently, identifying data gaps, proposing approaches, and iterating based on feedback

Forecasting & Modeling

Design and deploy production-grade forecasting solutions using statistical models (ARIMA, ETS, BSTS) and ML approaches (XGBoost, LightGBM, neural networks)
Engineer sophisticated features: lag features, rolling statistics, external signals, calendar effects, and domain-specific transformations
Implement forecast reconciliation and hierarchical aggregation for complex business structures
Establish rigorous evaluation frameworks: backtesting, time series cross-validation, accuracy metrics, prediction intervals, and drift monitoring

Software Engineering & Infrastructure

Write production-grade Python and R code with modular architecture, comprehensive testing, error handling, and documentation
Build and maintain sophisticated R Shiny applications with integrated JavaScript components
Orchestrate ML pipelines using Kubeflow for automated training, validation, deployment, experiment tracking, and model versioning
Manage infrastructure as code: Databricks workspaces, Azure resources, CI/CD pipelines (GitHub Actions, Azure DevOps), containerization, and secrets management

Analysis, Debugging & Monitoring

Troubleshoot complex issues across the full stack: data pipeline failures, model degradation, API errors, and integration problems
Implement continuous monitoring: automated data quality checks, feature drift detection, performance tracking, and alerting systems
Conduct root cause analysis of forecast errors, identify data anomalies, validate business logic, and communicate findings clearly

Required Qualifications

Technical Foundation

Education & Experience: Master's or PhD with 3+ years delivering end-to-end data science solutions in production
Programming: Strong Python, R and SQL proficiency
Forecasting Expertise: Time series decomposition, seasonality, trend analysis, ensemble methods, probabilistic forecasting, hierarchical reconciliation
Data Engineering: Databricks/Spark/PySpark, Delta Lake, ETL/ELT design, job orchestration, performance tuning
KNIME: Building analytical workflows, data preprocessing, model pipelines, and system integration

End-to-End Capabilities

MLOps: Kubeflow pipeline orchestration, experiment tracking, model registry, automated deployment
Software Engineering: Git workflows, code reviews, testing frameworks (pytest, testthat), modular design, documentation
Application Development: Build RESTful APIs and R Shiny applications from scratch; handle authentication, deployment, and optimization
Cloud Infrastructure: Azure services (Databricks, Blob Storage, Data Factory, Key Vault, Functions), container orche

About the company

At Bayer we're visionaries, driven to solve the world's toughest challenges and striving for a world where "Health for all, Hunger for none" is no longer a dream, but a real possibility. We're doing it with energy, curiosity and sheer dedication, always learning from unique perspectives of those around us, expanding our thinking, growing our capabilities and redefining 'impossible'. There are so many reasons to join us. If you're hungry to build a varied and meaningful career in a community of brilliant and diverse minds to make a real difference, there's only one choice.

Data Scientist

About The Role

Are you excited to own forecasting solutions from data extraction to production deployment? We're hiring a Senior Data Scientist for the Machine Learning & Artificial Intelligence unit within Bayer's Enterprise Data & Analytics Platform. This is not a pure modeling role. You will extract data from source systems, wrangle messy real-world data, build statistical and ML forecast
