Data Scientist
Job description
models, engineer production pipelines, evaluate results, present insights to business stakeholders, develop interactive applications, and maintain the entire infrastructure. You'll tackle unstructured problems independently and deliver measurable impact across Finance, Supply Chain, HR, Procurement, and Commercial Operations.

Our international team spans Poland, Germany, Spain, and India. We work with time series forecasting, statistical modeling, data engineering, and interactive analytics on a modern cloud-native stack.

If you thrive on end-to-end ownership, from source systems to stakeholder dashboards, and enjoy solving ambiguous business problems with minimal guidance, we want to hear from you.

Key Responsibilities

End-to-End Solution Ownership
- Own the complete lifecycle from understanding business needs through production deployment, monitoring, and maintenance
- Build robust ETL/ELT pipelines on Databricks; clean, validate, and transform messy real-world data at scale
- Transform loosely defined business questions into structured solutions independently: identify data gaps, propose approaches, and iterate based on feedback

Forecasting & Modeling
- Design and deploy production-grade forecasting solutions using statistical models (ARIMA, ETS, BSTS) and ML approaches (XGBoost, LightGBM, neural networks)
- Engineer sophisticated features: lag features, rolling statistics, external signals, calendar effects, and domain-specific transformations
- Implement forecast reconciliation and hierarchical aggregation for complex business structures
- Establish rigorous evaluation frameworks: backtesting, time series cross-validation, accuracy metrics, prediction intervals, and drift monitoring

Software Engineering & Infrastructure
- Write production-grade Python and R code with modular architecture, comprehensive testing, error handling, and documentation
- Build and maintain sophisticated R Shiny applications with integrated JavaScript components
- Orchestrate ML pipelines using Kubeflow for automated training, validation, deployment, experiment tracking, and model versioning
- Manage infrastructure as code: Databricks workspaces, Azure resources, CI/CD pipelines (GitHub Actions, Azure DevOps), containerization, and secrets management

Analysis, Debugging & Monitoring
- Troubleshoot complex issues across the full stack: data pipeline failures, model degradation, API errors, and integration problems
- Implement continuous monitoring: automated data quality checks, feature drift detection, performance tracking, and alerting systems
- Conduct root cause analysis of forecast errors, identify data anomalies, validate business logic, and communicate findings clearly

Requirements

Required Qualifications

Technical Foundation
- Education & Experience: Master's or PhD with 3+ years delivering end-to-end data science solutions in production
- Programming: Strong Python, R, and SQL proficiency
- Forecasting Expertise: Time series decomposition, seasonality, trend analysis, ensemble methods, probabilistic forecasting, hierarchical reconciliation
- Data Engineering: Databricks/Spark/PySpark, Delta Lake, ETL/ELT design, job orchestration, performance tuning
- KNIME: Building analytical workflows, data preprocessing, model pipelines, and system integration

End-to-End Capabilities
- MLOps: Kubeflow pipeline orchestration, experiment tracking, model registry, automated deployment
- Software Engineering: Git workflows, code reviews, testing frameworks (pytest, testthat), modular design, documentation
- Application Development: Build RESTful APIs and R Shiny applications from scratch; handle authentication, deployment, and optimization
- Cloud Infrastructure: Azure services (Databricks, Blob Storage, Data Factory, Key Vault, Functions), container orche