(Senior) Data Scientist
Job description
- Your recommendation engine suggests a product category - opening up €50K in new monthly revenue
- Your clustering algorithm reveals that 20 customers have untapped potential worth €5M
Welcome to wholesale AI - where sparse data meets high stakes, where relationships matter more than transactions, and where the right insight at the right time can transform a business.
- Build and deploy ML models processing millions of transactions across 20+ enterprise customers
- Design recommendation engines handling catalogs of 500K+ products
- Develop predictive models for customer behavior with sparse, irregular interaction patterns
- Create churn prediction and cross-sell systems integrated with ERP systems
- Own complete lifecycle from problem definition to production deployment
- Build evaluation frameworks connecting model performance to business KPIs
- Explanations matter as much as predictions
What you'll be working on
Production AI Systems at Scale
- Build and deploy ML models that process millions of transactions across 20+ (and growing) enterprise customers simultaneously
- Design systems that adapt to wildly different data distributions (a construction wholesaler vs. a medical equipment distributor)
- Create recommendation engines that handle catalogs of 500K+ products where most customers buy <5% of items
- Develop predictive models for customer behavior with sparse, irregular interaction patterns
- Design churn prediction and cross-sell systems that integrate seamlessly with ERP (Enterprise Resource Planning) systems
- Build models that generate automated proposals and personalized communication templates using LLMs
- Combine classical ML (churn prediction, clustering) with LLMs for insight generation and sales enablement
End-to-End Ownership
- Own the complete lifecycle from problem definition to production deployment
- Build evaluation frameworks that connect model performance to business KPIs
- Design A/B testing infrastructure for continuous improvement
- Create feedback loops that learn from real-world outcomes
Complex Technical Challenges
- Handle extreme class imbalance (95%+ negative cases) while maintaining business value
- Build models that work with limited historical data (cold-start problems)
- Design architectures that scale from SMBs with 100 customers to enterprises with 100K customers
- Solve multi-objective optimization problems (maximize revenue while ensuring diversity)
Requirements
- Strong background in machine learning fundamentals - you understand why algorithms work, not just how to use them
- Experience building production data-powered products and systems that handle real-world messiness
- Familiarity with distributed computing (Spark/PySpark) for processing large-scale data
- Solid software engineering practices - your code is tested, documented, and maintainable
Problem-Solving Mindset
- You approach problems from first principles rather than reaching for standard solutions
- Comfortable with ambiguity - you can define success metrics when requirements are vague
- You balance technical elegance with business pragmatism
- Experience translating business problems into data science solutions
Proven Track Record In:
- Building ML systems that handle irregular patterns (time series with gaps, seasonal businesses, etc.)
- Working with hierarchical data structures (product taxonomies, customer segments)
- Creating models that provide actionable insights, not just predictions
- Deploying ML in multi-tenant architectures where one model serves many clients
Bonus Points For:
- Experience in B2B analytics, e-commerce, or supply chain optimization
- Knowledge of recommendation systems, customer analytics, or revenue optimization
- Familiarity with modern data platforms (Databricks, Snowflake, etc.)
- Experience with MLOps practices and model lifecycle management
- Understanding of European business practices and regulations
- Experience with LLM APIs, prompt engineering, or building LLM-augmented products
- Proficiency in German
Our Tech Stack
You'll be working with modern tools, but we care more about your ability to learn than specific tool experience:
- Data Processing: PySpark, SQL, Python
- ML Platform: Databricks (Unity Catalog, Workflows, Model Serving)
- ML Libraries: scikit-learn, XGBoost/LightGBM, implicit, scipy, OpenAI, plus experience with your preferred frameworks
- Infrastructure: AWS, Terraform
- Orchestration: Databricks Workflows, GitHub Actions
- Experimentation: MLflow