Lead Data Scientist
Job description
Forecast beyond the obvious. We're not looking for someone who fits an ARIMA model and calls it done. We want someone who knows when to reach for Gaussian processes, gradient-boosted ensembles, neural state-space models, or hybrid symbolic-statistical approaches - and, critically, knows why.

Integrate alternative data sources. Satellite imagery, shipping data, weather signals, procurement index feeds, news sentiment - the edge is often in the signal no one else has thought to use. You'll identify and incorporate these into production-grade pipelines.

Shape how predictions become decisions. The end goal is for your models to inform what Monq recommends inside live procurement negotiations. Getting there requires working closely with product and engineering - translating probabilistic outputs into something a procurement professional can act on in the moment, not just admire in a dashboard.

Bridge research and engineering to ship production-grade systems. You won't be throwing models over the fence. You'll work in close collaboration with our engineering team to take research from notebook to production - defining clean interfaces, writing model-serving code that engineers can build on, and making sure what you've validated in a research context actually holds up in a live enterprise environment. The expectation is that your work ships, not just publishes.

Why This Is a Rare Opportunity

Most data science roles hand you a well-defined problem with existing pipelines and ask you to improve on a baseline. This one doesn't. There is no baseline. You'll spend real time figuring out what data is available, what's worth acquiring, and what's even feasible to predict - before writing a single model.

That's harder than most job descriptions admit. But it also means your decisions have a direct and permanent impact on the direction of the feature we're developing. The procurement market is a $4.2 trillion opportunity that existing AI solutions have almost entirely ignored. If you can build something that genuinely predicts commodity price movements - even imperfectly, even partially - it changes what Monq can offer enterprise customers and how we compete. This is the kind of problem a good data scientist can spend years on and still find interesting.

About Monq

Monq is building the first AI platform purpose-built for strategic procurement negotiation.
We're early-stage and moving fast - backed by executives from Revolut and HSBC, working with enterprise customers, and actively building the team that will define what this product becomes. We're a small, flat team. We use AI tools not as a novelty but because they make us better and faster. We value simplicity, ownership, and shipping - and we're looking for people who hold themselves to high standards while staying pragmatic about what matters right now.

Equal Opportunities

Monq is committed to creating a diverse and inclusive workplace and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, colour, religion, gender, gender reassignment, marital or civil partnership status, age, disability, pregnancy or maternity, or any other basis as protected by the Equality Act 2010. We actively encourage applications from people with diverse backgrounds and experiences.

For accommodations during the recruitment process, please contact recruitment@monq.io.
Requirements
You Might Be a Fit If

- You have 6+ years of experience in applied data science or quantitative research, with a strong track record of forecasting or time-series modelling in production environments (an ongoing PhD or a research track record would be a plus)
- You've worked on commodity, energy, or financial market price prediction - you understand basis risk, seasonality, mean reversion, and regime shifts intuitively
- You're fluent in multivariate modelling - VAR/VECM, Bayesian hierarchical models, factor models, LSTM- or transformer-based temporal architectures - and you can speak clearly to the tradeoffs between them
- You're rigorous about uncertainty. You know the difference between epistemic and aleatoric uncertainty, and you build that distinction into how you communicate predictions to stakeholders
- You're comfortable working with messy, heterogeneous, real-world data - incomplete time series, mixed frequencies, structural breaks, and sources that require significant wrangling before they're useful
- You can write production-quality Python and know how to deploy models in a way that engineers can actually build on
- You care about impact, not just accuracy metrics. A model that moves a negotiation outcome is worth more than one that wins a Kaggle leaderboard
Nice to Have

- Experience with causal-inference methods applied to market dynamics (synthetic control, difference-in-differences, instrumental variables)
- Familiarity with procurement indices (PPI, ISM, commodity spot/futures markets) and how to incorporate forward-curve data
- Experience building real-time or near-real-time inference pipelines at scale
- Background in operations research or supply-chain optimisation
- Exposure to LLMs as signal sources - extracting structured market intelligence from unstructured text

ML Skills We're Looking For

This role sits at the intersection of classical econometrics and modern machine learning. You don't need to be a world-class expert in every area below - but you should be genuinely strong across most of them and honest about where you want to grow.

Supervised & Ensemble Methods. Gradient-boosted trees (XGBoost, LightGBM, CatBoost) for tabular forecasting; an understanding of when tree-based models outperform neural approaches on structured data, and vice versa. Strong intuition for regularisation, hyperparameter tuning, and avoiding leakage in time-series cross-validation.

Deep Learning for Sequences. Hands-on experience with temporal architectures - LSTMs, GRUs, Temporal Fusion Transformers, N-BEATS, or similar. An understanding of attention mechanisms and of when transformer-based sequence models are worth the complexity cost over simpler recurrent approaches.

Probabilistic & Bayesian Modelling. Comfort with probabilistic forecasting: quantile regression, conformal prediction, Monte Carlo dropout, or full Bayesian inference via PyMC or NumPyro. The ability to communicate uncertainty intervals credibly to non-technical stakeholders is as important as computing them correctly.

Feature Engineering at Scale. Lag features, rolling statistics, Fourier transforms for seasonality decomposition, target encoding with temporal leakage guards, embeddings for categorical market variables.
You understand that the quality of your features usually matters more than the choice of model.

Model Evaluation & Validation. Walk-forward validation, purged k-fold cross-validation, backtesting under realistic execution constraints. You know why naive train/test splits are dangerous in time series and what to do about it.

MLOps & Productionisation. Experience taking models from notebook to production: experiment tracking (MLflow, W&B), model versioning, feature stores, drift detection, and retraining triggers. You can build a model that doesn't just work once - it works reliably over time as markets evolve.

Explainability & Interpretability. SHAP values, partial-dependence plots, and the ability to explain model behaviour to procurement professionals who need to trust and act on predictions. Black-box accuracy means nothing if the model can't be interrogated when it's wrong.

The Stack

You'll have significant input into tooling choices: experiment tracking, feature stores, deployment infrastructure. We have strong engineering support and ship on AWS, but we're building the MLOps layer as we go, and you'll help define it. We use AI coding tools - Cursor and Claude Code - as part of the daily workflow, not as a novelty.
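To make the time-series validation point above concrete, here is a minimal sketch of walk-forward (expanding-window) evaluation. It is illustrative only - synthetic data and a naive last-value forecaster stand in for a real model and real prices:

```python
import numpy as np

def walk_forward_splits(n_obs, n_folds, min_train):
    """Yield (train_end, test_start, test_end) index triples for an
    expanding training window - the model never sees future data."""
    fold_size = (n_obs - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        yield train_end, train_end, train_end + fold_size

# Synthetic price series: a random walk with drift (stand-in for real data).
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0.05, 1.0, size=500))

errors = []
for train_end, test_start, test_end in walk_forward_splits(len(prices), n_folds=5, min_train=250):
    # Naive baseline: forecast every test point with the last training value.
    forecast = prices[train_end - 1]
    errors.append(np.mean(np.abs(prices[test_start:test_end] - forecast)))

print("per-fold MAE:", np.round(errors, 2))
```

The key property - and the reason naive random train/test splits fail here - is that every test window lies strictly after its training window, so the evaluation mimics how the model would actually be used in production.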
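On the probabilistic-forecasting side, the quantile (pinball) loss is the standard way to score interval forecasts. A minimal NumPy sketch with made-up numbers, not a statement about how Monq scores models:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss: under-prediction is penalised by q and
    over-prediction by (1 - q), so minimising it targets quantile q."""
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

# Toy example: score a 10th/90th-percentile band around noisy observations.
rng = np.random.default_rng(1)
y = rng.normal(100.0, 5.0, size=1000)

lo, hi = np.quantile(y, [0.1, 0.9])
loss_lo = pinball_loss(y, np.full_like(y, lo), q=0.1)
loss_hi = pinball_loss(y, np.full_like(y, hi), q=0.9)
coverage = np.mean((y >= lo) & (y <= hi))
print(f"80% band [{lo:.1f}, {hi:.1f}], empirical coverage {coverage:.2f}")
```

Reporting the band together with its empirical coverage is one simple way to make uncertainty intervals credible to non-technical stakeholders: the claim ("80% of outcomes fall inside") is directly checkable.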
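And for the temporal leakage guards mentioned under feature engineering: lag and rolling features must use only information available strictly before each timestamp. A hypothetical sketch (function name, lags, and window size are all illustrative):

```python
import numpy as np

def lagged_features(series, lags, roll_window):
    """Build lag and trailing-mean features from past values only:
    row t sees series[t - lag] and the mean of the window ending at t - 1."""
    n = len(series)
    cols = {}
    for lag in lags:
        col = np.full(n, np.nan)        # first `lag` rows have no history
        col[lag:] = series[:n - lag]
        cols[f"lag_{lag}"] = col
    roll = np.full(n, np.nan)
    for t in range(roll_window, n):
        roll[t] = series[t - roll_window:t].mean()  # window ends at t-1: no leakage
    cols[f"roll_mean_{roll_window}"] = roll
    return cols

series = np.arange(10, dtype=float)     # toy data: 0..9
feats = lagged_features(series, lags=[1, 2], roll_window=3)
print(feats["lag_1"][3], feats["roll_mean_3"][3])  # 2.0 1.0
```

The design choice worth noting is the deliberate NaN prefix: rows without enough history are marked missing rather than silently filled, which keeps accidental look-ahead out of any downstream training set.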