Senior Data Scientist
Job description
- Design, implement, and validate Monte Carlo simulation models for clinical trial outcome prediction, enrolment forecasting, and risk quantification.
- Develop predictive and prescriptive analytics solutions (survival analysis, time-series forecasting, causal inference) using real-world data (RWD), electronic health records, and claims data.
- Build Bayesian hierarchical models for adaptive trial design, interim analysis support, and probability-of-success estimation.
- Create reproducible ML pipelines (feature engineering, model training, hyperparameter tuning, deployment) on cloud platforms (AWS, GCP, or Azure).
- Partner with biostatistics and clinical teams to translate statistical findings into protocol amendments, site-selection strategies, and regulatory submissions.
- Develop interactive dashboards and data products (Streamlit, Shiny, or Tableau) that communicate model outputs to non-technical stakeholders.
- Conduct sensitivity analyses and scenario planning to quantify uncertainty in drug-development timelines and portfolio investment decisions.
- Mentor junior data scientists and contribute to internal best practices, code reviews, and knowledge-sharing sessions.
Requirements
- Master's or Ph.D. in Statistics, Biostatistics, Data Science, Applied Mathematics, Computational Biology, or a related quantitative discipline.
- 5-8 years of hands-on experience building predictive models in a healthcare, pharma, or biotech setting.
- Strong proficiency in Python (NumPy, SciPy, pandas, scikit-learn, PyMC / Stan) and/or R.
- Demonstrated expertise in Monte Carlo methods (MCMC, importance sampling, bootstrapping) and stochastic simulation.
- Experience with survival analysis (Cox PH, Kaplan-Meier, competing risks) and longitudinal/mixed-effects models.
- Working knowledge of clinical trial design (Phase I-IV), ICH-GCP guidelines, and regulatory data standards (CDISC, SDTM, ADaM).
- Proficiency with SQL and cloud-based data infrastructure (Snowflake, Redshift, BigQuery, Databricks).
- Excellent communication skills with the ability to present complex quantitative results to clinical, regulatory, and executive audiences.