Senior Data Scientist

STAT SOLUTIONS, INC.
7 days ago

Role details

Contract type
Contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote

Tech stack

Amazon Web Services (AWS)
Azure
Google BigQuery
Cloud Database
Code Review
Computational Biology
Python
Monte Carlo Methods
NumPy
SciPy
SQL Databases
Tableau
Feature Engineering
Snowflake
Electronic Medical Records
Pandas
scikit-learn
Machine Learning Operations
Streamlit Framework
Redshift
Databricks

Job description

  • Design, implement, and validate Monte Carlo simulation models for clinical trial outcome prediction, enrolment forecasting, and risk quantification.

  • Develop predictive and prescriptive analytics solutions (survival analysis, time-series forecasting, causal inference) using real-world data (RWD), electronic health records, and claims data.

  • Build Bayesian hierarchical models for adaptive trial design, interim analysis support, and probability-of-success estimation.

  • Create reproducible ML pipelines (feature engineering, model training, hyperparameter tuning, deployment) on cloud platforms (AWS, GCP, or Azure).

  • Partner with biostatistics and clinical teams to translate statistical findings into protocol amendments, site-selection strategies, and regulatory submissions.

  • Develop interactive dashboards and data products (Streamlit, Shiny, or Tableau) that communicate model outputs to non-technical stakeholders.

  • Conduct sensitivity analyses and scenario planning to quantify uncertainty in drug-development timelines and portfolio investment decisions.

  • Mentor junior data scientists and contribute to internal best practices, code reviews, and knowledge-sharing sessions.

Requirements

  • Master's or Ph.D. in Statistics, Biostatistics, Data Science, Applied Mathematics, Computational Biology, or a related quantitative discipline.

  • 5-8 years of hands-on experience building predictive models in a healthcare, pharma, or biotech setting.

  • Strong proficiency in Python (NumPy, SciPy, pandas, scikit-learn, PyMC / Stan) and/or R.

  • Demonstrated expertise in Monte Carlo methods (MCMC, importance sampling, bootstrapping) and stochastic simulation.

  • Experience with survival analysis (Cox PH, Kaplan-Meier, competing risks) and longitudinal/mixed-effects models.

  • Working knowledge of clinical trial design (Phase I-IV), ICH-GCP guidelines, and regulatory data standards (CDISC, SDTM, ADaM).

  • Proficiency with SQL and cloud-based data infrastructure (Snowflake, Redshift, BigQuery, Databricks).

  • Excellent communication skills with the ability to present complex quantitative results to clinical, regulatory, and executive audiences.
