Data Scientist
Job description
Remote Hiring Remotely in USA Mid level
The Data Scientist will lead analytical workstreams, design clinical studies, develop scoring methodologies, build data pipelines, and collaborate with clinical teams to enhance surgical performance metrics.
- Study design and execution: Design and run clinical validation studies - correlating AI-derived metrics with surgical outcomes (e.g., complications, resection extent, procedure duration)
- Scoring methodology: Develop and refine composite scoring algorithms (PCA-weighted, Bayesian, or other approaches) that summarize multi-dimensional surgical performance into interpretable scores
- Statistical modeling: Apply appropriate statistical methods (logistic regression, mixed effects, survival analysis, dimensionality reduction) to clinical datasets with clustered, sparse, and heterogeneous data
- Data pipeline development: Build and maintain Python pipelines that extract, transform, and analyze data from MongoDB, PostgreSQL, and S3 at scale (hundreds to thousands of procedures)
- Data quality and integrity: Design and implement data validation checks, investigate discrepancies across data sources, and ensure reproducibility of analyses
- Clinical collaboration: Work directly with surgeons and clinical researchers to define metrics, interpret results, and refine tools based on clinical feedback
- Reporting and communication: Produce analysis reports, methodology documentation, and presentations for internal teams, clinical partners, and external stakeholders
Requirements
This role requires independent judgment about statistical methodology, comfort working with messy real-world clinical data, and the ability to communicate complex findings to both technical and clinical audiences.
- Master's degree (or equivalent experience) in statistics, biostatistics, data science, computer science, or a related quantitative field
- 2+ years of experience in applied data science or quantitative research
- Strong Python skills for data analysis and pipeline development (pandas, NumPy, SciPy, scikit-learn)
- Solid understanding of statistical methods: regression, hypothesis testing, dimensionality reduction (PCA/factor analysis), bootstrap inference
- Experience with SQL databases (PostgreSQL preferred) and NoSQL databases (MongoDB)
- Ability to work independently on ambiguous problems - scoping analyses, choosing methods, and communicating trade-offs
- Strong written communication - ability to produce clear reports for both technical and non-technical audiences
- Experience with Git and collaborative software development practices
Preferred
- Experience with healthcare, clinical, or biomedical data
- Familiarity with Bayesian methods or mixed-effects models
- Experience with cloud infrastructure (AWS - S3, SageMaker, or similar)
- Experience building interactive dashboards or data visualization tools
- Familiarity with surgical workflow, medical devices, or clinical methodology