Senior Data Scientist
Job description
We are seeking a creative, methodologically rigorous Senior Data Scientist to push the frontier of how we research and build classifiers from glycoproteomic data. This is a research-forward individual contributor role for someone who reaches across the full breadth of modern statistical and AI methods - classical ML, deep learning, foundation models for biology, generative approaches, and whatever the literature surfaces next - and is energized by open problems: new quantification and normalization schemes, novel feature engineering, multimodal model architectures, and the biological interpretation of model outputs.
- Design, prototype, and rigorously evaluate novel classifier architectures for clinical diagnostics across oncology indications
- Lead exploratory research into new quantification, normalization, and feature engineering methods for high-dimensional glycoproteomic data
- Bring a diverse modeling toolkit - classical statistical methods, tree-based ensembles, deep learning, probabilistic and Bayesian approaches, foundation models, graph neural networks, and generative AI - and choose the right tool for the problem based on evidence rather than habit or hype
- Develop cross-validation, calibration, and uncertainty-quantification strategies that hold up to the realities of small clinical cohorts and high feature counts
- Investigate and mitigate batch, cohort, and site effects so that models generalize from discovery to bridging to locked panels
- Drive cross-indication synthesis: separate shared disease biology from indication-conditioned signal and from nonspecific inflammatory or acute-phase axes
- Build multimodal models that combine glycan/motif information, proteomic grounding, and clinical covariates rather than relying on protein-quantity signal alone
- Translate emerging techniques from the ML, AI, and computational-biology literature into production-ready methods
- Mentor junior data scientists and raise the methodological bar across the team
Requirements
Do you have experience in scientific publications?
- Ph.D. in Statistics, Computer Science, Computational Biology, Bioinformatics, or a related quantitative field, plus 6+ years of experience building predictive models on biological data in industry or academia; alternatively, an MS in a similar field with 8+ years of relevant experience
- Demonstrated track record of methodological innovation - first-author publications, novel methods deployed in production, open-source contributions, or comparable evidence of original work
- Deep proficiency in Python and/or R, including the modern ML stack (scikit-learn, PyTorch or TensorFlow, XGBoost/LightGBM, and similar)
- Methodological breadth across paradigms - comfortable moving between classical statistics, tree-based ML, deep learning, and modern AI (transformers, graph neural networks, foundation models, generative methods) - and the judgment to argue rigorously for one approach over another
- Strong statistical foundation: cross-validation strategy, regularization, calibration, uncertainty quantification, and handling of confounders and class imbalance
- Hands-on experience building and validating classifiers on high-dimensional, low-sample-size biological data (proteomics, glycoproteomics, transcriptomics, or genomics)
- Experience with batch-effect correction and normalization techniques, and a healthy skepticism about how those choices propagate into downstream performance estimates
- Preference will be given to candidates with experience in multimodal modeling, interpretability methods, or foundation/representation-learning approaches for biological data
- Familiarity with clinical diagnostic development - analytical and clinical validation, locking classifiers, and bridging studies - is a strong plus
- Excellent written and verbal communication: able to explain novel methods clearly to wet-lab scientists, clinicians, and fellow statisticians alike
- A genuine desire to impact patient lives and contribute to the broader scientific community