Data Scientist, Portfolio Optimization
Formation, Inc.
New York, United States of America
7 days ago
Role details
Contract type: Internship / Graduate position
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Junior
Compensation: $202K
Job location: New York, United States of America
Tech stack
Artificial Intelligence
Airflow
Data Deduplication
Data Visualization
Python
NumPy
SciPy
Data Processing
Large Language Models
Pandas
Plotly
Machine Learning Operations
Streamlit Framework
Data Pipelines
Job description
- Work with the team to implement and maintain the core portfolio engine: order management system, execution simulation layer, portfolio construction service, and performance tracking
- Design risk frameworks that quantify exposure across a portfolio of drug development bets with radically different risk profiles, timelines, and failure modes
- Run rigorous backtesting experiments with strict temporal constraints to evaluate Formation strategies against baseline approaches and measure marginal signal from new evidence sources
- Coordinate across the organization to integrate internal Formation data sources (clinical trial data, genomic evidence, real-world data) and proprietary tooling into portfolio analytics pipelines
- Work with product and engineering teams to build dashboards and reporting that communicate portfolio performance, risk metrics, and strategy comparisons to both technical and executive stakeholders
- Collaborate with the broader data science team to ensure portfolio-level evaluation feeds back into model improvement and evidence prioritization
Requirements
- MS or PhD in a quantitative field (statistics, finance, physics, computational science, engineering, or related)
- 1-3 years in a quantitative research, data science, or analytics role - finance, healthcare, academic research, or consulting all count; substantive internships qualify
- Strong Python programming skills with experience in data-intensive workflows (pandas, numpy, scipy)
- Solid grasp of core portfolio construction and risk concepts: position sizing, rebalancing, Sharpe ratio, drawdown, volatility, benchmark comparison
- Demonstrated ability to work with messy, real-world datasets - comfortable with data wrangling, deduplication, and quality assessment
- Clear communicator who can present quantitative results to both technical peers and business stakeholders
- Experience with backtesting frameworks or portfolio simulation (vectorbt, Backtrader, or custom implementations)
- Exposure to healthcare, pharma, or biotech data (clinical trials, claims data, -omics, real-world evidence)
- Familiarity with alternative data in a research or investment context
- Experience with probability-of-success modeling, drug development decision analysis, or health economics
- Comfort with LLMs or AI/ML pipelines in a production or research setting
- Familiarity with dashboard/visualization tools (Streamlit, Plotly, Dash) and pipeline orchestration (Dagster, Airflow)
Healthcare OR finance domain knowledge is valued; you do not need both.
Benefits & conditions
Total Compensation Range: $154,500 - $202,000
Individual compensation is determined by several factors, including role scope, geographic location, and skills & experience. Your offer will reflect where you fall within the range based on these considerations. In addition to base salary, we offer equity, comprehensive benefits, and generous perks. If the posted range doesn't match your expectations, we still encourage you to apply!
About the company
Formation Bio is a tech- and AI-driven pharma company differentiated by radically more efficient drug development.
Advancements in AI and drug discovery are creating more candidate drugs than the industry can progress because of the high cost and time of clinical trials. Recognizing that this development bottleneck may ultimately limit the number of new medicines that can reach patients, Formation Bio, founded in 2016 as TrialSpark Inc., has built technology platforms, processes, and capabilities to accelerate all aspects of drug development and clinical trials. Formation Bio partners, acquires, or in-licenses drugs from pharma companies, research organizations, and biotechs to develop programs past clinical proof of concept and beyond, ultimately helping to bring new medicines to patients. The company is backed by investors across pharma and tech, including a16z, Sequoia, Sanofi, Thrive Capital, John Doerr, Spark Capital, SV Angel Growth, and others.
Formation Bio has built a technology platform that optimizes all aspects of drug development, enabling more efficient trial design, faster trial completion, and higher-quality trial data capture.
Formation Bio acquires clinical-stage drugs from pharma and biotech and develops them faster and more efficiently, unlocking greater value per program and accelerating access to new treatments for patients.
Join our culture of innovation where your work directly contributes to transforming patient care in areas such as rheumatology, dermatology, CNS, and cardiometabolic diseases. Our dynamic environment blends advanced technology with strategic drug development, speeding up the delivery of new treatments. Here, every role plays a part in our mission to bring new treatments to patients faster and more efficiently.