Assistant Data Scientist - Computational Drug Discovery & Molecular Modeling
Job description
We are seeking an intellectually curious and scientifically grounded Assistant Data Scientist to join our therapeutics discovery division. In this early-career role, you will work at the interface of computer science and molecular biology, deploying machine learning workflows and rigorous statistical analyses to help accelerate the discovery of next-generation medicines. As an embedded member of a cross-functional scientific computing group, you will address complex data problems across multiple modalities, with direct, hands-on responsibility for analyzing the data produced by automated equipment and high-throughput screening platforms. This role is designed for a highly analytical self-starter who wants to develop property-prediction models, explore deep learning architectures, and collaborate closely with laboratory biologists and chemists to identify therapeutic leads.
Machine Learning Engineering & Molecular Modeling
Workflow Automation: Develop, optimize, and maintain predictive machine learning and deep learning workflows that enable the discovery and design of novel small molecules and peptides.
Property Prediction: Contribute to the architecture and scaling of molecular property prediction models to screen for target potency, selectivity, and metabolic viability.
Algorithm Research: Conduct active computational research in machine learning, molecular modeling, and virtual screening applications relevant to early-stage pipeline acceleration.
Infrastructure Utilization: Use high-performance computing (HPC) clusters in a Unix/Linux environment or cloud architectures (AWS) to process high-volume molecular and bioinformatics datasets.
Interdisciplinary Collaboration & Insights Delivery
Cross-Functional Synergy: Partner closely with laboratory chemists, molecular biologists, and informatics specialists to apply modeling techniques that advance molecules from lead optimization to clinical candidates.
Data Synthesis & Interpretation: Provide multi-disciplinary stakeholders with an in-depth understanding of complex data outputs, interpreting analytical results to guide the physical experimental design process.
Hypothesis Generation: Apply rigorous computational methods to help wet-lab scientists generate novel, testable hypotheses for cellular target discovery and molecular mechanisms of action.
Technical Presentation: Communicate highly complex mathematical or algorithmic results effectively and concisely to non-technical business partners in both written formats and formal oral presentations.
Requirements
Education: Bachelor's degree in Computational Physics, Computational Chemistry, Bioinformatics, Computer Science, or a closely related quantitative, data-intensive scientific field. Candidates with a Master's degree or PhD in these same quantitative disciplines are highly encouraged to apply.
Experience Baseline: 0 to 3 years of hands-on experience applying data science or machine learning (academic research, thesis work, or industry internships will be fully considered).
Programming Fluency: Deep, hands-on proficiency in Python for scientific computing and deep learning, including libraries such as NumPy, SciPy, pandas, and PyTorch.
Systems Literacy: Proven experience working on high-performance computing clusters in a Unix/Linux environment, or direct familiarity with scalable cloud computing architectures (specifically AWS).
Domain Alignment: A foundational, working knowledge of biochemistry, organic chemistry, or molecular biology concepts.
Preferred "Nice-to-Have" Qualifications
Direct experience participating in computational research projects within an early-stage pharmaceutical drug discovery or biotechnology setting.
Experience developing or fine-tuning large-scale chemical foundation models, virtual screening applications, or protein structure-related models (e.g., AlphaFold or RoseTTAFold variants).
Exceptional priority-balancing habits and a demonstrated ability to build strong, collegial relationships across a multicultural matrix organization.