AI Research Scientist II - Office of the CTO
Job description
- Develop and deploy large-scale ML models that deepen our understanding of complex biological systems in health and disease by incorporating diverse biological data types, such as multi-omics data, microscopy data, in vivo and behavioral imaging data, electrophysiological data, and/or clinical data. Use ML methods to tokenize and embed biological data for ML models, and subsequently analyze the resulting embeddings for benchmarking and other biological tasks
- Help to establish community standards for scalability in developing, disseminating, and evaluating AI/ML/computational methods for scientific problems across the model lifecycle by testing and documenting best practices and sharing with the community
- Collaborate with teams of scientists, computational biologists, and software engineers within the Allen Institute and external partners to help drive large scale AI/ML models, from best code practices to scientific impact
- Collaborate with software engineers to build state-of-the-art engineering infrastructure at the Allen Institute to support large-scale AI/ML research and applications, such as methods for caching and processing petabyte-scale data across multiple networked GPU nodes in the cloud. This person will build research-grade, large-scale ML infrastructure that will then be hardened by software engineers
- Participate in institute-wide initiatives, workshops, and seminars to promote cross-disciplinary collaboration and knowledge sharing
- Support the promotion of open science through publishing papers and open-source code
Note: Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions. This description reflects management's assignment of essential functions; it does not proscribe or restrict the tasks that may be assigned.
Requirements
- PhD in Computer Science, Applied Mathematics, Computational Biology, Statistics, Biostatistics or similar field; or equivalent combination of degree and experience
- Minimum of 2 years postdoctoral / work experience
- Demonstrated ability to design, implement and apply AI/ML models for the analysis of large-scale biological data
Preferred Education and Experience
- 2 - 5 years of experience developing and applying ML methods
- Strong publication record of innovative scientific accomplishments (both individual and team)
- Expertise in Python-based ML libraries and frameworks such as PyTorch, JAX, Pyro, NumPy, and pandas. Solid understanding of statistical analysis, data preprocessing, feature selection, and model evaluation techniques
- Experience building data pipelines that make biological data ML-ready, as well as pipelines for model training and evaluation. Knowledge of data preprocessing, normalization, and integration techniques specific to biological and clinical datasets
- Experience with data visualization and presentation of complex biological findings to both technical and non-technical audiences
- Strong problem-solving skills and ability to develop innovative computational approaches to address complex biological questions
Physical Demands
- Fine motor movements in fingers/hands to operate computers and other office equipment
Position Type/Expected Hours of Work
- This role currently allows both remote and onsite work in a hybrid work environment. We are a Washington State employer, and the primary work location for all Allen Institute employees is 615 Westlake Ave N.; any remote work must be performed in Washington State.
Benefits & conditions
Employees (and their families) are eligible to enroll in benefits per eligibility rules outlined in the Allen Institute's Benefits Guide. These benefits include medical, dental, vision, and basic life insurance. Employees are also eligible to enroll in the Allen Institute's 401k plan. Paid time off is also available as outlined in the Allen Institute's Benefits Guide. Details on the Allen Institute's benefits offering are located in the Benefits Guide.