Bioinformatics Data Scientist

Spectrix Analytical Services, LLC
Cambridge, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Junior
Compensation
$83K

Job location

Cambridge, United States of America

Tech stack

API
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Business Analytics Applications
Bioinformatics
Cloud Engineering
Computational Biology
Data Structures
Decision Support Systems
Oracle Discoverer
R
Systems Analysis
Python
Machine Learning
Meta-Data Management
RStudio
Scientific Computing
Software Deployment
Software Engineering
SQL Databases
Parquet
Cloud Platform System
Flask
Large Language Models
Multi-Agent Systems
Prompt Engineering
Generative AI
Git
FastAPI
Containerization
Data Lake
Information Technology
Free and Open-Source Software
Feature Selection
Machine Learning Operations
Streamlit Framework
Software Version Control
Data Pipelines
Docker

Job description

· Analyze and interpret large-scale proteomics and multi-omics datasets to support biomarker discovery, pharmacodynamic analysis, pathway and causality inference, disease biology, and drug discovery programs.

· Develop scalable, reproducible, and cloud-enabled analytical workflows, data pipelines, reports, dashboards, APIs, and data products for scientific users.

· Apply statistical modeling, machine learning, and AI/LLM-enabled approaches to improve biological interpretation, knowledge extraction, workflow automation, and scientific decision support.

· Integrate proteomics data with orthogonal modalities such as transcriptomics, genomics, genetics, perturbation data, metadata, and translational annotations.

· Collaborate with computational scientists, mass spectrometry scientists, discovery biologists, translational researchers, and data science teams to define analytical strategies and communicate results clearly.

· Promote best practices in data QC, reproducible analysis, workflow development, software engineering, and responsible use of AI-assisted scientific tools.

Requirements

Do you have experience in translational research?

The ideal candidate is a data scientist with strong computational biology, statistical, machine learning, and software engineering foundations, with the ability to work across biological interpretation, cloud-based workflows, and emerging AI/LLM-enabled scientific capabilities.

· PhD in Bioinformatics, Computational Biology, Data Science, Statistics, Computer Science, Systems Biology, Biology, Biochemistry, Chemistry, Engineering, or a related quantitative scientific field.

· Hands-on experience analyzing high-dimensional biological or biomedical datasets, such as proteomics, LC-MS, Olink, SomaScan, transcriptomics, single-cell data, spatial omics, genomics, genetics, or other omics/modalities.

· Strong proficiency in R and/or Python for data analysis, statistical modeling, visualization, and reproducible scientific computing.

· Solid understanding of experimental design, data QC, normalization, missing-value assessment and imputation, feature selection, and advanced statistical modeling approaches for high-dimensional biological data, including linear models, mixed-effects models, Bayesian methods, biological signal deconvolution, pathway or feature-level interpretation, and communication of biological findings.

· Experience working with scalable computing and cloud environments, including AWS services and workflow-based analysis systems for large scientific datasets.

· Proficiency with machine learning, AI, and LLM-powered analytical systems, including the development of robust agentic frameworks or multi-agent architectures for scientific workflows.

· Familiarity with drug discovery, translational research, biomarker discovery, perturbation biology, pharmacodynamic studies, disease biology, or related biomedical research contexts.

· Ability to work independently on complex analytical problems and communicate results clearly to scientific stakeholders.

Preferred Qualifications

· Postdoctoral, industry, or equivalent applied research experience after PhD.

· Direct experience with computational proteomics across one or more platforms, including LC-MS proteomics, phosphoproteomics, DIA/SWATH, DDA, TMT, label-free quantification, PTM analysis, spectral library generation and prediction, affinity-based proteomics such as Olink or SomaScan, and related analytical workflows.

· Hands-on experience with proteomics software, outputs, or data structures from tools such as DIA-NN, Spectronaut, MaxQuant, FragPipe, Proteome Discoverer, Skyline, or comparable platforms.

· Experience building reusable scientific workflows, analytical pipelines, applications, dashboards, APIs, reports, or self-service data products for scientific users.

· Experience deploying analytical workflows or data products on AWS or comparable cloud platforms using workflow orchestration, containerization, and scalable execution frameworks such as Nextflow, Snakemake, Airflow, Docker, AWS Batch, ECS, or comparable systems.

· Experience developing and deploying R and Python software and data products using technologies such as Posit/RStudio, Posit Connect, Shiny, Dash, Streamlit, FastAPI, Flask, or comparable scientific computing and application delivery platforms.

· Strong software engineering practices, including Git/version control, modular code design, documentation, testing, and reproducible workflow development.

· Hands-on experience developing LLM-enabled applications or workflows using Claude or other large language models, including RAG systems, tool-using agents, prompt engineering, evaluation frameworks, LLMOps concepts, or scientific knowledge extraction.

· Experience applying machine learning or foundation-model approaches to biological data, including representation learning, multimodal modeling, classification/regression, embedding-based retrieval, generative AI, or related methods.

· Experience with data modeling, SQL, Parquet, metadata management, data lake architectures, or large-scale biological data warehouses.

· Strong publication record, open-source contributions, or demonstrated delivery of reusable computational tools, analytical platforms, scientific workflows, or production-quality internal data products.

Strong Differentiators

· Ability to bridge computational proteomics, biological interpretation, cloud engineering, bioinformatics methodology development, and AI/ML/LLM workflow implementation for life science applications.

· Demonstrated success building tools, analytical methods, or platforms that were adopted by experimental, translational, or computational scientists.

· Contributions to peer-reviewed publications in bioinformatics, computational biology, proteomics, machine learning, systems biology, or related fields.

· Experience designing AI-assisted, machine-learning, or agentic workflows that are reproducible, traceable, scientifically reliable, and suitable for biological and biomedical research.

· Strong understanding of how to connect omics data, pathway biology, perturbation data, genetics, and drug discovery questions into reusable analytical systems.

· Ability to develop and publish novel analytical methodologies when appropriate.

· Track record of applying AI and machine learning techniques to life science datasets, including biomarker discovery, target identification, predictive modeling, knowledge extraction, or multi-omics integration.

· Ability to help shape future scientific AI strategy rather than only execute predefined analyses.

· Doctorate (Required)

Benefits & conditions

Pulled from the full job description

  • Referral program
  • Professional development assistance
  • Tuition reimbursement
  • Parental leave
  • 401(k)
  • Health insurance
  • Retirement plan

Spectrix is participating in the E-Verify program of the U.S. Department of Homeland Security (phone number: 888-897-7781, and website: www.dhs.gov/E-Verify).

Job Types: Full-time, Contract

Pay: From $40.00 per hour

  • 401(k)
  • 401(k) matching
  • Dental insurance
  • Employee assistance program
  • Flexible schedule
  • Flexible spending account
  • Health insurance
  • Health savings account
  • Life insurance
  • Paid time off
  • Parental leave
  • Professional development assistance
  • Referral program
  • Retirement plan
  • Tuition reimbursement
  • Vision insurance

About the company

Spectrix Analytical Services, LLC is a dedicated provider of on-site analytical scientific services, supporting research and development efforts across various industries. Since 1999, our talented scientists have delivered reliable, high-quality results by integrating seamlessly into our clients' laboratories to enhance their analytical capabilities.
