Data Engineer - Science

Qureight Ltd
Cambridge, United Kingdom
4 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior

Job location

Cambridge, United Kingdom

Tech stack

Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Big Data
Cloud Computing
Computer Programming
Data Cleansing
Information Engineering
Data Infrastructure
Database Design
DevOps
DICOM
Python
Machine Learning
Meta-Data Management
Management of Software Versions
Workflow Management Systems
Data Lineage
Machine Learning Operations
Data Pipelines
GxP
Docker

Job description

As Qureight scales its AI-driven imaging platform and advances development of foundation models and disease-specific AI models, we are building the data engineering capability required to support large-scale data preparation for machine learning.

We are looking for a Data Engineer to focus on preparing and managing large imaging datasets (including CT scans and DICOM metadata) for use in machine learning workflows.

This role sits within the Science function and works closely with Machine Learning Scientists as well as other Data Engineers to ensure that data is delivered in a consistent, high-quality, and efficient format ready for model development. It will focus on designing and implementing the next iteration of our data infrastructure to accelerate our integration of machine learning into clinical trials.

What you will do

  • Collaborate on designing and implementing new data infrastructure and pipelines that prepare data for large-scale ML workflows
  • Care about data quality and ensure the pipelines you build are robust, scalable, and maintainable
  • Work with DICOM data to feed into foundation model and disease-specific imaging model development
  • Collaborate closely with Machine Learning Scientists, DevOps Engineers, and other Data Engineers to create a tight feedback loop and ensure the end-to-end process is effective and efficient
  • Ensure that our data processes have quality and compliance designed in from the start to make reproducibility, lineage tracking, and data quality painless
  • Scale pipelines to handle millions of scans: ingesting the imaging data, then transforming, filtering, and structuring it ready for foundation model development

Requirements

  • Proven experience as a Data Engineer in complex, data-rich environments
  • Strong programming skills in Python
  • Experience building and maintaining production ML data pipelines, including orchestration tools such as Dagster and cloud infrastructure on AWS
  • Experience with Docker and Kubernetes-based infrastructure
  • Experience working with large datasets
  • Understanding of data preprocessing and quality control for machine learning
  • Strong collaboration skills with machine learning or technical teams

Even better if you have experience of...

  • Medical imaging data such as CT, MRI, or DICOM
  • Large-scale datasets or foundation model workflows
  • Deployment tooling (Helm, and familiarity with GitOps tooling such as Flux and Kustomize)
  • Data versioning and reproducibility frameworks
  • Database design and data modelling
  • Working in regulated, GxP, or ISO 13485 environments
  • ML experiment tracking or metadata management (MLflow)

Benefits & conditions

  • A comprehensive benefits package that includes an annual bonus plan, private medical insurance, life insurance, and a contributory pension scheme
  • 25 days annual leave, plus bank holidays and enhanced maternity leave
  • A diverse work environment that brings together experts in many fields, including software engineering, devops, data science, machine learning, quality assurance, regulatory affairs, and clinical operations.

About the company

Qureight's mission is to accelerate clinical trials and ensure breakthroughs in lung and heart disease reach patients without delay. Our AI-powered data and imaging curation platform enables the analysis of clinical imaging and other healthcare data, helping our customers bring treatments to market, faster. We're looking for talented people who want their work to matter. With offices in Cambridge and London, you'll join our multidisciplinary team of clinicians, scientists, and engineers. What unites us is our open culture, continuous learning mindset, and a shared mission to help biopharma run faster, smarter trials.
