Forward Deployed Data Engineer AI & Robotics

The Ellison Institute of Technology (EIT)
Oxford, United Kingdom
7 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Oxford, United Kingdom

Tech stack

Business Logic
Bioinformatics
Clinical Data Management
Clinical Data Repository
Software Quality
Computer Programming
Continuous Integration
Information Engineering
ETL
Data Transformation
Distributed Data Store
NetCDF
Rapid Prototyping Process
Role-Based Access Control
Feature Engineering
Core Data
Data Pipelines
Microservices

Job description

The Role:

Forward Deployed Data Engineers are a key interface between our core data systems and our research and product engineering projects. As an early member of our rapidly growing team, you'll work side-by-side with scientists and engineers in bioinformatics, healthcare, robotics, agriculture, and frontier AI. You will disseminate data engineering best practices into the research and applied teams, ensuring that the datasets used for our AI models are accurate, reproducible, version-controlled, well-structured, and ready for training. You will help scientists turn raw information from various sources into high-quality resources, ensuring our technical foundations support the next generation of discovery.

This is a hands-on role for engineers who thrive on collaborating directly with researchers, solving problems quickly, and turning complex business logic into scalable, reliable data pipelines while contributing to the broader EIT platform. Successful candidates will be clear, respectful communicators who are comfortable bringing their own expertise into diverse groups.

Day-to-Day, You Might:

Partner with scientists and engineers to deliver robust, reproducible data pipelines that meet research needs across disciplines.

Own ingestion, storage, curation, and transformation of multimodal data, including text, images, structured/tabular data, and high I/O formats such as Arrow/NetCDF/HDF5.

Package and deploy code in research environments using containers (e.g. Docker).

Scale processing across distributed cloud warehouses/storage via container orchestration (e.g. Kubernetes), distributed compute frameworks (e.g. Spark, Ray), or High-Performance Compute (e.g. Slurm).

Work with sensitive data in line with security and compliance requirements (audit trails, encryption, GDPR, RBAC/ABAC).

Contribute to an engineering culture that values maintainability, testing, robust system design, and deep collaboration, but allows flexibility for rapid prototyping and responsiveness to changing landscapes.

Our Forward Deployed Engineers will have the opportunity to work on diverse projects and to expand their skillset, but will add particular value when they can match their data engineering expertise with domain knowledge relevant to a project.

Does one of these domains fit you?

Health/Clinical Data Engineering:

In-depth knowledge and expertise in human healthcare data, clinical data, patient journey and/or biocuration

Feature engineering for predictive models utilising health data

Transformation of clinical data into common data models, and design of new models

Clinical data quality control and analytics

Differential privacy systems, PII-handling, and anonymisation/pseudonymisation

Requirements

You have strong programming experience in Python and SQL, and value code quality, reliability (including testing, CI/CD) and observability as much as performance

You have experience designing, deploying, and optimising distributed data systems or data-intensive backend services

You think in terms of systems and longevity, not just one-off ETL scripts, and embrace end-to-end ownership from low-level performance to user interfaces

You're a collaborative partner to Infrastructure/Ops teams and researchers, and a clear, respectful communicator

You have a low-ego, team-first mindset and help grow our engineering culture by mentoring, sharing, and elevating the work of those around you

Experience in bioinformatics and use of industry-standard tooling such as NextFlow or similar

Genomic/metagenomic data processing, e.g. genome assembly and annotation

Comfortable with relevant data formats such as fastq/fasta, vcf, etc.

Conversion of genomic data into ML-ready formats

About the company

At the Ellison Institute of Technology (EIT), we're on a mission to translate scientific discovery into real-world impact. We bring together visionary scientists, technologists, policy makers, and entrepreneurs to tackle humanity's greatest challenges in four transformative areas. This is ambitious work - work that demands curiosity, courage, and a relentless drive to make a difference. At EIT, you'll join a community built on excellence, innovation, tenacity, trust, and collaboration, where bold ideas become real-world breakthroughs. Together, we push boundaries, embrace complexity, and create solutions to scale ideas from lab to society. Explore more at www.eit.org

Why work for EIT:

At the Ellison Institute, we believe a collaborative, inclusive team is key to our success. We are building a supportive environment where creative risks are encouraged, and everyone feels heard. Valuing emotional intelligence, empathy, respect, and resilience, we encourage people to be curious and to have a shared commitment to excellence. Join us and make an impact!
