Senior Bioinformatics Data Engineer
Nucleome Therapeutics Ltd
Oxford, United Kingdom
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: £65K
Job location: Oxford, United Kingdom
Tech stack
Bioinformatics
Cloud Computing
Computer Programming
Databases
Corona (Software Development Kit)
Information Engineering
Data Files
DevOps
Elasticsearch
Python
PostgreSQL
DataOps
Data Streaming
Data Processing
Containerization
Core Data
Storage Technologies
Non-relational Database
Data Pipelines
Job description
- Driving the design, development, and continuous improvement of core data infrastructure that enables automated data flows from unstructured lab data through to structured and highly accessible enriched data. You will work closely with the lead software engineer to make technical and architectural decisions, collaborating on strategy and solution design while taking ownership of implementation and working independently to solve complex data engineering challenges. Together, you will iterate on solutions and ensure best practices are maintained across the platform.
- Proactively identifying opportunities to enhance and optimize existing data pipelines that are vital to the business's decision-making processes. The lab team relies on these pipelines to generate the critical data needed for analysis on drug target viability, making your work directly impactful to therapeutic development priorities. You will work closely with the computational team to prioritize improvements, implement iterative enhancements, and ensure pipelines remain robust, scalable, and aligned with evolving scientific and business requirements.
- Collaborating closely with lab and computational teams to ensure data is readily available, reliable, and properly integrated across the platform. You will take ownership of data quality, accessibility, and performance, making decisions on data modelling, storage strategies, and processing optimizations while maintaining alignment with the broader engineering team's standards and practices.
Requirements
- BSc, MSc or PhD in a mathematical, computational or science discipline. Industry experience in biotech or pharma would be desirable.
- Extensive practical experience defining and developing scalable biological data processing pipelines; knowledge of Nextflow and Seqera is advantageous. Experience with bioinformatics tools, databases, biological datasets, and performant data file formats (e.g. Parquet), alongside statistical analysis.
- Experience in data modelling and implementation in relational and non-relational databases such as PostgreSQL and Elasticsearch, as well as cloud-based data processing and storage technologies, infrastructure, and containerization.
- Strong programming skills in Python, extensive experience with modern data engineering tooling and orchestration platforms such as Dagster, and experience with the processes and tooling of modern software development, DevOps, and DataOps.
- Strong communication, organisational and time management skills and the ability to communicate complex ideas to technical and non-technical audiences. Ability to work independently and as a member of a multidisciplinary team in a highly dynamic environment.
- Strong attention to detail in implementation, combined with awareness of the big picture in terms of planning, timescales, and business value.