Data Scientist, SCRIBE
Job description
We're looking for a Data Scientist to play a leading role in SCRIBE (Structured Collection Records Interpretation and Bio-entity Extraction), an ambitious AI-driven project transforming how natural science collections are digitised and made accessible.
Based within the Natural History Museum's AI & Innovation team, you will help develop cutting-edge AI tools that automate the extraction of structured data from historic collection records and manuscripts. Working with colleagues across the Museum and partner institutions throughout the UK, your work will contribute to a major national effort to unlock more than 137 million natural science specimens held across UK collections.
SCRIBE builds on a successful pilot programme and will turn an early proof of concept into a scalable, production-ready platform that can be shared across the cultural and research sector. Using technologies such as Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) and Computer Vision (CV), you will lead the development of models for document layout detection, field extraction and semantic post-processing, integrating these into robust AI workflows and production pipelines.
You'll work closely with machine learning researchers, software engineers, curators and digitisation specialists to develop open, reusable AI tools that support large-scale biodiversity and heritage digitisation. This role offers the opportunity to apply advanced AI techniques to real-world scientific and cultural challenges, helping to shape how collections data is created, accessed and used in the future.
Requirements
Are you a Data Scientist who enjoys applying AI and machine learning to complex, real-world problems? If you're excited by the opportunity to develop production-ready AI systems with meaningful scientific and cultural impact, this could be the role for you.
You bring strong experience in machine learning and AI development, including practical experience with technologies such as LLMs, Computer Vision and deep learning workflows. You are comfortable designing, training and evaluating models, and understand how to deploy scalable AI solutions within production environments.
You enjoy working collaboratively across disciplines and can communicate effectively with both technical and non-technical stakeholders. Whether working with engineers, researchers or collections specialists, you are able to translate complex challenges into practical, well-designed solutions.
If you thrive in innovative, research-led environments and enjoy balancing experimentation with delivery, you will excel in this role. Experience working within the GLAM (galleries, libraries, archives and museums) sector or with digitisation workflows would be advantageous. Most importantly, you are motivated by using AI to unlock access to knowledge and create tools with lasting public and scientific value.
Benefits & conditions
- 27.5 days holiday plus 8 bank holidays (full-time equivalent)
- Generous defined contribution Natural History Museum Pension Scheme (employer contribution 4 to 10%)
- Season ticket, bicycle and rental loans
- Life insurance
- Free admission to our exhibitions and many other paid exhibitions at museums, galleries and institutions across London and the UK
- Staff discount at our Museum shops and cafes
- A wide variety of training initiatives and opportunities to build skills. Investing in staff development is important to us, and we are ambitious about helping staff to grow and fulfil their potential.