Senior AI Engineer
Job description
Data is core to our decision making, and we have one of the richest datasets in the industry. The Data Science team develops models which turn our data into actionable insights. It is our mission to leverage data to give the best value and experience to our customers and suppliers whilst increasing profitability and growth for Ocado Retail.
The Data Science team is responsible for understanding the complexities in the data, generating appropriate models to answer business questions, and delivering the outputs of these models to stakeholders in diverse areas of the business. It is our responsibility to work across functions to productionise these models so they are available to the right people, any time they need them. This could be in the form of a data table with the outputs of the model, or developing an interface to allow users to leverage models for their specific needs.
Our remit is to apply statistics, machine learning and artificial intelligence to our data at scale, and to do this we use Python, SQL, and the Google Cloud Platform (GCP). We are cross-functional by design, working with stakeholders across the business with varying degrees of technical expertise, meaning communicating complex ideas is core to the success of the Data Science team.
This is a new role within the Data Science team, with a focus on ownership of our AI development, implementation and quality. There will be the opportunity to shape how the business builds AI into products for internal users, suppliers and customers.
What you'll do
- Partner directly with non-technical commercial stakeholders and business operators to deeply understand their day-to-day challenges.
- Map out manual workflows to uncover prime opportunities for agentic automation.
- Shadow operational teams and grasp the true nuances of their processes.
- Architect, design, and deploy bespoke, multi-step agentic pipelines using advanced orchestration frameworks.
- Ensure seamless integration with enterprise systems and internal APIs.
- Establish a robust, automated evaluation strategy utilising testing frameworks like DeepEval or RAGAS.
- Build LLM-as-a-judge capabilities into CI/CD pipelines.
- Enforce strict deterministic and probabilistic guardrails to guarantee safe, reliable outputs before model interaction with live business processes.
- Collaborate continuously with the Data Science team and backend software engineers.
- Ensure the highly scalable, native GCP systems you design fit perfectly within the broader technology stack.
Requirements
- A highly communicative and empathetic engineer who genuinely enjoys working closely with non-technical stakeholders to uncover the root of business inefficiencies
- You have a proven ability to translate operational workflows into robust technical architectures, specifically designing stateful agentic systems that move well beyond basic RAG prototypes
- You possess a strong technical foundation in Python, backend API development, and the Google Cloud Platform ecosystem (Vertex AI, BigQuery, ADK)
- You are highly analytical and quality-driven, possessing practical experience mapping specific business criteria to technical evaluation metrics (e.g., factual accuracy, hallucination detection, contextual precision) to ensure AI outputs align with real-world needs
- You understand the critical balance between automated evaluation scoring and human-in-the-loop review, and know how to partner with domain experts to label edge cases and policy constraints
- A strategic thinker who embodies our "Learn fast" and "Craft smart" values, capable of toggling between big-picture commercial strategy and hands-on, detail-oriented engineering
- You have solid knowledge of statistical concepts and their applications
- You have great communication skills and a passion for data storytelling
- You have experience developing and implementing CI/CD practices, containerisation and comprehensive unit testing
Desirable
- Experience running discovery workshops, conducting stakeholder interviews, or using process mapping techniques to dissect complex business workflows before writing a single line of code
- A track record of successfully transitioning internal AI systems into business-critical operations, managing user adoption and change management alongside the technical deployment
- Practical familiarity with modern LLM observability and tracing tools (such as LangSmith, Braintrust, or Latitude) to monitor complex agent decisions and link production failures back to evaluation datasets
- Previous experience operating within a dynamic retail, supply chain, or e-commerce environment
- Experience setting up, hosting and maintaining LLMs, including fine-tuning them
- PhD or MSc in a relevant field
Benefits & conditions
- Shuttle service provided
- Company pension
- Private medical insurance
- Discounted gym membership
- Car scheme