AI Data Engineer

Zenith UK
Charing Cross, United Kingdom
2 days ago

Role details

Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
£160K

Job location

Charing Cross, United Kingdom

Tech stack

Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Azure
Big Data
Cloud Computing
Computer Programming
Databases
Data Validation
Information Engineering
Data Files
Data Governance
Data Infrastructure
ETL
Data Systems
Data Warehousing
Python
Machine Learning
TensorFlow
SQL Databases
Data Streaming
Management of Software Versions
Cloud Platform System
Feature Engineering
PyTorch
Large Language Models
Spark
Generative AI
Build Management
Containerization
Kubernetes
Information Technology
HuggingFace
Real Time Data
Kafka
Machine Learning Operations
Data Pipelines
Docker

Job description

As a Senior Data Engineer for AI/ML, you will be the architect and builder of the data infrastructure that feeds our intelligent systems. Your responsibilities will include:

  • Design and Build Scalable Data Pipelines: Architect, implement, and optimize robust, high-performance real-time and batch ETL pipelines to ingest, process, and transform massive datasets for LLMs and foundational AI models.
  • Cloud-Native Innovation: Leverage your deep expertise across AWS, Azure, and/or GCP to build cloud-native data solutions, ensuring efficiency, scalability, and cost-effectiveness.
  • Power Generative AI: Develop and manage specialized data flows for generative AI applications, including integrating with vector databases and constructing sophisticated RAG pipelines.
  • Champion Data Governance & Ethical AI: Implement best practices for data quality, lineage, privacy, and security, ensuring our AI systems are developed and used responsibly and ethically.
  • Tooling the Future: Get hands-on with cutting-edge technologies like Hugging Face, PyTorch, TensorFlow, Apache Spark, Apache Airflow, and other modern data and ML frameworks.
  • Collaborate and Lead: Partner closely with ML Engineers, Data Scientists, and Researchers to understand their data needs, provide technical leadership, and translate complex requirements into actionable data strategies.
  • Optimize and Operate: Monitor, troubleshoot, and continuously optimize data pipelines and infrastructure for peak performance and reliability in production environments.

Requirements

  • Extensive Data Engineering Experience: Proven track record (3+ years) in designing, building, and maintaining large-scale data pipelines and data warehousing solutions.
  • Cloud Platform Mastery: Expert-level proficiency with at least one major cloud provider (GCP preferred, AWS, or Azure), including its data, compute, and storage services.
  • Programming Prowess: Strong programming skills in Python and SQL are essential.
  • Big Data Ecosystem Expertise: Hands-on experience with big data technologies like Apache Spark, Kafka, and data orchestration tools such as Apache Airflow or Prefect.
  • ML Data Acumen: Solid understanding of data requirements for machine learning models, including feature engineering, data validation, and dataset versioning.
  • Vector Database Experience: Practical experience working with vector databases (e.g., Pinecone, Milvus, Chroma) for embedding storage and retrieval.
  • Generative AI Familiarity: Understanding of data paradigms for LLMs, RAG architectures, and how data pipelines support fine-tuning or pre-training.
  • MLOps Principles: Familiarity with MLOps best practices for deploying and managing ML models in production.
  • Data Governance & Ethics: Experience implementing data governance frameworks, ensuring data quality, privacy, and compliance, with an awareness of ethical AI considerations.

Bonus Points If You Have:

  • Direct experience with the Hugging Face ecosystem, PyTorch, or TensorFlow for data preparation in an ML context.
  • Experience with real-time data streaming architectures.
  • Familiarity with containerization (Docker, Kubernetes).
  • Master's or Ph.D. in Computer Science, Data Engineering, or a related quantitative field.

About the company

At Zenith UK, we believe that fostering an inclusive culture where all talent can thrive makes our company stronger. It enables a greater exchange of ideas, which fosters innovation and creativity and enriches our perspective. We are committed to Publicis Groupe's wide variety of talent engagement and inclusion programming, and encourage our people to take an active role in continuing to drive positive change within our agency. This role presents an opportunity to engage deeply with MLOps, vector databases, and Retrieval-Augmented Generation (RAG) pipelines. If you are passionate about shaping the future of AI and thrive on complex, high-impact challenges, we encourage you to apply.
