Sr. AI Data Engineer (UK Remote)
Job description
AI and data science are integral to our success and ambitious product roadmap, and great AI begins with great data. Joining us as a Senior AI Data Engineer means you'll become part of a global team of proactive, supportive, and independent professionals committed to delivering sophisticated, well-structured AI and data systems. You'll help pioneer our next-generation data and AI pipelines to scale the team's impact, and you'll collaborate with teams across Turnitin to integrate AI and data science into a broad suite of products designed to enhance learning, teaching, and academic integrity.
Your role as a Senior AI Data Engineer encompasses the following key responsibilities:
- AI Data Infrastructure & Pipeline Management for Applied AI: Design, build, and operate scalable real-time data pipelines that support ongoing Applied AI model training. Deploy and maintain robust data infrastructure using AI techniques and engineering best practices to ensure continuous model improvement and deployment cycles.
- Data Collection: Execute initiatives for collecting, normalizing, and storing data across multiple sources, including external LLM providers (a minimal illustrative sketch follows this list).
- Collaboration: Partner with AI R&D, Applied AI, and Data Platform teams to ensure seamless data flow and consistent quality standards, and work with stakeholders to collect, curate, and catalog high-quality datasets that directly support Applied AI retraining workflows and business objectives.
- AI R&D Support: Provide secondary support to AI Research & Development efforts by applying advanced data warehousing and engineering technologies. Contribute to exploratory data initiatives that uncover insights from Turnitin's extensive data resources.
- Communication: Maintain clear communication channels across teams, ensuring alignment with company vision while sharing insights on data infrastructure needs and potential innovations.
- Technology Evolution: Stay current with emerging tools and methodologies in AI data engineering, bringing recommendations to enhance our AI data infrastructure and capabilities.
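For illustration only, the sketch below shows the kind of collection-and-normalization step described in the Data Collection responsibility above. The provider labels, payload fields, and output schema are hypothetical stand-ins rather than Turnitin's actual pipeline; a production version would read from provider APIs and land data in a warehouse or object store rather than a local JSONL file.

```python
"""Minimal sketch: normalize raw LLM-provider payloads into one common schema.
All field names and provider labels here are hypothetical examples."""

import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class NormalizedRecord:
    provider: str      # hypothetical label, e.g. "openai_like"
    model: str
    prompt: str
    completion: str
    collected_at: str  # ISO-8601 timestamp


def normalize(provider: str, raw: dict) -> NormalizedRecord:
    """Map a provider-specific payload onto the common schema."""
    if provider == "openai_like":
        text = raw["choices"][0]["message"]["content"]
    elif provider == "anthropic_like":
        text = raw["content"][0]["text"]
    else:
        raise ValueError(f"unknown provider: {provider}")
    return NormalizedRecord(
        provider=provider,
        model=raw.get("model", "unknown"),
        prompt=raw.get("prompt", ""),
        completion=text,
        collected_at=datetime.now(timezone.utc).isoformat(),
    )


def write_jsonl(records: list[NormalizedRecord], path: Path) -> None:
    """Append normalized records as JSON Lines, ready for downstream loading."""
    with path.open("a", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(asdict(record)) + "\n")


if __name__ == "__main__":
    sample = {"model": "demo-model", "prompt": "Hi",
              "choices": [{"message": {"content": "Hello!"}}]}
    write_jsonl([normalize("openai_like", sample)], Path("normalized.jsonl"))
```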
Requirements
- At least 4 years of experience in data engineering, ideally focused on AI/ML data infrastructure or enabling and accelerating AI R&D.
- Strong proficiency in Python, SQL, and Infrastructure as Code (Terraform, CloudFormation), with additional experience in modern orchestration frameworks (Airflow, Prefect, or dbt).
- Proficiency with cloud-native data platforms (AWS, Azure, GCP) and vector databases (Pinecone, Weaviate, Qdrant, or Chroma).
- Experience with MLOps tools and platforms (Hugging Face, SageMaker, Bedrock, Vertex AI), experiment tracking (MLflow, Weights & Biases), and model deployment pipelines.
- Experience with Large Language Models (LLMs), embedding generation, retrieval-augmented generation (RAG) systems, and frameworks for orchestrating LLM interactions (LiteLLM, Langfuse, LangChain, LlamaIndex); a toy retrieval sketch appears at the end of this section.
- Strong problem-solving, analytical, and communication skills, with the ability to design scalable AI data systems and collaborate effectively in cross-functional teams.
Preferred qualifications
- 6+ years of experience in data engineering with a focus on AI and machine learning projects.
- Experience in a technical leadership or mentorship role.
- Experience in education, EdTech, or academic integrity sectors.
- Experience using AI coding tools (Cursor, Claude Code, GitHub Copilot) for accelerated development.
- Familiarity with natural language processing, computer vision, or multimodal AI applications.
- Experience with data visualization tools (e.g., Streamlit) and data reporting.
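As a toy illustration of the embed-and-retrieve loop behind the RAG systems mentioned above, the sketch below uses a hashed "embedding" and cosine similarity in plain Python. This is not how a production system would work: a real pipeline would use a model-backed embedding service and one of the vector databases listed in the requirements.

```python
# Toy sketch only: hashed "embeddings" stand in for a real embedding model,
# and an in-memory list stands in for a vector database.
import hashlib
import math


def embed(text: str, dim: int = 256) -> list[float]:
    """Hash word tokens into a fixed-size unit vector (illustrative only)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        token = token.strip(".,!?")
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus documents most similar to the query (cosine)."""
    q = embed(query)
    return sorted(
        corpus,
        key=lambda doc: sum(a * b for a, b in zip(q, embed(doc))),
        reverse=True,
    )[:k]


if __name__ == "__main__":
    docs = [
        "Airflow schedules batch pipelines",
        "Vector databases store embeddings",
        "dbt models transform warehouse tables",
    ]
    print(retrieve("store embeddings", docs, k=1))
```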