Data Engineer

CUBE

Charing Cross, United Kingdom

8 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

Charing Cross, United Kingdom

Tech stack

Artificial Intelligence

Airflow

Azure

Big Data

Continuous Integration

Data as a Service

Data Architecture

Data Validation

Information Engineering

Data Governance

Data Infrastructure

ETL

Data Mapping

Data Transformation

Data Migration

Entity Relationship Models

Python

Standard SQL

Systems Integration

Workflow Management Systems

Cloud Platform System

Feature Engineering

Spark

Build Management

Data Lake

Semi-structured Data

Production Code

Terraform

Software Version Control

Data Pipelines

Job description

We're looking for a Data Engineer to join our Data and AI Engineering team and help build the pipelines, transformations, and infrastructure that power CUBE's regulatory intelligence platform.

This is hands-on engineering work with real scope. You'll be designing and building data pipelines that ingest, process, and serve complex regulatory content, turning unstructured source data into clean, governed, AI-ready assets. Your work will sit at the intersection of data infrastructure and product capability, directly enabling the analytical and AI workloads that define what CUBE does.

You'll be working in an Azure-native environment, collaborating closely with data architects, platform engineers, and AI/ML teams. We're building modern, scalable infrastructure, and we want engineers who care about doing it properly.

We're a post-acquisition business integrating multiple platforms, which means there's genuine complexity to work through and genuine opportunity to shape how things get built. If you want greenfield work alongside legacy reality, this is it.

Responsibilities

Design and build data pipelines - Build, maintain, and optimise data pipelines that ingest, transform, and deliver structured and unstructured regulatory content across our platform estate.
Transform and model data - Apply transformation logic that converts raw source data into clean, reliable, semantically consistent assets ready for analytics and AI consumption.
Implement data quality and observability practices - Instrument pipelines with monitoring, alerting, and data quality checks that catch problems early and maintain platform trust.
Collaborate with architects and platform engineers - Work closely with the Principal Data Architect and Head of Data Platform to implement patterns that align with our architectural direction.
Support integration and migration work - Contribute to source-to-target mapping and pipeline development for ongoing platform consolidation.
Champion engineering best practices - Write code that others can maintain: version-controlled, tested, documented, and built for production.
Contribute to platform scalability and cost efficiency - Identify and resolve performance bottlenecks, redundancies, and inefficiencies in existing pipeline infrastructure.
Build for AI readiness - Understand how downstream AI/ML workloads consume data and design pipelines that support feature engineering, model training, and inference requirements.

Requirements

Core

3+ years of experience in data engineering or a closely related role.
Strong SQL and Python skills: you write production-quality code, not just scripts.
Hands-on experience building and maintaining data pipelines in cloud environments.
Familiarity with ETL/ELT patterns, orchestration tools (e.g. Apache Airflow, dbt, Azure Data Factory), and data transformation frameworks.
Experience working with both structured and unstructured or semi-structured data.
Understanding of data quality principles: you know what a bad pipeline looks like and how to fix it.
Comfort with version control, CI/CD practices, and engineering-grade delivery.

Preferred

Experience with Microsoft Azure data services - Azure Data Factory, Synapse Analytics, Data Lake Storage, Fabric.
Familiarity with Apache Spark for large-scale data processing.
Exposure to data modelling concepts - normalisation, dimensional design, entity-relationship patterns.
Background in platform integration, data migration, or M&A consolidation work.
Experience building pipelines that support AI/ML workloads, including feature stores or model training infrastructure.
Knowledge of data governance practices - lineage, cataloguing, access control, compliance.
Familiarity with infrastructure-as-code tooling (e.g. Terraform).
Exposure to regulatory, financial services, or compliance data domains.

Mindset

You care about the quality of your output - not just whether the pipeline runs, but whether it's maintainable, observable, and trustworthy.
You're comfortable working with ambiguity and systems that weren't built the way you'd have built them.
You communicate clearly with both engineers and non-engineers.
You take ownership - when something breaks, you fix it; when something could be better, you say so.

If you are passionate about leveraging technology to transform regulatory compliance and meet the qualifications outlined above, we invite you to apply. Please submit your resume detailing your relevant experience and interest in CUBE.

About the company

CUBE is a global RegTech business defining and implementing the gold standard of regulatory intelligence for the financial services industry. We deliver our services through intuitive SaaS solutions, powered by AI, to simplify the complex and ever-changing world of compliance for our clients.

Why us? CUBE is a globally recognized brand at the forefront of Regulatory Technology. Our industry-leading SaaS solutions are trusted by the world's top financial institutions. In 2024, we achieved over 50% growth, both organically and through two strategic acquisitions. We're a fast-paced, high-performing team that thrives on pushing boundaries, continuously evolving our products, services, and operations. At CUBE, we don't just keep up; we stay ahead.

We believe our future is built by bold, ambitious individuals who are driven to make a real difference. Our "make it happen" culture empowers you to take ownership of your career and accelerate your personal and professional development from day one. With over 700 CUBERs across 19 countries spanning EMEA, the Americas, and APAC, we operate as one team with a shared mission to transform regulatory compliance. Diversity, collaboration, and purpose are the heartbeat of our success.

We were among the first to harness the power of AI in regulatory intelligence, and we continue to lead with our cutting-edge technology. At CUBE, you will work alongside some of the brightest minds in AI research and engineering, developing impactful solutions that are reshaping the world of regulatory compliance.