Data Engineer
Job description
We're looking for a Data Engineer to join our Data and AI Engineering team and help build the pipelines, transformations, and infrastructure that power CUBE's regulatory intelligence platform.
This is hands-on engineering work with real scope. You'll be designing and building data pipelines that ingest, process, and serve complex regulatory content, turning unstructured source data into clean, governed, AI-ready assets. Your work will sit at the intersection of data infrastructure and product capability, directly enabling the analytical and AI workloads that define what CUBE does.
You'll be working in an Azure-native environment, collaborating closely with data architects, platform engineers, and AI/ML teams. We're building modern, scalable infrastructure, and we want engineers who care about doing it properly.
We're a post-acquisition business integrating multiple platforms, which means there's genuine complexity to work through and genuine opportunity to shape how things get built. If you want greenfield work alongside legacy reality, this is it.
Responsibilities
- Design and build data pipelines - Build, maintain, and optimise data pipelines that ingest, transform, and deliver structured and unstructured regulatory content across our platform estate.
- Transform and model data - Apply transformation logic that converts raw source data into clean, reliable, semantically consistent assets ready for analytics and AI consumption.
- Implement data quality and observability practices - Instrument pipelines with monitoring, alerting, and data quality checks that catch problems early and maintain platform trust.
- Collaborate with architects and platform engineers - Work closely with the Principal Data Architect and Head of Data Platform to implement patterns that align with our architectural direction.
- Support integration and migration work - Contribute to source-to-target mapping and pipeline development for ongoing platform consolidation.
- Champion engineering best practices - Write code that others can maintain: version-controlled, tested, documented, and built for production.
- Contribute to platform scalability and cost efficiency - Identify and resolve performance bottlenecks, redundancies, and inefficiencies in existing pipeline infrastructure.
- Build for AI readiness - Understand how downstream AI/ML workloads consume data and design pipelines that support feature engineering, model training, and inference requirements.
Requirements
Core
- 3+ years of experience in data engineering or a closely related role.
- Strong SQL and Python skills: you write production-quality code, not just scripts.
- Hands-on experience building and maintaining data pipelines in cloud environments.
- Familiarity with ETL/ELT patterns, orchestration tools (e.g. Apache Airflow, Azure Data Factory), and data transformation frameworks (e.g. dbt).
- Experience working with both structured and unstructured or semi-structured data.
- Understanding of data quality principles: you know what a bad pipeline looks like and how to fix it.
- Comfort with version control, CI/CD practices, and engineering-grade delivery.
Preferred
- Experience with Microsoft Azure data services - Azure Data Factory, Synapse Analytics, Data Lake Storage, Fabric.
- Familiarity with Apache Spark for large-scale data processing.
- Exposure to data modelling concepts - normalisation, dimensional design, entity-relationship patterns.
- Background in platform integration, data migration, or M&A consolidation work.
- Experience building pipelines that support AI/ML workloads, including feature stores or model training infrastructure.
- Knowledge of data governance practices - lineage, cataloguing, access control, compliance.
- Familiarity with infrastructure-as-code tooling (e.g. Terraform).
- Exposure to regulatory, financial services, or compliance data domains.
Mindset
- You care about the quality of your output - not just whether the pipeline runs, but whether it's maintainable, observable, and trustworthy.
- You're comfortable working with ambiguity and systems that weren't built the way you'd have built them.
- You communicate clearly with both engineers and non-engineers.
- You take ownership - when something breaks, you fix it; when something could be better, you say so.
If you are passionate about leveraging technology to transform regulatory compliance and meet the qualifications outlined above, we invite you to apply. Please submit your resume detailing your relevant experience and interest in CUBE.