Data Engineer
Job description
As a Data Engineer at Lunio, you will play a critical role in building, optimising, and maintaining the robust data infrastructure and scalable data pipelines that power machine learning, MLOps, analytics, and product intelligence. You will be responsible for building reliable, scalable, and observable production-grade data systems with clear SLAs, data freshness guarantees, and monitoring standards.
You will design and operate batch and streaming pipelines that produce reliable, ML-ready datasets for product, analytics, and inference use cases. You will own end-to-end data pipeline development, the design of clean and well-governed data assets that accelerate ML development, and the reliability, observability, and cost-efficiency of the platform in production.
Responsibilities
- Data Pipeline Engineering: Own the development and optimisation of data pipelines (batch and streaming) that reliably ingest and transform high-volume clickstream and external data.
- Data Warehouse Ownership: Own the implementation and performance optimisation of our multi-node, TB-scale Redshift data warehouse, including data modelling, storage design, and cost-efficient query performance.
- ML-Ready Data Asset Delivery: Own the delivery of curated, versioned, ML-ready data assets, ensuring consistency, usability, and alignment with downstream use cases.
- Reliability & Observability: Own pipeline reliability by defining SLAs, ensuring data freshness, and implementing monitoring, data quality validation, observability, alerting, and structured incident response processes.
- ML Pipelines: Implement and operationalise feature computation workflows, including the development and maintenance of feature store infrastructure, to support model training and inference in collaboration with Data Science and Cloud/Platform teams.
Requirements
- Proven track record of building and operating production-grade data pipelines and data warehouse solutions in high-scale environments.
- Proven ownership of pipeline reliability, SLAs, monitoring, and incident debugging in production systems.
- Deep proficiency in Python and SQL, including experience working with large-scale event data.
- Strong hands-on experience with AWS data infrastructure (S3, Redshift, Glue, Kinesis, Lambda, etc.), including performance and cost optimisation.
- Experience building streaming/event-driven data pipelines (e.g., Kinesis or similar technologies).
You'll really thrive in this role if you have:
- Experience in high-volume event data environments (adtech, fraud, cybersecurity).
- Experience collaborating with infrastructure or platform teams and working with infrastructure-as-code tools (e.g., Terraform).
- Familiarity with operating in security- and compliance-aware environments (e.g., SOC2, ISO, IAM best practices, data classification).