Data Engineer

Lunio
Charing Cross, United Kingdom
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
£90K

Job location

Charing Cross, United Kingdom

Tech stack

Query Performance
Amazon Web Services (AWS)
Data analysis
Cloud Computing
Data Infrastructure
Data Systems
Data Warehousing
Software Debugging
Identity and Access Management
Python
Machine Learning
Node.js
Performance Tuning
SQL Databases
Data Classification
Infrastructure Automation Frameworks
Storage Technologies
Machine Learning Operations
Terraform
Data Pipelines

Job description

As a Data Engineer at Lunio, you will play a critical role in building, optimising, and maintaining the robust data infrastructure and scalable data pipelines that power machine learning, MLOps, analytics, and product intelligence. You will be responsible for building reliable, scalable, and observable production-grade data systems with clear SLAs, data freshness guarantees, and monitoring standards.

You will design and operate batch and streaming pipelines that produce reliable, ML-ready datasets for product, analytics, and inference use cases. You will own end-to-end data pipeline development, the design of clean and well-governed data assets that accelerate ML development, and the reliability, observability, and cost-efficiency of the platform in production.

  • Data Pipeline Engineering: Own the development and optimisation of data pipelines (batch and streaming) that reliably ingest and transform high-volume clickstream and external data.

  • Data Warehouse Ownership: Own the implementation and performance optimisation of our multi-node, TB-scale Redshift data warehouse, including data modelling, storage design, and cost-efficient query performance.
  • ML-Ready Data Asset Delivery: Own the delivery of curated, versioned, ML-ready data assets, ensuring consistency, usability, and alignment with downstream use cases.
  • Reliability & Observability: Own pipeline reliability by defining SLAs, ensuring data freshness, and implementing monitoring, data quality validation, observability, alerting, and structured incident response processes.
  • ML Pipelines: Implement and operationalise feature computation workflows, including the development and maintenance of feature store infrastructure, to support model training and inference in collaboration with Data Science and Cloud/Platform teams.

Requirements

  • Proven track record of building and operating production-grade data pipelines and data warehouse solutions in high-scale environments.
  • Proven ownership of pipeline reliability, SLAs, monitoring, and incident debugging in production systems.
  • Deep proficiency in Python and SQL, working with large-scale event data.
  • Strong hands-on experience with AWS data infrastructure (S3, Redshift, Glue, Kinesis, Lambda, etc.), including performance and cost optimisation.
  • Experience building streaming/event-driven data pipelines (e.g., Kinesis or similar technologies).

You'll really thrive in this role if you have:

  • Experience in high-volume event data environments (adtech, fraud, cybersecurity).
  • Experience collaborating with infrastructure or platform teams and working with infrastructure-as-code tools (e.g., Terraform).
  • Familiarity with operating in security- and compliance-aware environments (e.g., SOC2, ISO, IAM best practices, data classification).
