Lead Data Engineer (Databricks)
Job description
We're looking for a Lead Data Engineer to take technical ownership of modern, scalable data platforms that turn complex environmental and telemetry data into trusted insights for regulators, scientists, and operational teams.
What you'll be doing
As the Lead Data Engineer on our River Water Quality Programme, you'll design, build, and optimise Databricks-based data pipelines that enable timely, reliable and well-governed data across the organisation.
You'll:
- Design and develop scalable batch and streaming data pipelines using Databricks, Spark and Delta Lake
- Implement medallion architecture (bronze/silver/gold) to ensure data quality, lineage and efficient consumption
- Optimise Spark jobs and workflows to meet near real-time monitoring SLAs
- Build and maintain streaming pipelines using Structured Streaming and Delta Live Tables
- Ensure strong data governance, including schema enforcement, validation rules and ACID compliance
- Orchestrate and automate pipelines using Databricks Jobs, APIs or cloud-native schedulers
- Collaborate with data scientists, analysts and environmental specialists to deliver high-quality, well-documented datasets
- Monitor, troubleshoot and continuously improve production pipelines for reliability, performance and cost efficiency
- Champion best practices across CI/CD, testing, version control and documentation
- Act as a technical leader and mentor, raising standards across the data engineering community
What you'll bring
Requirements
- Deep expertise in Databricks (Spark / PySpark, Delta Lake, Unity Catalog, Workflows)
- Strong experience building ETL/ELT pipelines, both batch and streaming
- Advanced Python and SQL skills
- Hands-on experience with cloud platforms (Azure, AWS or GCP)
- Solid understanding of data modelling and modern architectures
- Strong knowledge of data governance, security and compliance
Desirable
- Experience with orchestration tools such as Airflow
- Familiarity with CI/CD pipelines, Git workflows and automated testing
- Exposure to real-time, IoT or environmental telemetry data
Experience & Qualifications
- Bachelor's degree in Computer Science, Data Engineering or a related discipline
- 5+ years' experience in data engineering roles, including 3+ years with Databricks
- Proven ability to lead technically and mentor engineers
What you'll get