Databricks Data Engineer

Capgemini
Manchester, United Kingdom
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
Manchester, United Kingdom

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Automation of Tests
Unit Testing
Azure
Code Review
Continuous Integration
Information Engineering
Data Transformation
Data Security
Data Systems
Data Warehousing
DevOps
Python
Software Engineering
SQL Databases
Data Logging
Google Cloud Platform
Snowflake
Spark
GIT
SC Clearance
Data Lake
Data Management
Machine Learning Operations
Software Version Control
Data Pipelines
Databricks

Job description

As a Databricks Data Engineer at Capgemini, you will design, build and operate reliable, scalable data pipelines and lakehouse solutions that enable analytics, AI and GenAI. You will work hands-on with Databricks, Apache Spark and the Azure data ecosystem to ingest, transform and serve trusted data products, taking ownership from development through to production support and continuous improvement.

You will be part of the Data Platforms team within the Insights and Data Global Practice, which has seen strong growth and continued success across a variety of projects and sectors. Data Platforms is the home of the Data Engineers, Platform Engineers, Solutions Architects and Business Analysts focused on driving our customers' digital and data transformation journeys using modern cloud platforms. We specialise in the latest frameworks, reference architectures and technologies on AWS, Azure and GCP, along with data platforms such as Databricks and Snowflake.

Please Note: Security Clearance: To be successfully appointed to this role, you must be eligible to obtain Security Check (SC) clearance. To obtain SC clearance, the successful applicant must have resided continuously within the United Kingdom for the last 5 years, and must meet other criteria and requirements.

Throughout the recruitment process, you will be asked questions about your security clearance eligibility such as, but not limited to, country of residence and nationality. Some posts are restricted to sole UK Nationals for security reasons; therefore, you may be asked about your citizenship in the application process.

The Focus Of Your Role

As a Databricks Data Engineer with an Azure and Databricks focus, you will be an integral part of our team dedicated to building scalable and secure data platforms. You will leverage Databricks, Apache Spark and Azure services to develop and optimise batch and streaming pipelines, implement Delta Lake lakehouse patterns, and deliver well-governed data sets that power reporting, analytics and AI/ML use cases.

  • Build and maintain data pipelines and lakehouse solutions: Use Databricks and Apache Spark to ingest, transform and curate data in Azure Data Lake Storage (ADLS) and Delta Lake.
  • Implement data modelling, quality and governance: Develop scalable data models, apply validation/quality checks, and follow governance practices to ensure reliable and auditable data products.
  • Enable AI/ML and analytics use cases: Prepare curated datasets and features, collaborate with data scientists, and integrate pipelines with ML workflows (e.g., MLflow) where required.
  • Monitor and optimise jobs and clusters: Tune Spark performance, improve reliability, manage costs, and implement observability (logging, alerting, SLAs) for production workloads.
  • Collaborate across teams: Work with business analysts, platform engineers, data scientists and DevOps to deliver secure, well-tested data solutions in an agile environment.
  • Apply engineering best practices: Use version control, code review, automated testing and CI/CD; keep current with Databricks capabilities and data engineering patterns.
  • Be a Databricks advocate: Share knowledge, contribute to accelerators and standards, and pursue Databricks certification/champion pathways.
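The validation/quality checks mentioned in the responsibilities above can be illustrated with a minimal sketch. This is plain Python rather than Spark for brevity, and every name in it (`validate_records`, the field list, the sample data) is hypothetical, not part of Capgemini's or Databricks' actual tooling:

```python
# Hypothetical sketch: split raw records into valid and rejected sets,
# so rejected rows can be logged and audited rather than silently dropped.

def validate_records(records, required_fields):
    """Return (valid, rejected); a record is valid when every required
    field is present and non-null. Rejected records carry a reason."""
    valid, rejected = [], []
    for rec in records:
        missing = [f for f in required_fields if rec.get(f) is None]
        if missing:
            rejected.append({"record": rec, "reason": f"missing: {missing}"})
        else:
            valid.append(rec)
    return valid, rejected

raw = [
    {"id": 1, "amount": 10.5, "currency": "GBP"},
    {"id": 2, "amount": None, "currency": "GBP"},  # fails the null check
]
good, bad = validate_records(raw, required_fields=["id", "amount", "currency"])
print(len(good), len(bad))  # 1 1
```

In a Databricks pipeline the same idea is usually expressed as DataFrame filters or expectations, with the rejected rows written to a quarantine table for auditability.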

What You'll Bring

You will bring strong, hands-on experience delivering modern data engineering solutions within complex environments, with an understanding of regulatory obligations and the need for trusted, mission-critical data. You will be able to build secure, scalable and well-governed pipelines and lakehouse data products that support analytics, AI and GenAI while meeting requirements for data protection, sovereignty, transparency and auditability.

You will be comfortable collaborating with stakeholders to translate outcomes into robust data solutions, and you will bring strong engineering discipline, documentation and teamwork skills to help teams deliver sustainable data capabilities.

Requirements

  • A minimum of 5 years' experience as a Data Engineer, including hands-on delivery of Databricks solutions in production environments.
  • Strong expertise in Databricks, Apache Spark and Delta Lake, with good understanding of lakehouse and data warehousing concepts.
  • Experience with Microsoft Azure, including ADLS Gen2, Azure Databricks, and orchestration tooling such as Azure Data Factory (or similar).
  • Proficiency in Python and SQL, with strong software engineering practices (Git, code review, unit testing, CI/CD) and an ability to troubleshoot production issues.
  • A continuous learning mindset, ideally with progress toward Databricks certification (e.g., Data Engineer Associate/Professional) or equivalent experience.
  • Relevant certifications (desirable): Databricks Data Engineer and/or Microsoft Azure data certifications.

About the company

Capgemini is one of the world's leading providers of management and IT consulting, technology services and digital transformation. As a driver of innovation, the company supports its clients with their complex challenges around cloud, digital and platforms.
