Databricks Data Engineer

Capgemini
Manchester, United Kingdom
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
Manchester, United Kingdom

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Automation of Tests
Unit Testing
Azure
Code Review
Continuous Integration
Information Engineering
Data Transformation
Data Security
Data Systems
Data Warehousing
DevOps
Python
Software Engineering
SQL Databases
Data Logging
Google Cloud Platform
Snowflake
Spark
GIT
SC Clearance
Data Lake
Data Management
Machine Learning Operations
Software Version Control
Data Pipelines
Databricks

Job description

As a Databricks Data Engineer at Capgemini, you will design, build and operate reliable, scalable data pipelines and lakehouse solutions that enable analytics, AI and GenAI. You will work hands-on with Databricks, Apache Spark and the Azure data ecosystem to ingest, transform and serve trusted data products, taking ownership from development through to production support and continuous improvement.

You will be part of the Data Platforms team within the Insights and Data Global Practice, which has seen strong growth and continued success across a variety of projects and sectors. Data Platforms is the home of the Data Engineers, Platform Engineers, Solutions Architects and Business Analysts focused on driving our customers' digital and data transformation journeys using modern cloud platforms. We specialise in the latest frameworks, reference architectures and technologies on AWS, Azure and GCP, along with data platforms such as Databricks and Snowflake.

Please Note: Security Clearance: To be successfully appointed to this role, you must be eligible to obtain Security Check (SC) clearance. To obtain SC clearance, the successful applicant must have resided continuously within the United Kingdom for the last 5 years, and must meet other criteria and requirements.

Throughout the recruitment process, you will be asked questions about your security clearance eligibility such as, but not limited to, country of residence and nationality. Some posts are restricted to sole UK Nationals for security reasons; therefore, you may be asked about your citizenship in the application process.

The Focus Of Your Role

As a Databricks Data Engineer with an Azure and Databricks focus, you will be an integral part of our team dedicated to building scalable and secure data platforms. You will leverage Databricks, Apache Spark and Azure services to develop and optimise batch and streaming pipelines, implement Delta Lake lakehouse patterns, and deliver well-governed data sets that power reporting, analytics and AI/ML use cases.

  • Build and maintain data pipelines and lakehouse solutions: Use Databricks and Apache Spark to ingest, transform and curate data in Azure Data Lake Storage (ADLS) and Delta Lake.
  • Implement data modelling, quality and governance: Develop scalable data models, apply validation/quality checks, and follow governance practices to ensure reliable and auditable data products.
  • Enable AI/ML and analytics use cases: Prepare curated datasets and features, collaborate with data scientists, and integrate pipelines with ML workflows (e.g., MLflow) where required.
  • Monitor and optimise jobs and clusters: Tune Spark performance, improve reliability, manage costs, and implement observability (logging, alerting, SLAs) for production workloads.
  • Collaborate across teams: Work with business analysts, platform engineers, data scientists and DevOps to deliver secure, well-tested data solutions in an agile environment.
  • Apply engineering best practices: Use version control, code review, automated testing and CI/CD; keep current with Databricks capabilities and data engineering patterns.
  • Be a Databricks advocate: Share knowledge, contribute to accelerators and standards, and pursue Databricks certification/champion pathways.
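The validation/quality checks mentioned in the responsibilities above can be illustrated with a minimal sketch. This is plain Python rather than Spark for brevity, and every name in it (`validate_records`, the field list, the sample data) is hypothetical, not part of Capgemini's or Databricks' actual tooling:

```python
# Hypothetical sketch: split raw records into valid and rejected sets,
# so rejected rows can be logged and audited rather than silently dropped.

def validate_records(records, required_fields):
    """Return (valid, rejected); a record is valid when every required
    field is present and non-null. Rejected records carry a reason."""
    valid, rejected = [], []
    for rec in records:
        missing = [f for f in required_fields if rec.get(f) is None]
        if missing:
            rejected.append({"record": rec, "reason": f"missing: {missing}"})
        else:
            valid.append(rec)
    return valid, rejected

raw = [
    {"id": 1, "amount": 10.5, "currency": "GBP"},
    {"id": 2, "amount": None, "currency": "GBP"},  # fails the null check
]
good, bad = validate_records(raw, required_fields=["id", "amount", "currency"])
print(len(good), len(bad))  # 1 1
```

In a Databricks pipeline the same idea is usually expressed as DataFrame filters or expectations, with the rejected rows written to a quarantine table for auditability.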

What You'll Bring

You will bring strong, hands-on experience delivering modern data engineering solutions within complex environments, with an understanding of regulatory obligations and the need for trusted, mission-critical data. You will be able to build secure, scalable and well-governed pipelines and lakehouse data products that support analytics, AI and GenAI while meeting requirements for data protection, sovereignty, transparency and auditability.

You will be comfortable collaborating with stakeholders to translate outcomes into robust data solutions, and you will bring strong engineering discipline, documentation and teamwork skills to help teams deliver sustainable data capabilities.

Requirements

  • A minimum of 5 years' experience as a Data Engineer, including hands-on delivery of Databricks solutions in production environments.
  • Strong expertise in Databricks, Apache Spark and Delta Lake, with good understanding of lakehouse and data warehousing concepts.
  • Experience with Microsoft Azure, including ADLS Gen2, Azure Databricks, and orchestration tooling such as Azure Data Factory (or similar).
  • Proficiency in Python and SQL, with strong software engineering practices (Git, code review, unit testing, CI/CD) and an ability to troubleshoot production issues.
  • A continuous learning mindset, ideally with progress toward Databricks certification (e.g., Data Engineer Associate/Professional) or equivalent experience.
  • Relevant certifications (desirable): Databricks Data Engineer and/or Microsoft Azure data certifications.

About the company

Capgemini is one of the world's leading providers of management and IT consulting, technology services and digital transformation. As a driver of innovation, the company supports its clients with their complex challenges around cloud, digital and platforms.
