Data Engineer (Collibra) | Insurance Domain
Job description
- Design and implement scalable data pipelines using Azure Data Factory, Databricks, Synapse Analytics, and Azure Data Lake.
- Develop and optimize data transformation workflows in Python, R, or Scala on Azure Databricks or Apache Spark.
- Integrate and manage metadata in Collibra for effective data governance and cataloging.
- Handle structured, semi-structured, and unstructured data to extract insights and identify linkages across datasets.
- Lead technical delivery and mentor junior engineers on data engineering best practices.
- Optimize Spark jobs and debug performance issues using tools such as the Ganglia UI.
- Design efficient data structures for storage and querying, including columnar formats such as Parquet and Delta Lake.
- Work across multiple database technologies: RDBMS (MS SQL Server, Oracle), MPP (Teradata, Netezza), and NoSQL (MongoDB, Cassandra, Neo4j, Azure Cosmos DB, Gremlin).
- Ensure secure and compliant data handling aligned with information security principles.
- Collaborate in Agile teams and use Git-based workflows for version control and code management.
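The responsibilities above include handling semi-structured data and shaping it for columnar storage such as Parquet. As one minimal, self-contained sketch of that kind of transformation (the record shape and field names here are hypothetical, invented for illustration, not taken from the posting):

```python
import json

# Hypothetical semi-structured policy records, as a pipeline might
# receive them from an upstream insurance system (illustrative only).
raw = """
[
  {"policy_id": "P-100", "holder": {"name": "A. Smith", "region": "EU"},
   "claims": [{"id": "C-1", "amount": 1200.0}]},
  {"policy_id": "P-101", "holder": {"name": "B. Jones", "region": "US"},
   "claims": []}
]
"""

def flatten(record: dict) -> list[dict]:
    """Flatten one nested policy record into one row per claim --
    a flat shape that maps cleanly onto columnar formats like Parquet."""
    base = {
        "policy_id": record["policy_id"],
        "holder_name": record["holder"]["name"],
        "region": record["holder"]["region"],
    }
    rows = []
    # Emit one row per claim; a claimless policy still yields one row
    # with null claim columns, so no policy is silently dropped.
    for claim in record["claims"] or [None]:
        row = dict(base)
        row["claim_id"] = claim["id"] if claim else None
        row["claim_amount"] = claim["amount"] if claim else None
        rows.append(row)
    return rows

rows = [r for rec in json.loads(raw) for r in flatten(rec)]
print(rows[0]["policy_id"], rows[0]["claim_amount"])  # P-100 1200.0
```

In a real Azure Databricks pipeline this flattening would typically be expressed with Spark operations (e.g. exploding nested arrays) rather than plain Python, but the shape of the transformation is the same.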
Requirements
- Minimum of 7 years of hands-on experience in Azure data engineering.
- Strong working knowledge of Collibra for data governance and metadata management.
- Proven experience in the insurance domain is highly desirable.
- Proficiency in Python, R, or Scala for data transformation and analysis.
- Deep understanding of NoSQL databases and distributed data processing.
- Experience with traditional ETL tools such as Informatica, IBM DataStage, or Microsoft SSIS.
- Skilled in working with large, complex codebases using GitHub and Gitflow.
- Effective communicator with strong stakeholder management skills.
- Familiarity with Agile methodologies, including Scrum, XP, and Kanban.
- Preferred certifications: Microsoft Certified: Azure Data Engineer Associate and Collibra Ranger (or equivalent).