Data Engineer Collibra | Insurance Domain

Capgemini Engineering
Charing Cross, United Kingdom
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Charing Cross, United Kingdom

Tech stack

Agile Methodologies
Azure
Computer Security
Databases
Information Engineering
Data Governance
ETL
Data Transformation
Data Structures
Relational Databases
IBM InfoSphere DataStage
Software Debugging
Distributed Data Store
Ganglia (Software)
GitHub
Python
Meta-Data Management
Microsoft SQL Server
MongoDB
Neo4j
NoSQL
Oracle Applications
Scrum
SQL Server Integration Services
Teradata
Unstructured Data
Parquet
Data Processing
Netezza
Spark
Git
Data Lake
Collibra
Cassandra
Extreme Programming (XP)
Cosmos DB
Software Version Control
Data Pipelines
Databricks

Job description

Design and implement scalable data pipelines using Azure Data Factory, Databricks, Synapse Analytics, and Azure Data Lake.
Develop and optimize data transformation workflows using Python, R, or Scala on Azure Databricks or Apache Spark.
Integrate and manage metadata using Collibra for effective data governance and cataloging.
Handle structured, semi-structured, and unstructured data to extract insights and identify linkages across datasets.
Lead technical delivery and mentor junior engineers on data engineering best practices.
Optimize Spark jobs and debug performance issues using tools like Ganglia UI.
Design efficient data structures for storage and querying, including formats like Parquet and Delta Lake.
Work across multiple database technologies: RDBMS (MS SQL Server, Oracle), MPP (Teradata, Netezza), and NoSQL (MongoDB, Cassandra, Neo4j, Cosmos DB, Gremlin).
Ensure secure and compliant data handling aligned with Information Security principles.
Collaborate in Agile teams and use Git-based workflows for version control and code management.

Requirements

Minimum 7 years of hands-on experience in Azure Data Engineering.
Strong working knowledge of Collibra for data governance and metadata management.
Proven experience in the insurance domain is highly desirable.
Proficient in Python, R, or Scala for data transformation and analysis.
Deep understanding of NoSQL databases and distributed data processing.
Experience with traditional ETL tools such as Informatica, IBM DataStage, or Microsoft SSIS.
Skilled in working with large and complex codebases using GitHub and Gitflow.
Effective communicator with strong stakeholder management capabilities.
Familiarity with Agile methodologies including Scrum, XP, and Kanban.
Preferred certifications: Microsoft Certified: Azure Data Engineer Associate and Collibra Certified Ranger (or equivalent).

About the company

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you'd like, where you'll be supported and inspired by a collaborative community of colleagues around the world, and where you'll be able to reimagine what's possible. Join us and help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world while creating tangible impact for enterprises and society. It is a responsible and diverse group of 350,000 team members in more than 50 countries. With a strong heritage of over 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, cloud, and data, combined with its deep industry expertise and partner ecosystem. The Group reported 2023 global revenues of EUR 22.5 billion. Get The Future You Want | www.capgemini.co
