Hadoop Spark Data Engineer
Job description
- Build, operate, monitor, and troubleshoot Hadoop clusters.
- Write scalable ETL processes using tools like Hive, Pig, and Spark.
- Develop and maintain data ingestion processes using Sqoop, Flume, or Kafka.
- Optimize MapReduce jobs and manage HDFS storage.
- Collaborate with data scientists and analysts to support data needs.
- Ensure data security and compliance with organizational policies.
- Create and maintain technical documentation and playbooks.
- Evaluate and integrate cloud-based big data solutions (AWS, GCP, Azure).
Requirements
Do you have experience in Spark?
Results-driven Hadoop Spark Data Engineer with strong expertise in designing and implementing scalable big data solutions using Scala and Apache Spark. Experienced in working with the Hadoop ecosystem, including HDFS, Hive, and YARN, to process and analyze large datasets efficiently. Skilled in building robust ETL pipelines, real-time data processing, and optimizing distributed systems for performance and reliability. Proficient in SQL, data modeling, and integrating data from multiple sources in cloud and on-prem environments.
- Proficient in Scala programming with strong expertise in functional programming concepts for building scalable data applications.
- Extensive experience in Apache Spark (Core, SQL, and Streaming) for processing large-scale distributed data efficiently.
- Strong knowledge of Hadoop ecosystem components including HDFS, YARN, Hive, and HBase.
- Skilled in designing and developing ETL pipelines and handling structured and unstructured big data.
- Experienced in performance tuning, data optimization, and working with distributed systems in cloud or on-prem environments.
We are a Disability Confident Employer:
Capgemini is proud to be a Disability Confident Employer (Level 2) under the UK Government's Disability Confident scheme. As part of our commitment to inclusive recruitment, we will offer an interview to all candidates who:
- Declare they have a disability, and
- Meet the minimum essential criteria for the role.
About the company
Capgemini is one of the world's leading providers of management and IT consulting, technology services, and digital transformation. As a pioneer of innovation, the company supports its clients with their complex challenges around cloud, digital, and platforms.