Senior Full-Stack Engineer, Data Platforms (GCP) M/F - IBM Client Innovation Center

IBM
Canton of Colombes-1, France

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Canton of Colombes-1, France

Tech stack

Airflow
Bigtable
Google BigQuery
Cloud Computing
Cloud Storage
Data Infrastructure
Data Migration
Data Systems
Data Warehousing
Data Flow Control
Hadoop
Python
Open Source Technology
Cloud Services
Cloudera
Google Cloud Platform
Spark
Data Layers
Build Management
Data Lake
Data Management
Stream Processing
Data Pipelines
Apache Beam

Job description

As a Data Engineer specializing in Google's data platforms, you will design, build, and maintain data engineering solutions on Google Cloud. You will use Google Cloud services to develop batch and real-time data pipelines, perform data migrations, and design data layers.

Your primary responsibilities will include:

  • Design Data Pipelines: Design and build data engineering solutions for batch and real-time data processing using Google Cloud services such as Dataproc, Dataflow, Pub/Sub, BigQuery, Bigtable, Cloud Spanner, Cloud SQL, and AlloyDB.

  • Develop Data Migrations: Develop and manage batch and real-time data pipelines for the data warehouse and data lake, ensuring efficient data migration and integration.

  • Manage the Data Platform: Schedule and manage data platform workloads using Google Cloud Scheduler and Cloud Composer (Airflow), keeping workflows and pipelines running reliably.

  • Implement Data Solutions: Implement data engineering solutions using Google Cloud Storage, Bigtable, BigQuery, Dataproc with Spark and Hadoop, Dataflow with Apache Beam or Python, and open-source technologies.

  • Optimize Data Pipelines: Optimize and maintain data pipelines for efficiency, scalability, and reliability, ensuring high-quality data output.

Requirements

  • Exposure to the Google Cloud Ecosystem: Familiarity with designing, building, and maintaining data engineering solutions on Google Cloud, including services such as Dataproc, Dataflow, Pub/Sub, BigQuery, Bigtable, Cloud Spanner, Cloud SQL, and AlloyDB.

  • Experience working with Data Pipelines: Knowledge of developing and managing batch and real-time data pipelines for data warehouses and data lakes, including data migration and integration.

  • Exposure to Open-Source Technologies: Familiarity with Google Cloud Storage, Bigtable, BigQuery, Dataproc with Spark and Hadoop, and Dataflow with Apache Beam or Python, as well as open-source technologies such as Apache Airflow, dbt, Spark/Python, or Spark/Scala.

  • Experience working with Data Platform Management: Understanding of scheduling and managing the data platform using Google Cloud Scheduler and Cloud Composer (Airflow).

  • Exposure to Data Engineering Solutions: Familiarity with implementing data engineering solutions using various Google services and open-source technologies.

Preferred technical and professional experience

  • Proficiency in Apache Airflow: Experience working with Apache Airflow for scheduling and managing data pipelines is beneficial. Familiarity with Cloud Composer (Airflow) is also desirable.

  • Knowledge of dbt: Exposure to dbt and its application in data engineering solutions is advantageous.

  • Familiarity with Spark/Scala: Experience working with Spark/Scala is beneficial for developing and managing data pipelines.
