Senior Full-Stack Engineer, Data Platforms (GCP) M/F - IBM Client Innovation Center

IBM

Canton of Colombes-1, France

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Canton of Colombes-1, France

Tech stack

Airflow

Bigtable

Google BigQuery

Cloud Computing

Cloud Storage

Data Infrastructure

Data Migration

Data Systems

Data Warehousing

Data Flow Control

Hadoop

Python

Open Source Technology

Cloud Services

Cloudera

Google Cloud Platform

Spark

Data Layers

Build Management

Data Lake

Data Management

Stream Processing

Data Pipelines

Apache Beam

Job description

As a Data Engineer specializing in Google's data platforms, you will design, build, and maintain data engineering solutions on the Google Cloud ecosystem. You will use Google Cloud services to develop batch and real-time data pipelines, perform data migrations, and design data layers.

Your primary responsibilities will include:

Design Data Pipelines: Design and build data engineering solutions using Google services such as Dataproc, Dataflow, Pub/Sub, BigQuery, Bigtable, Cloud Spanner, Cloud SQL, and AlloyDB for batch and real-time data processing.
Develop Data Migration: Develop and manage batch and real-time data pipelines for data warehouses and data lakes, ensuring efficient data migration and integration.
Manage Data Platform: Schedule and manage the data platform using Google Cloud Scheduler and Cloud Composer (Airflow), ensuring seamless data workflow and pipeline management.
Implement Data Solutions: Implement data engineering solutions using Google Cloud Storage, Bigtable, BigQuery, Dataproc with Spark and Hadoop, Dataflow with Apache Beam or Python, and other open-source technologies.
Optimize Data Pipelines: Optimize and maintain data pipelines for efficiency, scalability, and reliability, ensuring high-quality data output.

Requirements

Exposure to Google Cloud Ecosystem: Familiarity with designing, building, and maintaining data engineering solutions on the Google Cloud ecosystem, including services such as Dataproc, Dataflow, Pub/Sub, BigQuery, Bigtable, Cloud Spanner, Cloud SQL, and AlloyDB.
Experience working with Data Pipelines: Knowledge of developing and managing batch and real-time data pipelines for data warehouses and data lakes, including data migration and integration.
Exposure to Open-Source Technologies: Familiarity with using Google Cloud Storage, Bigtable, BigQuery, Dataproc with Spark and Hadoop, Dataflow with Apache Beam or Python, and other open-source technologies like Apache Airflow, dbt, Spark/Python, or Spark/Scala.
Experience working with Data Platform Management: Understanding of scheduling and managing the data platform using Google Cloud Scheduler and Cloud Composer (Airflow).
Exposure to Data Engineering Solutions: Familiarity with implementing data engineering solutions using various Google services and open-source technologies.

Preferred technical and professional experience

Proficiency in Apache Airflow: Experience scheduling and managing data pipelines with Apache Airflow is beneficial. Familiarity with Cloud Composer, Google Cloud's managed Airflow service, is also desirable.
Knowledge of dbt: Exposure to dbt and its application in data engineering solutions is advantageous.
Familiarity with Spark/Scala: Experience with Spark/Scala for developing and managing data pipelines is beneficial.
