GCP Data Engineer

Data Inc

San Jose, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

San Jose, United States of America

Tech stack

Java

Airflow

Google BigQuery

Cloud Database

Computer Programming

Continuous Integration

Information Engineering

ETL

Data Systems

Data Warehousing

DevOps

Hive

Python

Machine Learning

Performance Tuning

Cloud Services

Standard SQL

Software Deployment

SQL Databases

Data Streaming

Workflow Management Systems

Google Cloud Platform

Data Ingestion

Snowflake

Spark

Data Lake

PySpark

Information Technology

Apache Flink

Kafka

Spark Streaming

Data Management

Video Streaming

Stream Processing

Data Pipelines

Docker

Databricks

Job description

We are seeking a highly skilled Senior Data Engineer with strong expertise in Apache Spark, streaming technologies, and Google Cloud Platform (GCP) to design, build, and optimize scalable data pipelines supporting analytics, reporting, and machine learning workloads. The ideal candidate will have extensive experience developing both batch and real-time data processing solutions using Spark, Kafka, and cloud-native data services.

Design and develop scalable batch and real-time data pipelines using Spark and Kafka or Pub/Sub
Build and optimize BigQuery-based data platforms and lakehouse architectures
Develop ETL/ELT frameworks for data ingestion, transformation, and delivery
Optimize Spark jobs, SQL queries, and data workflows for performance and cost efficiency
Implement data quality, monitoring, validation, and alerting mechanisms
Collaborate with Data Scientists, Analysts, and Business Teams to deliver reliable data solutions
Support production deployments and troubleshoot complex data engineering issues

Requirements

10+ years of overall IT experience
5+ years of recent hands-on GCP experience
Strong experience building enterprise-scale data platforms and streaming architectures

Required Skills

Strong programming skills in Python and SQL
Hands-on expertise with Apache Spark (PySpark, Spark SQL, DataFrames, Spark Streaming)
Experience with Kafka, Pub/Sub, Flink, or other streaming technologies
Strong knowledge of BigQuery, GCS, Data Lakes, and Data Warehousing concepts
Experience designing and developing ETL/ELT pipelines
Data modeling and performance optimization experience
Experience with Airflow for workflow orchestration
Knowledge of Snowflake, Redshift, or other cloud data warehouses
Experience implementing data quality, monitoring, and alerting solutions

Preferred Skills

Scala or Java development experience
Databricks experience
Docker and Kubernetes
CI/CD and DevOps practices
Experience supporting ML/Data Science workloads
