GCP Data Engineer

Data Inc
Mountain View, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Mountain View, United States of America

Tech stack

Java
Data analysis
Big Data
Google BigQuery
Cloud Computing
Computer Programming
Data as a Service
Data Architecture
Data Validation
Information Engineering
Data Governance
Data Infrastructure
ETL
Data Stores
Data Systems
Data Warehousing
DevOps
Data Flow Control
Hive
Python
Machine Learning
NoSQL
Cloudera
SQL Databases
Data Streaming
Google Cloud Platform
Snowflake
Spark
Build Management
Containerization
Data Lake
PySpark
Kubernetes
Low Latency
Apache Flink
Real Time Data
Kafka
Data Management
Video Streaming
Stream Processing
Data Pipelines
Docker
Redshift

Job description

We are looking for a highly experienced Senior Data Engineer with strong expertise in real-time data processing and scalable data architectures. You will play a key role in designing, building, and optimizing data platforms that support analytics, reporting, and machine learning use cases. You will work closely with cross-functional teams (Data Science, Analytics, Product) to deliver high-performance data infrastructure and tools.

  • Design & Build Data Pipelines: Architect, develop, and maintain robust ETL/ELT workflows for batch and real-time data ingestion and processing using Apache Spark (PySpark/Scala) and streaming technologies (a sketch of such a pipeline follows this list).
  • Real-Time Streaming: Implement and manage scalable streaming platforms using Apache Kafka (or similar messaging systems such as Google Pub/Sub) and stream-processing engines such as Apache Flink, ensuring reliable data flow with low latency.
  • Optimize Data Workloads: Tune Spark jobs, streaming processes, repository schemas, and SQL queries to maximize performance, minimize cost, and ensure efficient resource utilization.
  • Architect Scalable Data Systems: Build and maintain modern data architectures including data lakes, data warehouses (BigQuery), and metadata frameworks that support analytical and ML workloads.
  • Data Quality & Monitoring: Implement automated data quality checks, monitoring dashboards, alerts, and self-healing workflows to maintain high-fidelity data.
  • Cloud & DevOps Integration: Collaborate with Cloud and DevOps teams to deploy solutions leveraging Google Cloud Platform services, containerization (Docker), and orchestration tools (Kubernetes).
  • Documentation & Best Practices: Maintain technical documentation, enforce data governance standards, and advocate for best practices in data engineering.
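
As a rough illustration of the kind of pipeline this role involves, here is a minimal PySpark Structured Streaming sketch that reads events from Kafka and streams them into BigQuery. It assumes the spark-sql-kafka and spark-bigquery connector packages are available; the broker, topic, schema, bucket, and table names are placeholders, not details of this role.

  # Minimal sketch only: all names below are hypothetical placeholders.
  from pyspark.sql import SparkSession
  from pyspark.sql.functions import from_json, col
  from pyspark.sql.types import StructType, StructField, StringType, TimestampType

  spark = SparkSession.builder.appName("events-ingest").getOrCreate()

  # Hypothetical schema for the incoming JSON event payloads.
  event_schema = StructType([
      StructField("user_id", StringType()),
      StructField("event_type", StringType()),
      StructField("event_time", TimestampType()),
  ])

  # Read a Kafka topic as a streaming DataFrame (low-latency micro-batches).
  raw = (spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
         .option("subscribe", "events")                      # placeholder topic
         .load())

  # Kafka values arrive as bytes: decode, then parse the JSON payload.
  events = (raw.selectExpr("CAST(value AS STRING) AS json")
            .select(from_json(col("json"), event_schema).alias("e"))
            .select("e.*"))

  # Stream into BigQuery via the Spark-BigQuery connector; checkpointing lets
  # the stream recover from failures without reprocessing from scratch.
  (events.writeStream
   .format("bigquery")
   .option("table", "my_project.analytics.events")  # placeholder table
   .option("checkpointLocation", "gs://my-bucket/checkpoints/events")
   .option("temporaryGcsBucket", "my-bucket")
   .start()
   .awaitTermination())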

Requirements

  • Minimum 12 years of industry experience building enterprise data solutions.
  • 8+ years of recent, hands-on experience with Google Cloud Platform data services.
  • Proven track record of delivering productionized data platforms supporting analytics and ML.
  • Programming: Strong proficiency in Python and SQL, with working knowledge of Scala or Java.
  • Big Data Frameworks: Expertise in Apache Spark (Spark SQL, DataFrames, Structured Streaming).
  • Streaming Technologies: Hands-on experience with Apache Kafka, Google Pub/Sub, or similar systems.
  • Cloud Platforms: Solid experience with Google Cloud Platform (GCP) data services (BigQuery, Dataflow, Pub/Sub, Dataproc, etc.); a short client-side query sketch follows this list.
  • Data Stores: Experience with data warehousing solutions such as BigQuery, Snowflake, Redshift, and familiarity with NoSQL databases.
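
For reference, querying the warehouse from Python is typically a few lines with the google-cloud-bigquery client. A minimal sketch, assuming placeholder project, dataset, and table names:

  # Minimal sketch only: project/dataset/table names are placeholders.
  from datetime import datetime, timezone
  from google.cloud import bigquery

  client = bigquery.Client(project="my-project")

  # Parameterized query: avoids string interpolation and eases reuse.
  sql = """
      SELECT event_type, COUNT(*) AS n
      FROM `my-project.analytics.events`
      WHERE event_time >= @since
      GROUP BY event_type
      ORDER BY n DESC
  """
  job_config = bigquery.QueryJobConfig(query_parameters=[
      bigquery.ScalarQueryParameter(
          "since", "TIMESTAMP", datetime(2024, 1, 1, tzinfo=timezone.utc)),
  ])

  for row in client.query(sql, job_config=job_config).result():
      print(row.event_type, row.n)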

Apply for this position