GCP Data Engineer

Data Inc
Mountain View, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Mountain View, United States of America

Tech stack

Java
Data analysis
Big Data
Google BigQuery
Cloud Computing
Computer Programming
Data as a Service
Data Architecture
Data Validation
Information Engineering
Data Governance
Data Infrastructure
ETL
Data Stores
Data Systems
Data Warehousing
DevOps
Data Flow Control
Hive
Python
Machine Learning
NoSQL
Cloudera
SQL Databases
Data Streaming
Google Cloud Platform
Snowflake
Spark
Build Management
Containerization
Data Lake
PySpark
Kubernetes
Low Latency
Apache Flink
Real Time Data
Kafka
Data Management
Video Streaming
Stream Processing
Data Pipelines
Docker
Redshift

Job description

We are looking for a highly experienced Senior Data Engineer with strong expertise in real-time data processing and scalable data architectures. You will play a key role in designing, building, and optimizing data platforms that support analytics, reporting, and machine learning use cases. You will work closely with cross-functional teams (Data Science, Analytics, Product) to deliver high-performance data infrastructure and tools.

  • Design & Build Data Pipelines: Architect, develop, and maintain robust ETL/ELT workflows for batch and real-time data ingestion and processing using Apache Spark (PySpark/Scala) and streaming technologies (a sketch of such a pipeline follows this list).
  • Real-Time Streaming: Implement and manage scalable streaming platforms using Apache Kafka (or similar messaging systems such as Google Pub/Sub) and stream-processing engines such as Apache Flink, ensuring reliable data flow with low latency.
  • Optimize Data Workloads: Tune Spark jobs, streaming processes, repository schemas, and SQL queries to maximize performance, minimize cost, and ensure efficient resource utilization.
  • Architect Scalable Data Systems: Build and maintain modern data architectures including data lakes, data warehouses (BigQuery), and metadata frameworks that support analytical and ML workloads.
  • Data Quality & Monitoring: Implement automated data quality checks, monitoring dashboards, alerts, and self-healing workflows to maintain high-fidelity data.
  • Cloud & DevOps Integration: Collaborate with Cloud and DevOps teams to deploy solutions leveraging Google Cloud Platform services, containerization (Docker), and orchestration tools (Kubernetes).
  • Documentation & Best Practices: Maintain technical documentation, enforce data governance standards, and advocate for best practices in data engineering.
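
As a rough illustration of the kind of pipeline this role involves, here is a minimal PySpark Structured Streaming sketch that reads events from Kafka and streams them into BigQuery. It assumes the spark-sql-kafka and spark-bigquery connector packages are available; the broker, topic, schema, bucket, and table names are placeholders, not details of this role.

  # Minimal sketch only: all names below are hypothetical placeholders.
  from pyspark.sql import SparkSession
  from pyspark.sql.functions import from_json, col
  from pyspark.sql.types import StructType, StructField, StringType, TimestampType

  spark = SparkSession.builder.appName("events-ingest").getOrCreate()

  # Hypothetical schema for the incoming JSON event payloads.
  event_schema = StructType([
      StructField("user_id", StringType()),
      StructField("event_type", StringType()),
      StructField("event_time", TimestampType()),
  ])

  # Read a Kafka topic as a streaming DataFrame (low-latency micro-batches).
  raw = (spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
         .option("subscribe", "events")                      # placeholder topic
         .load())

  # Kafka values arrive as bytes: decode, then parse the JSON payload.
  events = (raw.selectExpr("CAST(value AS STRING) AS json")
            .select(from_json(col("json"), event_schema).alias("e"))
            .select("e.*"))

  # Stream into BigQuery via the Spark-BigQuery connector; checkpointing lets
  # the stream recover from failures without reprocessing from scratch.
  (events.writeStream
   .format("bigquery")
   .option("table", "my_project.analytics.events")  # placeholder table
   .option("checkpointLocation", "gs://my-bucket/checkpoints/events")
   .option("temporaryGcsBucket", "my-bucket")
   .start()
   .awaitTermination())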

Requirements

  • Minimum 12 years of industry experience building enterprise data solutions.
  • 8+ years of recent, hands-on experience with Google Cloud Platform data services.
  • Proven track record of delivering productionized data platforms supporting analytics and ML.
  • Programming: Strong proficiency in Python and SQL, with working knowledge of Scala or Java.
  • Big Data Frameworks: Expertise in Apache Spark (Spark SQL, DataFrames, Structured Streaming).
  • Streaming Technologies: Hands-on experience with Apache Kafka, Google Pub/Sub, or similar systems.
  • Cloud Platforms: Solid experience with Google Cloud Platform (GCP) data services (BigQuery, Dataflow, Pub/Sub, Dataproc, etc.); a short client-side query sketch follows this list.
  • Data Stores: Experience with data warehousing solutions such as BigQuery, Snowflake, Redshift, and familiarity with NoSQL databases.
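
For reference, querying the warehouse from Python is typically a few lines with the google-cloud-bigquery client. A minimal sketch, assuming placeholder project, dataset, and table names:

  # Minimal sketch only: project/dataset/table names are placeholders.
  from datetime import datetime, timezone
  from google.cloud import bigquery

  client = bigquery.Client(project="my-project")

  # Parameterized query: avoids string interpolation and eases reuse.
  sql = """
      SELECT event_type, COUNT(*) AS n
      FROM `my-project.analytics.events`
      WHERE event_time >= @since
      GROUP BY event_type
      ORDER BY n DESC
  """
  job_config = bigquery.QueryJobConfig(query_parameters=[
      bigquery.ScalarQueryParameter(
          "since", "TIMESTAMP", datetime(2024, 1, 1, tzinfo=timezone.utc)),
  ])

  for row in client.query(sql, job_config=job_config).result():
      print(row.event_type, row.n)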

Apply for this position