Senior Data Engineer with GCP

Mphasis
New York, United States of America
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

New York, United States of America

Tech stack

Java
API
Airflow
Apache HTTP Server
Batch Processing
Google BigQuery
Cloud Storage
Code Review
Computer Programming
Continuous Integration
Data as a Service
Data Transmissions
Information Engineering
Data Governance
ETL
Data Virtualization
DevOps
Distributed Systems
Data Flow Control
Python
Network Connections
Performance Tuning
DataOps
SQL Databases
Data Streaming
Google Cloud Platform
Apigee
Real Time Data
Kafka
Data Management
Terraform
Looker Analytics
Apache Beam

Job description

  • Architect and own scalable, secure, cloud-native data platforms on Google Cloud Platform

  • Design, build, and optimize batch and real-time data pipelines using BigQuery, Dataflow, Pub/Sub, and Dataproc

  • Lead BigQuery performance tuning and cost optimization (partitioning, clustering, query efficiency)

  • Orchestrate workflows using Cloud Composer (Apache Airflow)

  • Enable AI/ML and GenAI integration via Vertex AI and BigQuery ML

  • Enforce data governance, security, reliability, and FinOps best practices

  • Mentor engineers, conduct design/code reviews, and set enterprise data engineering standards

  • Collaborate with product, analytics, and data science teams to deliver business-critical insights

Requirements

  • GCP Data Services: BigQuery, Dataflow (Apache Beam), Pub/Sub, Cloud Storage, Cloud Composer, Dataproc

  • Programming & SQL: Advanced SQL, Python (Java/Scala a plus)

  • Data Engineering: ETL/ELT, streaming & batch processing, data modeling, distributed systems

  • Modern Architectures: Lakehouse, Apache Iceberg, Data Mesh concepts

  • AI/ML Enablement: Vertex AI, BigQuery ML, GenAI-ready pipelines

  • DevOps & IaC: Terraform, CI/CD, DataOps practices

  • Leadership: Architecture ownership, mentoring, stakeholder communication, problem solving

  • Certification: Google Cloud Professional Data Engineer (strongly preferred / often mandatory)

  • In addition to BigQuery and Cloud Storage buckets, the following skills are necessary: Dataflow, Cloud Composer, Cloud Scheduler, Pub/Sub and Kafka, Apigee gateway and APIs, Dataplex, and basic knowledge of network connectivity, as well as knowledge of Data Catalog, DLP, BQDTS, STS, and other data transfer methodologies.

  • Reporting background (Power BI) and Apache Iceberg are a MUST.

  • Data virtualization (Trino or equivalent), Looker, and Vertex AI are a plus.
