Remote GCP Data Engineer

Insight Global
Edmundson, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Edmundson, United States of America

Tech stack

Java
API
Artificial Intelligence
Airflow
Audit Trail
BigTable
Google BigQuery
Cloud Computing
Cluster Analysis
Code Review
Continuous Integration
Information Engineering
ETL
Data Security
Data Visualization
Data Warehousing
Relational Databases
DevOps
Data Flow Control
Python
PostgreSQL
Microsoft SQL Server
MySQL
Performance Tuning
Power BI
Shell Script
SQL Databases
Tableau
Google Cloud Platform
SQL Optimization
Git
Machine Learning Operations
Terraform
Looker Analytics
Software Version Control
Data Pipelines
Apache Beam
Docker

Job description

Design, build, and optimize BigQuery datasets and SQL models

Develop and maintain batch and streaming pipelines using Dataflow/Beam

Orchestrate workflows in Airflow/Cloud Composer

Implement scalable ETL/ELT pipelines with incremental and CDC patterns

Tune performance and manage query/storage costs

Ensure data quality, schema evolution, and lineage tracking

Collaborate with analytics, engineering, and business teams

Secure sensitive data using best practices for compliance

Monitor pipelines, troubleshoot failures, and improve reliability

Contribute to code reviews, documentation, and platform standards

Requirements

5+ years of data engineering experience, including 2+ years on Google Cloud Platform

Expert BigQuery skills:

Advanced SQL (CTEs, window functions, complex joins)

Partitioning, clustering, and query/cost optimization

Materialized & authorized views

Solid understanding of BigQuery architecture (slots, shuffles, distributed execution)

Hands-on experience with Dataflow & Apache Beam (Python or Java)

Batch & streaming pipelines

Performance tuning, monitoring, and error handling

Strong Cloud Composer / Airflow experience

DAG development, operators, orchestration, and troubleshooting

Proven ability to build production-grade ETL/ELT pipelines at terabyte scale

Expert SQL and strong understanding of data warehousing concepts

Strong Python for data pipelines and transformations

Experience with relational databases (Postgres, MySQL, SQL Server)

Data security fundamentals:

Row-level security

PII/PHI handling

Audit logging and access controls

Git-based version control and basic shell scripting

Google Cloud Professional Data Engineer certification

Healthcare data experience (clinical or administrative)

dbt for analytics engineering

Infrastructure as Code (Terraform)

DevOps / CI-CD experience for data pipelines

Experience with:

Cloud Spanner

Bigtable / Firestore

Cloud DLP API

Knowledge of data mesh / data fabric architectures

Data visualization tools (Looker, Tableau, Power BI)

ML workflows on GCP (Vertex AI)

Docker & Kubernetes (GKE)
