Data Engineer - GCP
Euclid Innovations
Fort Mill, United States of America
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Job location: Fort Mill, United States of America
Tech stack
Big Data
Google BigQuery
Cloud Computing
Information Engineering
Data Integration
ETL
Data Migration
Distributed Systems
Data Flow Control
Python
Cloudera
Data Streaming
Data Processing
Google Cloud Platform
Data Ingestion
Spark
Build Management
PySpark
Data Pipelines
Job description
- Design and build scalable ETL/data pipelines using Spark and Python
- Develop data workflows to ingest, transform, and move large datasets
- Implement data routing logic to direct data to:
  - GCP (BigQuery, Dataflow, Dataproc)
  - On-prem platforms (DPC)
- Ensure data quality, validation, and reconciliation across systems
- Collaborate with data science and platform teams to support predictive model pipelines
- Optimize performance and scalability for high-volume data processing
Requirements
- Strong hands-on experience with Apache Spark / PySpark for large-scale data processing
- Proficiency in Python for data engineering (ETL pipelines)
- Experience designing and developing data pipelines / data engineering workflows
- Solid background in ETL, data ingestion, transformation, and data movement
- Experience working with big data technologies and handling large datasets (batch/streaming)
- Experience with cloud platforms, specifically GCP (Google Cloud Platform):
  - BigQuery, Dataflow, Dataproc, GCS (Google Cloud Storage)
- Experience with data migration / data integration projects
- Understanding of data pipeline architecture and distributed systems