Data Engineer (Spark / Google Cloud Platform - Data Migration)
Euclid Innovations
5 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Job location
Tech stack
Artificial Intelligence
Google BigQuery
Cloud Storage
Data Transformation
Data Migration
Hadoop Distributed File System
Python
Machine Learning
Microsoft SQL Server
Network File Systems
Data Streaming
Google Cloud Platform
Feature Engineering
System Availability
Spark
Collibra
Machine Learning Operations
Data Pipelines
Job description
- Migrate data from on-prem systems (HDFS, NFS, SQL Server, etc.) to Google Cloud Platform
- Build and optimize data pipelines using Spark and Python
- Ensure data availability for ML training and inference workflows
- Maintain data quality, consistency, and schema compatibility
- Collaborate with ML Engineers and MLOps teams
Requirements
- Strong Spark + Python (mandatory)
- Experience with data migration (on-prem to cloud)
- Hands-on with:
  - Data pipelines (batch/streaming)
  - Data transformation and processing
- Google Cloud Platform experience:
  - BigQuery
  - Cloud Storage
- Experience with schema handling, data quality, and validation
Nice to Have
- Experience in hybrid environments (Google Cloud Platform + private cloud)
- Dataplex or data governance tools
- Feature engineering / ML data support
- Experience working with ML/AI teams