GCP Data Architect
Quantum Technologies
San Jose, United States of America
6 days ago
Role details
Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Job location
San Jose, United States of America
Tech stack
Airflow
Big Data
Google BigQuery
Cloud Computing
Cloud Storage
Computer Programming
Continuous Integration
Information Engineering
Data Infrastructure
DevOps
Data Flow Control
GitHub
Python
Machine Learning
Performance Tuning
SQL Databases
Data Streaming
Workflow Management Systems
Data Processing
Google Cloud Platform
Cloud Platform System
Data Ingestion
SQL Optimization
GitLab
Git
Deployment Automation
Software Version Control
Data Pipelines
Job description
We are looking for a highly skilled and motivated Data Engineer to join our team. The ideal candidate will be responsible for designing, building, and maintaining scalable data infrastructure that drives business intelligence, advanced analytics, and machine learning initiatives. You must be comfortable working autonomously, navigating complex challenges, and driving projects to successful completion in a dynamic cloud environment.
Core Responsibilities
- Design and Optimization: Design, implement, and optimize clean, well-structured, and performant analytical datasets to support high-volume reporting, business analysis, and data science model development.
- Pipeline Development: Architect, build, and maintain scalable and robust data pipelines for diverse applications, including business intelligence and advanced analytics.
- Big Data & Streaming: Implement and support Big Data solutions for both batch (scheduled) and real-time/streaming analytics.
- Collaboration: Work closely with product managers and business teams to understand data requirements and translate them into technical solutions.
Requirements
- Cloud Platform Expertise (GCP Focus): Extensive hands-on experience working in dynamic cloud environments, with a strong preference for Google Cloud Platform (GCP) services, specifically:
  - BigQuery: Expert-level skills in data ingestion, performance optimization, and data modeling within a petabyte-scale environment.
  - Experience with other relevant GCP services like Cloud Storage, Cloud Dataflow/Beam, or Pub/Sub.
- Programming & Querying:
  - Python: Expert-level programming proficiency in Python, including experience with relevant data engineering libraries.
  - SQL: A solid command of advanced SQL for complex querying, data processing, and performance tuning.
- Data Pipeline Orchestration: Prior experience using workflow management and orchestration tools (e.g., Apache Airflow, Cloud Composer, Dagster, or similar).
- DevOps/CI/CD: Experience with version control (Git) and familiarity with CI/CD practices and tools (e.g., GitLab, GitHub Actions) to automate deployment and testing processes.