Data Engineer
Role details
Requirements
Do you have experience in Spark implementation?

Must haves: GCP, Spark, Airflow, SQL, GCP data services. AI/ML is a super nice-to-have.

Job description:

KEY RESPONSIBILITIES
* Design and build scalable ETL/ELT pipelines using Apache Airflow, Apache Spark, and GCP Dataflow
* Develop and maintain BigQuery data models, schemas, and performance-optimized SQL queries
* Build and maintain data pipelines feeding AI/ML feature stores and forecasting models
* Collaborate with AI Developers to ensure high-quality, low-latency data access for model training
* Manage and optimize Cloud Composer DAGs and pipeline orchestration
* Implement data quality monitoring, alerting, and lineage tracking
* Participate in data platform architecture decisions and documentation

REQUIRED QUALIFICATIONS
* 3+ years (Intermediate) or 5+ years (Specialist) of data engineering experience
* Hands-on experience with Apache Airflow for pipeline orchestration
* Proficiency in Apache Spark for large-scale data processing
* Strong SQL skills, including complex query optimization and BigQuery-specific capabilities
* Experience with GCP data services: BigQuery, Cloud Storage, Pub/Sub, Dataflow
* Solid understanding of ETL/ELT patterns and data warehousing principles

PREFERRED QUALIFICATIONS
* GCP Professional Data Engineer certification
* Experience supporting ML/AI data infrastructure (feature engineering, training datasets)
* Familiarity with real-time streaming (Kafka, Dataflow/Flink)
* Retail or large-scale consumer data experience