Professional Data Engineer (Based in Dubai)
Job description
We are looking for a Data Engineer to build reliable, scalable data pipelines and contribute to the core data ecosystem that powers analytics, AI/ML, and emerging Generative AI use cases. You will work closely with senior engineers and data scientists to deliver high-quality pipelines, models, and integrations that support business growth and internal AI initiatives.
Core Engineering
- Build and maintain batch and streaming data pipelines with strong emphasis on reliability, performance, and efficient cost usage.
- Develop SQL, Python, and Spark/PySpark transformations to support analytics, reporting, and ML workloads.
- Contribute to data model design and ensure datasets adhere to high standards of quality, structure, and governance.
- Support integrations with internal and external systems, ensuring accuracy and resilience of data flows.
GenAI & Advanced Data Use Cases
- Build and maintain data flows that support GenAI workloads (e.g., embedding generation, vector pipelines, data preparation for LLM training and inference).
- Collaborate with ML/GenAI teams to enable high-quality training and inference datasets.
- Contribute to the development of retrieval pipelines, enrichment workflows, or AI-powered data quality checks.
Collaboration & Delivery
- Work with Data Science, Analytics, Product, and Engineering teams to translate data requirements into reliable solutions.
- Participate in design reviews and provide input toward scalable and maintainable engineering practices.
- Uphold strong data quality, testing, and documentation standards.
- Support deployments, troubleshooting, and operational stability of the pipelines you own.
Professional Growth & Team Contribution
- Demonstrate ownership of well-scoped components of the data platform.
- Share knowledge with peers and contribute to team learning through code reviews, documentation, and pairing.
- Show strong execution skills: deliver high-quality work on time, with clarity and reliability.
Impact of the Role
In this role, you will help extend and strengthen the data foundation that powers analytics, AI/ML, and GenAI initiatives across the company. Your contributions will improve data availability, tooling, and performance, enabling teams to build intelligent, data-driven experiences.
Tech Stack
- Languages: Python, SQL, Java/Scala
- Streaming: Kafka, Kinesis
- Data Stores: Redshift, Snowflake, ClickHouse, S3
- Orchestration: Dagster (Airflow legacy)
- Platforms: Docker, Kubernetes
- AWS: DMS, Glue, Athena, ECS/EKS, S3, Kinesis
- ETL/ELT: Fivetran, dbt
- IaC: Terraform + Terragrunt
Requirements
- 5+ years of experience as a Data Engineer.
- Strong SQL and Python skills; good understanding of Spark/PySpark.
- Experience building and maintaining production data pipelines.
- Practical experience working with cloud-based data warehouses and data lake architectures.
- Experience with AWS services for data processing (Glue, Athena, Kinesis, Lambda, S3, etc.).
- Familiarity with orchestration tools (Dagster, Airflow, Step Functions).
- Solid understanding of data modeling and data quality best practices.
- Experience working with CI/CD pipelines or basic automation for data workflows.
- Exposure to Generative AI workflows, or willingness to learn them: embeddings, vector stores, enrichment pipelines, LLM-based data improvements, retrieval workflows.
Preferred Experience
- Experience in the real estate domain.
- Familiarity with ETL tools: Fivetran, dbt, Airbyte.
- Experience with real-time analytics solutions (ClickHouse, Pinot, Rockset, Druid).
- Familiarity with BI tools (QuickSight, Looker, Power BI, Tableau).
- Exposure to tagging/tracking tools (Snowplow, Tealium).
- Experience with Terraform & Terragrunt.
- Knowledge of GCP or Google Analytics is a plus.
Benefits & conditions
This is a Dubai-based role. Relocation is required and comes with a highly competitive, tax-free salary package.