Data Platform Architect

Emergere Technologies

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote

Tech stack

Artificial Intelligence
Code Generation
Continuous Integration
Data Architecture
Data Dictionary
Information Engineering
Data Governance
Data Vault Modeling
Data Warehousing
Dimensional Modeling
Hive
Machine Learning
Systems Development Life Cycle
Software Tools
SAP Applications
Search Technologies
SQL Databases
Enterprise Data Management
Data Ingestion
Large Language Models
Generative AI
Data Lake
PySpark
Data Management
Machine Learning Operations
Oracle E-Business Suite
Terraform
Databricks

Job description

  • Define the end-to-end vision, architecture, and roadmap for the enterprise data platform.
  • Design and implement a greenfield EDW / Lakehouse architecture (Bronze, Silver, Gold layers).
  • Build scalable data ingestion frameworks integrating ERP, WMS, TMS, MES, procurement, and logistics systems.
  • Develop pipelines using PySpark, Spark SQL, Delta Live Tables (DLT), and orchestrate with Databricks Workflows.
  • Implement data governance, lineage, and security using Unity Catalog.
  • Embed AI/ML and GenAI capabilities such as:
      • LLM-driven schema mapping
      • Vector search & RAG
      • Forecasting models
      • Natural-language BI solutions
  • Drive AI-accelerated SDLC delivery using copilots, agents, and automation tools.
  • Establish standards for data quality, monitoring, and reliability.
  • Lead and mentor cross-functional teams and collaborate with business stakeholders.

AI-accelerated SDLC capabilities by phase:

  • Discovery & Design: AI-based source profiling, schema inference, auto data dictionaries
  • Data Modeling: LLM-generated dimensions, facts, and lineage
  • Pipeline Build: AI-assisted PySpark/SQL/DLT code generation
  • Testing & QA: auto-generated test cases, synthetic data creation
  • Deploy & Operate: CI/CD automation, job optimization, anomaly detection
  • Consumption: natural-language BI, GenAI assistants, predictive insights

Requirements

  • 10+ years in Data Engineering / Architecture
  • Strong Supply Chain domain knowledge (Plan, Source, Make, Deliver, Return)
  • Hands-on experience with Databricks Lakehouse Platform
  • Expertise in:
      • PySpark, Spark SQL
      • Delta Lake & Delta Live Tables
      • Databricks Workflows
  • Experience with Unity Catalog (governance, lineage, access control)
  • Strong knowledge of:
      • Dimensional Modeling & Data Vault
      • Medallion Architecture
      • Data Product Design
  • Experience with CI/CD & IaC (Terraform / Databricks Asset Bundles)
  • Solid exposure to AI/ML technologies:
      • GenAI, LLMs, RAG, Vector Search
      • MLOps & Feature Stores
  • Proven experience using AI tools (Copilots/Agents) to accelerate SDLC
  • Strong leadership and stakeholder management skills
  • Experience building a greenfield enterprise data platform
  • Exposure to ERP systems like SAP or Oracle EBS
  • Experience deploying AI assistants or agents on data platforms
