Data Platform Architect
Emergere Technologies
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Remote
Tech stack
Artificial Intelligence
Code Generation
Continuous Integration
Data Architecture
Data Dictionary
Information Engineering
Data Governance
Data Vault Modeling
Data Warehousing
Dimensional Modeling
Hive
Machine Learning
Systems Development Life Cycle
Software Tools
SAP Applications
Search Technologies
SQL Databases
Enterprise Data Management
Data Ingestion
Large Language Models
Generative AI
Data Lake
PySpark
Data Management
Machine Learning Operations
Oracle E-Business Suite
Terraform
Databricks
Job description
- Define the end-to-end vision, architecture, and roadmap for the enterprise data platform.
- Design and implement a greenfield EDW / Lakehouse architecture (Bronze, Silver, Gold layers).
- Build scalable data ingestion frameworks integrating ERP, WMS, TMS, MES, procurement, and logistics systems.
- Develop pipelines using PySpark, Spark SQL, Delta Live Tables (DLT), and orchestrate with Databricks Workflows.
- Implement data governance, lineage, and security using Unity Catalog.
- Embed AI/ML and GenAI capabilities such as:
- LLM-driven schema mapping
- Vector search & RAG
- Forecasting models
- Natural-language BI solutions
- Drive AI-accelerated SDLC delivery using copilots, agents, and automation tools.
- Establish standards for data quality, monitoring, and reliability.
- Lead and mentor cross-functional teams and collaborate with business stakeholders.

AI across the SDLC
- Discovery & Design: AI-based source profiling, schema inference, auto data dictionaries
- Data Modeling: LLM-generated dimensions, facts, and lineage
- Pipeline Build: AI-assisted PySpark/SQL/DLT code generation
- Testing & QA: Auto-generated test cases, synthetic data creation
- Deploy & Operate: CI/CD automation, job optimization, anomaly detection
- Consumption: Natural-language BI, GenAI assistants, predictive insights
Requirements
- 10+ years in Data Engineering / Architecture
- Strong Supply Chain domain knowledge (Plan, Source, Make, Deliver, Return)
- Hands-on experience with Databricks Lakehouse Platform
- Expertise in:
- PySpark, Spark SQL
- Delta Lake & Delta Live Tables
- Databricks Workflows
- Experience with Unity Catalog (governance, lineage, access control)
- Strong knowledge of:
- Dimensional Modeling & Data Vault
- Medallion Architecture
- Data Product Design
- Experience with CI/CD & IaC (Terraform / Databricks Asset Bundles)
- Solid exposure to AI/ML technologies:
- GenAI, LLMs, RAG, Vector Search
- MLOps & Feature Stores
- Proven experience using AI tools (Copilots/Agents) to accelerate SDLC
- Strong leadership and stakeholder management skills

Nice to have
- Experience building a greenfield enterprise data platform
- Exposure to ERP systems like SAP or Oracle EBS
- Experience deploying AI assistants or agents on data platforms