Principal Industrial AI Data Architect - US Remote
Job description
The Principal Industrial AI Data Architect is responsible for designing and governing the data architecture that enables reliable, scalable AI across industrial environments.
This role ensures that:
- Data pipelines are aligned with the canonical semantic model
- Features used in AI models are consistent across training and runtime
- Industrial data is structured for real-time inference and long-term analytics
This role is the bridge between data, semantics, and AI execution.
1. Define Industrial Data Architecture for AI
Design end-to-end data flows from:
Edge systems → cloud AI pipelines → edge inference
Define:
- Data storage patterns (time-series, relational, event-based)
- Data movement and transformation strategies
Ensure architecture supports:
- Real-time processing
- Batch analytics
- Model lifecycle integration
2. Design Feature Pipelines and Delivery for AI Models
Design and govern the pipelines, storage, and lifecycle that build and deliver features to AI models, based on canonical definitions established by the Principal Manufacturing & Semantic Architect.
- Define feature engineering pipelines for both training (cloud) and inference (edge) environments
- Ensure consistency between training datasets and runtime inference data
- Prevent feature drift and data mismatch through automated validation
3. Integrate Semantic Model with Data Pipelines
Translate canonical semantic definitions into:
- Physical data models
- Schemas
- Pipelines
Ensure all data structures conform to:
- Enterprise standards
- Platform contracts
Additional Job Responsibilities
4. Enable Scalable AI Model Integration
Define data interfaces required by:
- Internal AI teams
- External model providers
Support:
- Model versioning
- Feature compatibility
- Performance validation
5. Design for Multi-Tenant and Product Use Cases
Ensure data pipelines and access patterns support multi-tenant environments, including:
- Customer data isolation and secure access controls
- Scalable onboarding of new tenants and use cases
- Reuse of data pipelines across customers and deployments
Note: The underlying data model for multi-tenancy is governed by the Principal Manufacturing & Semantic Architect.
6. Collaborate Across Teams
Partner with:
- Principal Manufacturing & Semantic Architect (canonical model definition and feature semantics)
- Principal Edge & OT Architect (edge data ingestion and inference data requirements)
- Platform Engineering (implementation and infrastructure)
- AI/Data Science teams (model requirements and validation)
Ensure consistent execution across domains.
Requirements
- Strong system design and data modeling skills
- Ability to connect business, operational, and AI requirements
- High attention to data consistency and integrity
- Cross-functional collaboration
Minimum Qualifications
- Bachelor's degree in Computer Science, Engineering, or related field (Master's preferred)
- 10+ years of experience in data architecture, industrial data systems, or IoT platforms
- Strong experience with time-series data (e.g., historian systems), data pipelines, and ETL/ELT
- Strong experience with distributed data systems
- Understanding of AI/ML data requirements and feature engineering concepts
Experience with:
- Industrial IoT or edge-to-cloud platforms
- Manufacturing systems (OT + IT integration)
- Cloud data platforms (AWS preferred)
Familiarity with:
- Streaming architectures
- Event-driven systems
- Data governance frameworks
Other
Leadership Expectations
- Operate as a thought leader in industrial data architecture and AI data strategy
- Influence without direct authority across multiple teams and partners
- Drive standards adoption for data pipelines and AI data practices across internal and external stakeholders
- Balance long-term architectural vision with near-term delivery needs