Principal Data Engineer
Job description
We're seeking a versatile Principal Data Engineer who can work across the full stack of Anaplan AI applications, spanning model integration and prompt engineering, and who can set the technical direction for how we ingest, transform, store, serve, and govern the data that powers our LLM-based and agentic systems. You'll build AI features that work in real time, helping business users apply GenAI in their planning workflows. You'll need both deep knowledge of machine learning and strong data engineering skills.
Your Impact
- Design and build the retrieval layer powering RAG and agentic workloads, including vector and graph databases, hybrid search, and knowledge graphs that capture the semantics of customer models.
- Develop end-to-end GenAI features including backend API services, model integration, model monitoring, evaluations and deployments
- Engineer feature and context pipelines balancing batch and streaming patterns to feed forecasting and anomaly-detection models, collaborating closely with data scientists to productionise algorithms.
- Build the data plane for evaluation, implementing rigorous frameworks to continuously monitor, measure, and improve GenAI feature quality, accuracy, latency, and user satisfaction.
- Collaborate with data scientists to productionise ML models and forecasting algorithms
Requirements
- Extensive background in Data Science Engineering, with a clear track record of principal-level technical leadership.
- Hands-on experience building and shipping AI/ML products in production
- Deep practical experience with LLM-based systems: RAG architectures, embedding pipelines, prompt and response logging, evaluation frameworks.
- Hands-on expertise with vector databases, graph databases, and knowledge graphs
- End-to-end exposure to the model development lifecycle, including extensive experience training and deploying ML models in production environments.
- Deep knowledge of LLM APIs, prompt engineering, and conversational AI patterns.
- Proficiency in Python and modern software development practices (testing, code review, CI/CD).
Preferred Skills
- Hands-on experience with cloud-native ML infrastructure platforms
- Knowledge of vector databases (Pinecone, Weaviate, Qdrant) and embedding models
- Experience with model serving frameworks (vLLM, TensorRT, Ray)
- Familiarity with Anaplan or similar enterprise planning platforms
- Experience with A/B testing and experimentation frameworks for AI features
- Experience with model observability tools (LangSmith, W&B, MLflow)