AI Engineer Manager (Platforms), IT Business Services
Job description
We are seeking a motivated, hands-on Senior AI Engineer to join our ITS AI CoE. The ideal candidate will build the next generation of intelligent systems, moving beyond simple chatbots to fully autonomous, goal-oriented agentic workflows. You won't just query APIs; you will design cognitive architectures that reason, plan, and execute complex tasks. You will play a key role in implementing Generative AI, translating cutting-edge research into scalable, production-grade solutions that redefine enterprise capabilities. This role involves deep technical work with LLMs, SLMs, multi-modal systems, and vector- and graph-based knowledge retrieval. The role reports to the Head of AI CoE.
- Stay at the forefront of platform innovation by tracking and integrating the latest updates from Microsoft Copilot Agent Builder, Workflow Builder, AI Foundry, and GCP, leveraging these advancements to enhance enterprise AI capabilities.
- Design and implement agentic (multi-agent) systems and cognitive architectures, using frameworks like LangGraph and Deep Agents, that can perform complex reasoning, tool use, and long-horizon planning.
- Build sophisticated Retrieval-Augmented Generation (RAG) pipelines, integrating Knowledge Graphs (GraphRAG) and Vector Databases to provide deep, grounded context to LLMs.
- Write robust, production-ready Python code, utilizing CI/CD pipelines to ensure versioning, automated evaluation, and seamless deployment on Azure Cloud and Kubernetes environments.
- Apply data science methodologies to curate high-quality datasets, perform statistical analysis, and fine-tune models to improve domain-specific performance.
- Establish and maintain comprehensive MLOps and LLMOps workflows, managing the full lifecycle of models including versioning, experiment tracking, continuous training, and drift detection.
- Develop solutions that seamlessly process and generate text, image, audio, and video, creating rich user experiences.
- Optimize the performance of both new and existing AI solutions, specifically targeting reductions in latency and token consumption to ensure efficient, scalable, and cost-effective deployments.
- Implement comprehensive evaluation frameworks to measure agent performance (accuracy, latency, token cost), and build debugging tools for AI workflows.
- Master management of the LLM context window by designing strategies for efficient state tracking, conversation history compression, and dynamic information injection, so that agents maintain high coherence and accuracy during long-running, multi-step workflows, in line with context engineering best practices.
- Rapidly prototype and validate AI solutions using emerging AI concepts, architectures, and advanced techniques.
- Work closely with software engineers, product managers, and other stakeholders to integrate ethical considerations into AI development.
- Stay updated on the latest trends, research, and best practices in AI.
- Propose and implement improvements to existing processes and methodologies.
Requirements
- A master's or bachelor's degree in computer science, AI, or a related quantitative field.
- Hands-on experience building stateful applications with Deep Agents and LangGraph; knowledge of self-hosting LangGraph on internal Azure servers is a plus.
- Experience working with Microsoft Copilot Agent Builder, Workflow Builder, AI Foundry, and GCP.
- Good understanding of the LLM application stack (context management, prompt engineering, knowledge graphs, vector DBs, orchestrators, evaluators).
- Experience with LLM tracing, monitoring, and debugging tools such as LangSmith, LangFuse, or similar tools to ensure agent reliability.
- Strong foundation in Data Science and Machine Learning, with experience in statistical modelling, feature engineering, and data manipulation using libraries like Pandas, NumPy, and Scikit-learn.
- Expertise working with Time Series, Regression, and Classification problems.
- Expertise working with bagging and boosting models (LightGBM, XGBoost, CatBoost).
- Expertise in building and fine-tuning neural networks, including convolutional networks, for regression and classification tasks.
- Experience implementing RAG application evaluation using libraries like Ragas or DeepEval to measure accuracy and retrieval quality.
- Experience in designing and deploying high-performance microservices using FastAPI.
- Proficiency in Python (object-oriented design, asyncio, Pydantic) and testing frameworks like Pytest.
- Experience with graph databases (e.g., Neo4j) and vector stores (e.g., Qdrant).
- Proficiency in building automated deployment pipelines using Azure DevOps (or GitHub Actions) and managing Kubernetes releases with Helm Charts.
- Excellent problem-solving skills and attention to detail.
- Strong communication and interpersonal skills.
- Ability to work effectively in a collaborative, fast-paced environment.
About the company
Deloitte drives progress. Our firms around the world help our clients become market leaders wherever they compete. Deloitte invests in outstanding people with diverse talents and backgrounds, empowering them to achieve more than they can elsewhere. Our work combines consulting with action and integrity. We believe that when our clients and society are stronger, so are we.