AI Engineer
Job description
As an AI Engineer specializing in Agentic AI enablement, you will participate in the design and delivery of production-grade agent capabilities built on the enterprise AI Backbone across cloud and edge environments, spanning supply-chain and global functions. You will be responsible for end-to-end delivery of key agent modules and integration patterns (MCP/tooling), establish strong evaluation and regression discipline, and drive adoption by partnering with transformation teams, BUs, platform engineering, and enterprise application owners. You serve as a technical engine for the workstream: translating business workflows into measurable agent outcomes, mitigating identified risks, evaluating and experimenting with options and tradeoffs, and scaling solutions across domains.
- Lead design and productionization of high-leverage agent modules and reusable patterns (tool-use orchestration, policies/guardrails, memory, RAG where it adds measurable value), built as composable components and reference implementations. (Execute/Lead)
- Translate ambiguous product/problem statements into concrete agent behaviors and system designs: state models, failure modes, tool contracts, latency budgets, and acceptance criteria that engineering + product can execute against. (Execute/Consult)
- Deliver quickly without sacrificing quality: create thin vertical slices, iterate with evidence, and converge on robust behavior under real-world constraints. (Execute)
- Drive meaningful performance gains via systematic optimization: latency, token efficiency, tool-call success, retrieval quality, and cost per successful task, including remediation of long-tail failure modes. (Execute)
- Proactively identify platformizable opportunities: refactor one-off implementations into shared frameworks/SDKs that reduce build time for others. (Execute/Influence)
Evaluation, Testing & Release Quality (25%)
- Define and implement evaluation strategies for assigned workflows: golden sets, scenario coverage maps, regression suites, online/offline metrics, and release gating thresholds aligned to real business outcomes. (Execute/Consult)
- Build repeatable evaluation systems (templates, labeling guidance, dataset/versioning conventions, dashboards/reports) so evaluation becomes a productized capability, not ad hoc testing. (Execute/Lead)
- Implement robust automated testing across layers: unit tests for prompt/tool wrappers, contract tests for tool schemas, integration tests for toolchains, and agent simulation tests for multi-step flows. (Execute)
- Lead root-cause analysis of quality failures (hallucinations, tool misuse, retrieval misses, routing errors): isolate causes (prompt/tool/data/model), implement corrective actions, and prevent regressions. (Execute)
- Champion evidence-first iteration: decisions and releases are backed by eval results, not gut feel. (Influence)
Model/Prompt Routing Contributions (15%)
- Contribute to router design and task-to-model mapping through routing rules/classifiers, prompt strategies, and model selection policies; validate decisions using evaluation data and runtime telemetry. (Execute/Consult)
- Propose and implement routing improvements when constraints change (pricing, latency, throughput, new model capabilities), with governance-aware rollouts and rollback plans. (Consult/Execute)
- Identify and mitigate routing failure modes (over-escalation to expensive models, under-routing causing quality loss, brittle heuristics) and improve robustness using lightweight ML or rules where appropriate. (Execute)
Integration with Tools and MCPs (15%)
- Lead implementation of MCP connectors/clients for enterprise apps and internal data products with strong engineering hygiene: schema/versioning discipline, typed contracts, scopes/permissions, auditability, and integration test strategy. (Execute/Consult)
- Build reusable integration patterns: standardized tool metadata, error normalization, retries/timeouts, idempotency, pagination handling, and consistent auth patterns to accelerate onboarding of new tools. (Execute)
- Collaborate with security/data owners to ensure secure-by-design tool access (least privilege, logging, PII handling, policy enforcement). (Consult/Execute)
Operational Readiness, Collaboration & Continuous Improvement (10%)
- Ensure production readiness for owned components: telemetry coverage, structured logging, traceability for tool calls, SLI/SLO alignment (latency, success rate, cost), and participation in incident response and postmortems. (Execute/Consult)
- Proactively identify delivery risks (dependencies, rate limits, data quality, security scopes, vendor constraints) and drive resolution with clear tradeoffs and recommendations. (Consult/Influence)
- Mentor peers through technical leadership: raise code quality, share patterns, review PRs for correctness/performance/security, and contribute to internal playbooks. (Influence)
Decision-Making Autonomy: High-moderate - significant autonomy in AI engineering design choices and evaluation approach; aligns with standards and escalates policy/security-impacting decisions.
Supervision Required: Moderate-low - general direction from Transformation and Tech Executives and SMEs; self-directed execution with periodic design, execution, and ROI reviews.
Complexity of Role: High - spans agent design, evaluation rigor, integration complexity, cross-team delivery, and deep business/domain expertise under evolving constraints.
Cross-Functional Interactions: Yes - continuous interaction with domain transformation leads, platform/SRE, security, and enterprise app teams.
Differentiating behaviors, leadership skills, and soft skills required for success in the role:
- Ownership: drives outcomes end-to-end for a workstream area (not just tasks)
- Collaboration & customer focus: influences stakeholders to deliver workflow value and adoption
- Communication & adaptability: provides clarity on progress, risks, and evaluation evidence to business, technical, and PMO stakeholders
- Proactiveness & initiative: anticipates constraints, proposes options/tradeoffs early
- Strategic thinking: contributes to roadmap sequencing and reusable patterns across domains
Key Differentiators:
- Demonstrates proven history of creating solutions with order-of-magnitude improvements over standard approaches
- Possesses rare combination of deep technical expertise and business understanding
- Creates solutions that scale beyond their direct involvement (leveraged impact)
- Consistently elevates the performance of teams and individuals around them
- Identifies and solves problems others haven't recognized yet
- Maintains extraordinary productivity while ensuring knowledge transfer
- Balances technical perfectionism with pragmatic business value
- Communicates complex technical concepts effectively to both technical and non-technical stakeholders
Requirements
- Bachelor's in CS/AI/ML or equivalent experience required
- Master's preferred
- 6-8 years of experience across the software development life cycle
- Expertise in ML (structured and unstructured data) development and engineering
- Proven experience shipping LLM/agent solutions to production with measurable quality and operational practices.
Required Expertise
- Advanced Software Engineering: Python (and Java) mastery with distributed systems expertise; performance optimization (profiling, parallelization); architecture patterns (e.g., FastAPI, asyncio, Pydantic)
- LLM & Agent Systems: Multi-agent orchestration (LangChain, LangGraph, CrewAI); advanced prompt engineering; custom agent memory architectures; model optimization techniques
- Evaluation Framework Development: Statistical evaluation design (confidence intervals, power analysis); benchmark creation; instrumentation frameworks (e.g., MLflow, Arize); regression testing systems
- ML Operations: Production deployment pipelines (Docker, Kubernetes, Ray); model registry management; scaled inference optimization; GPU utilization optimization
- Enterprise Integration: Enterprise connector development; scalable API architectures; data pipeline engineering (Kafka, gRPC, Redis); authorization protocol implementation
- Observability Engineering: Telemetry system design (Prometheus, OpenTelemetry); automated anomaly detection; distributed tracing; performance dashboarding (Grafana)
- System Architecture: Microservice design patterns; high-throughput event processing; fault-tolerance implementation; horizontal scaling architectures
- Technical Leadership: Architecture governance systems; engineering standards development; build-vs-buy evaluation frameworks; technical roadmap creation
Good-to-have Skills
- Full-stack development experience on a modern stack
- Modeling user interactions with AI systems; modeling multi-agent behavior loops with tools like Temporal
- Agentic memory patterns and their use with tools like Mem0 and Temporal
- Experience with agentic RAG; domain-level semantic layer design with graph and vector databases
Benefits & conditions
- Paid parental leave
- Health insurance
- Retirement plan
- Paid time off
- Vision insurance
- Dental insurance
- The expected compensation range for this position is between $93,500 and $156,450.
- Location, confirmed job-related skills, experience, and education will be considered in setting actual starting salary. Your recruiter can share more about the specific salary range during the hiring process.
- Bonus based on performance and eligibility; target payout is 10% of annual salary, paid out annually.
- Paid time off subject to eligibility, including paid parental leave, vacation, sick, and bereavement.
- In addition to salary, PepsiCo offers a comprehensive benefits package to support our employees and their families, subject to elections and eligibility: Medical, Dental, Vision, Disability, Health, and Dependent Care Reimbursement Accounts, Employee Assistance Program (EAP), Insurance (Accident, Group Legal, Life), Defined Contribution Retirement Plan.