Senior Software Development Engineer - AI / Data Science
Job description
- Design, build, and maintain a scalable AI Platform that supports multiple engineering teams in delivering natural language conversation, RAG-based retrieval, and AI-driven data solutions.
- Develop core platform services including LLM routing, model abstraction layers, prompt management, and inference orchestration across cloud and on-premise infrastructure.
- Architect and implement RAG pipelines - including vector store integration, document ingestion, chunking strategies, and retrieval optimization - enabling teams to ground AI responses in enterprise data.
- Build secure, governed data access patterns that allow AI agents and models to query complex structured and unstructured data sources safely and efficiently.
AI Agent & Agentic Framework Development
- Engineer agentic capabilities including multi-step reasoning, tool use, and agent-to-agent (A2A) coordination patterns that empower downstream teams to deliver autonomous AI workflows.
- Implement and maintain MCP (Model Context Protocol) server registrations, enabling standardized tool discovery and invocation across the platform ecosystem.
- Contribute to the design of circuit breaking, retry logic, and guardrail mechanisms that ensure safe and reliable agentic behavior in production environments.
Platform Enablement & Developer Experience
- Partner with engineering teams across the organization to understand their AI delivery needs and translate them into platform capabilities, SDKs, and reusable components.
- Develop and maintain self-service tooling, APIs, and documentation that enable product engineers to integrate AI capabilities without deep platform expertise.
- Establish and enforce platform engineering standards around security, observability, cost management, and AI governance to ensure responsible AI delivery at scale.
Data & AI Intelligence
- Build and maintain AI-driven pipelines that process complex customer data to identify, surface, and deliver actionable business value through intelligent automation and insight generation.
- Collaborate with data scientists to productionize models and analytical workflows, ensuring seamless integration with platform data infrastructure including data lakes, warehouses, and streaming systems.
- Instrument platform telemetry and evaluation frameworks to measure AI system quality, latency, cost, and business impact across consuming teams.
Technical Leadership & Collaboration
- Serve as a technical leader and trusted partner to principal engineers, staff engineers, and data scientists - driving alignment on platform architecture and engineering standards.
- Participate in design reviews, threat modeling, and architectural decision-making, advocating for scalable, maintainable, and secure platform patterns.
- Mentor mid-level engineers through code reviews, pairing sessions, and technical guidance, raising the engineering bar across the broader platform team.
Requirements
- 5+ years of professional software development experience, with demonstrated depth in backend platform or infrastructure engineering and proven experience designing and building distributed systems or platform-level services that serve multiple internal engineering teams.
- Hands-on experience with large language model (LLM) integration, including prompt engineering, model API consumption, and managing inference pipelines in production.
- Strong proficiency in Python and/or Java/Go, with demonstrated ability to engineer production-quality, maintainable, and well-tested code. Solid understanding of RESTful API design, event-driven architecture, and asynchronous processing patterns as they apply to AI platform services.
- Experience with major cloud platforms (AWS preferred) and the services relevant to AI/ML workloads - including managed compute, storage, and model serving infrastructure.
- Experience working with AI orchestration frameworks such as LangChain, LangGraph, LlamaIndex, or equivalent agentic tooling.
Preferred technical and professional experience
- Experience with MCP (Model Context Protocol) or A2A (Agent-to-Agent) protocol design and implementation within multi-agent AI systems.
- Hands-on experience with AWS Bedrock, Azure AI Foundry, or watsonx as a managed AI platform for model hosting, fine-tuning, or inference routing.
- Familiarity with LiteLLM, OpenRouter, or similar LLM proxy/routing layers for abstracting multi-model inference across providers.
- Experience with Snowflake, including Snowpark, Cortex AI features, or Time Travel, as part of a data platform or AI analytics workflow.
- Background in IBM enterprise platforms including Apptio, Cloudability, or IBM ContextForge, with awareness of how AI augments financial and cloud cost management use cases.
- Knowledge of AI governance, responsible AI practices, and security controls for AI systems - including data privacy, access control, and output guardrails.
- Experience with observability tooling applied to AI systems - including LLM evaluation frameworks, token cost tracking, latency profiling, and quality metrics pipelines.
- Exposure to AI compliance requirements (e.g., FIPS, SOC 2, FedRAMP) and how they shape platform architecture decisions in regulated enterprise environments.
- Contributions to open-source AI tooling, published technical writing, or demonstrated thought leadership in the generative AI or ML platform space.
- Experience building internal developer platforms (IDPs) or platform-as-product models where the primary customer is an internal engineering audience.