Senior Software Engineer - AI Platform Engineering
Job description
As an AI Platform Engineer, you'll be part of the team designing and running the core AI platform. You'll work on APIs, pipelines, observability, security, and orchestration, ensuring our AI solutions can move from experiment to production smoothly. This role is about building the foundations of AI adoption - if you enjoy combining distributed systems, cloud engineering, and AI tooling into something bigger, this is it.
We are looking for someone who takes pride and ownership in what they build: someone able to voice their opinions and share their expertise, but also to listen and help the team reach the right decision. They shouldn't be afraid to celebrate their successes, admit their mistakes, and turn to others for help, maintaining a sense of honesty and humility. They should look to improve each day, caring about the tech, the architecture, and the people they work with, and understanding both the small details and the big picture in everything we do. Finally, they should be able to coach, mentor, and inspire those around them, embedding excellence, a sense of safety, and a desire to succeed in their teams, and ensuring these values are upheld at all levels.
What you will do…
- Design and build the AI platform that powers LLMs, agents, and other AI solutions across Dojo.
- Develop APIs, SDKs, and tooling that let product teams consume AI capabilities at scale with a great developer experience.
- Implement orchestration for multi-model and multi-service workflows using agentic frameworks (e.g., LangGraph, CrewAI, Google Agent Development Kit).
- Build and manage vector search and retrieval systems to support RAG and knowledge integration.
- Build robust monitoring, logging, and guardrails to keep AI systems safe, observable, and compliant, using tools such as LangSmith, Opik, Prometheus, and Grafana.
- Automate infrastructure and model deployment with Kubernetes, Terraform, and CI/CD pipelines.
- Partner with security, compliance, and product to ensure safe use of AI in production.
- Stay on top of AI platform trends, open-source tools, and emerging patterns - bringing best practices into our stack.
Requirements
- Strong software or platform engineering background (Python, Go, or Java; .NET a bonus).
- Solid experience with distributed systems, microservices, and cloud-native architecture (GCP preferred).
- Hands-on experience with Kubernetes, service mesh, and event-driven systems.
- Familiarity with LLM orchestration frameworks (LangChain, LangGraph, CrewAI, GCP ADK or similar).
- Experience with vector databases (FAISS, Pinecone, Weaviate, Vertex Vector Search) and RAG pipelines.
- Knowledge of MLOps/AI infrastructure tools (MLflow, Vertex AI, Ollama, OpenRouter, etc.).
- Strong CI/CD and infrastructure-as-code skills (Terraform, Helm, etc.).
- Good understanding of AI governance, monitoring, and responsible AI practices.
- Comfort balancing speed (PoCs) with robustness (production-ready systems).