Forward Deployed Engineer (Generative AI)
Job description
The Forward Deployed Engineer (FDE) drives the on-site deployment, integration, and scaling of our enterprise Generative AI solutions. This role embeds directly within customer engineering teams to operationalize Large Language Models (LLMs) and retrieval systems across multi-cloud environments (AWS, Azure, GCP). You will bridge the gap between AI research and production-grade cloud infrastructure.
You will collaborate with cross-functional teams and business partners, and you will have the opportunity to shape current and future strategy by applying your analytical skills to ensure business value and communicate results.
- AI Solution Deployment: Deploy, fine-tune, and optimize large-scale Gen AI models and LLM orchestration frameworks within customer cloud environments.
- Infrastructure Engineering: Architect scalable infrastructure for AI workloads utilizing GPU/TPU orchestration, high-performance storage, and low-latency networking.
- Data & Retrieval Pipelines: Design and implement high-throughput data ingestion pipelines and Vector Database architectures for Retrieval-Augmented Generation (RAG).
- Multi-Cloud Management: Build cloud-agnostic, resilient deployments across AWS, Azure, and GCP using Infrastructure as Code (IaC).
- Technical Advocacy: Act as the primary technical consultant, guiding enterprise clients through AI safety, prompt engineering patterns, and inference cost optimization.
- Product Collaboration: Feed edge-case deployment insights back to core AI research and platform engineering teams to improve product robustness.
Requirements
Do you have experience in software coding?
- AI Frameworks: Hands-on experience with LLM orchestration tools (LangChain, LlamaIndex, AutoGen) and deep learning frameworks (PyTorch, Hugging Face).
- Vector Databases: Production experience setting up and querying vector stores (Milvus, Pinecone, Qdrant, Chroma, or pgvector).
- Model Operations (LLMOps): Proficiency in model serving frameworks (vLLM, TGI, Triton Inference Server) and evaluation tools.
- Cloud & Containers: Advanced knowledge of cloud AI primitives (AWS Bedrock/SageMaker, Azure OpenAI, GCP Vertex AI) and Kubernetes (K8s) for GPU workloads.
- IaC & Automation: Mastery of Terraform or OpenTofu to provision complex multi-cloud compute environments.
- Programming: Strong coding skills in Python (preferred) or Go, with an emphasis on writing clean, concurrent code.
Soft Skills
- AI Consultation: Ability to manage customer expectations around LLM non-determinism, hallucinations, and performance trade-offs.
- Rapid Adaptability: Passion for keeping pace with the weekly advancements in the Generative AI landscape.
- Critical Debugging: Exceptional skill in isolating errors across complex software layers, from GPU drivers up to prompt engineering logic.
- Mobility: Willingness to travel to client sites to lead high-stakes, on-site deployment sprints.
Benefits & conditions
This position offers an excellent opportunity for significant career development in a fast-growing and challenging entrepreneurial environment with a high degree of individual responsibility.