AI Solution Architect

AgreeYa Solutions, Inc.
yesterday

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English

Job location

Remote

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Application Layers
Azure
Performance Tuning
Search Technologies
Large Language Models
Model Validation
Machine Learning Operations
TensorRT
NVIDIA NIM
GPT
Data Pipelines
Microservices

Job description

Lead AI discovery and use case prioritization, scoring opportunities on data sensitivity, cost at scale, latency, feasibility, and governance exposure.
Design end-to-end architectures spanning RAG, agentic workflows, data pipelines, model serving, and guardrails.
Make model-selection recommendations across closed (GPT, Claude, Gemini) and open-weight (Llama, Mistral, Qwen) options, applying a structured hard-attribute / soft-attribute framework.
Choose the deployment target deliberately: Azure, AWS, on-prem NVIDIA AI factory, or hybrid, and document the rationale.
Define interface specifications between AgreeYa's application layer and partner-owned infrastructure (for example, model-serving endpoint contracts, performance baselines), protecting against handoff and dependency risk.
Own the application-level governance design: NIST AI RMF alignment, risk tiering, human-in-the-loop placement, audit and explainability requirements.
Set delivery standards and review the work of AI Engineers and MLOps Engineers for architectural soundness.

Requirements

Demonstrated production AI/ML solution architecture, not only pilots and proofs of concept.
Deep RAG fluency: chunking strategy, embedding model selection, vector search, retrieval evaluation.
Working knowledge of agentic patterns, orchestration (LangChain / LangGraph), and tool integration (including MCP).
Strong grasp of model selection, fine-tuning vs RAG trade-offs, and inference cost/latency economics.
Able to lead technical client conversations and defend design decisions to a skeptical technical audience.

Must be able to architect and reason fluently across all three of the following, and recommend between them:
Azure AI: Azure AI Foundry, Azure OpenAI Service, Azure AI Search, Azure ML.
AWS AI: Amazon Bedrock, SageMaker, OpenSearch, Lambda-based serving.
On-prem NVIDIA AI factory: NVIDIA AI Enterprise (NVAIE), NIM microservices, Triton Inference Server, NeMo and NeMo Guardrails, Run:ai, TensorRT-LLM, and quantized/air-gapped deployment (GGUF, vLLM).
