AI Solution Architect

AgreeYa Solutions, Inc.

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Remote

Tech stack

Artificial Intelligence

Amazon Web Services (AWS)

Application Layers

Azure

Performance Tuning

Search Technologies

Large Language Models

Model Validation

Machine Learning Operations

TensorRT

NVIDIA NIM

GPT

Data Pipelines

Microservices

Job description

Lead AI discovery and use case prioritization, scoring opportunities on data sensitivity, cost at scale, latency, feasibility, and governance exposure.

Design end-to-end architectures spanning RAG, agentic workflows, data pipelines, model serving, and guardrails.

Make model-selection recommendations across closed (GPT, Claude, Gemini) and open-weight (Llama, Mistral, Qwen) options, applying a structured hard-attribute / soft-attribute framework.

Choose the deployment target deliberately (Azure, AWS, on-prem NVIDIA AI factory, or hybrid) and document the rationale.

Define interface specifications between AgreeYa's application layer and partner-owned infrastructure (for example, model-serving endpoint contracts and performance baselines), protecting against handoff and dependency risk.

Own the application-level governance design: NIST AI RMF alignment, risk tiering, human-in-the-loop placement, and audit and explainability requirements.

Set delivery standards and review the work of AI Engineers and MLOps Engineers for architectural soundness.

Requirements

Demonstrated experience architecting production AI/ML solutions, not only pilots and proofs of concept.

Deep RAG fluency: chunking strategy, embedding model selection, vector search, and retrieval evaluation.

Working knowledge of agentic patterns, orchestration (LangChain / LangGraph), and tool integration (including MCP).

Strong grasp of model selection, fine-tuning vs RAG trade-offs, and inference cost/latency economics.

Able to lead technical client conversations and defend design decisions to a skeptical technical audience.

Must be able to architect and reason fluently across all three of the following, and recommend among them:

Azure AI: Azure AI Foundry, Azure OpenAI Service, Azure AI Search, Azure ML.

AWS AI: Amazon Bedrock, SageMaker, OpenSearch, Lambda-based serving.

On-prem NVIDIA AI factory: NVIDIA AI Enterprise (NVAIE), NIM microservices, Triton Inference Server, NeMo and NeMo Guardrails, Run:ai, TensorRT-LLM, and quantized/air-gapped deployment (GGUF, vLLM).
