Senior AI Engineer
Job description
Genesis10 is seeking a Senior AI Engineer for our client in the Wealth Management industry. This direct-hire, permanent position is located in Plano, TX or Camas, WA. The role is mainly onsite.

The AI Engineer will architect and implement production-ready generative AI systems, working closely with AI leads, ML engineers, and platform teams. You will develop solutions powered by LLMs, GPU frameworks, and scalable microservices, ensuring they are performant, secure, and seamlessly integrated with enterprise systems. You will report to the AI Team Leader, Technology Innovation.

Responsibilities
Design, develop, fine-tune, and deploy generative AI models into scalable production environments
Build and maintain APIs and microservices using FastAPI to expose AI capabilities enterprise-wide
Collaborate with the AI Infrastructure team to architect robust LLM pipelines, including training workflows and retrieval-augmented generation (RAG) systems
Integrate AI solutions into enterprise applications using secure, cloud-native architectures and best practices
Ensure AI models are explainable, reliable, and compliant with regulatory and internal governance standards
Continuously monitor and optimize model performance using evaluation frameworks, observability tools, and iterative fine-tuning
Requirements
10 years of IT industry experience
7 years of experience in: building data-driven software solutions; Python, with practical experience in LLMs, embeddings, and RAG architecture
3 years of hands-on AI development experience
Demonstrated experience with generative AI models, including multimodal models
Hands-on experience with cloud-native AI infrastructure: Azure Foundry, Kubernetes, Docker, vector databases, GPU clusters, and AI model governance frameworks
Bachelor's degree in Computer Science, AI, or a related field (or equivalent professional experience)
Desired Skills:
GPU-accelerated training and inference using NVIDIA technologies, including the NIM and NeMo frameworks
Optimizing and scaling AI models with NVIDIA NIM, and fine-tuning models with NeMo services
Familiarity with AI agentic frameworks and deploying AI agents in production environments
Deploying models with low latency and high throughput, using frameworks like vLLM and other GPU model deployment tools
Hands-on experience with CI/CD pipelines for GenAI workflows, along with containerization and orchestration (Docker, Kubernetes)
Distributed training, GPU optimization, and large-scale model frameworks (CUDA, RAPIDS)