Senior MLOps & Generative AI Engineer - Remote
Job description
As a Senior Engineer, you will partner closely with AI Scientists, Data Engineers, Software Engineers, Architects, and Product teams to operationalize AI/ML and Generative AI solutions at enterprise scale. You will play a key role in shaping the organization's AI platform strategy, driving best practices, and delivering scalable, secure, and reliable AI systems in production healthcare environments.
MLOps Engineering Responsibilities
- Design, build, and maintain scalable ML infrastructure and pipelines supporting model training, deployment, monitoring, governance, and lifecycle management.
- Develop and optimize CI/CD pipelines for machine learning and AI workloads across development, staging, and production environments.
- Build reusable ML platform capabilities including feature stores, model registries, experimentation frameworks, artifact management, and deployment automation.
- Implement scalable orchestration and workflow solutions for batch and real-time ML inference workloads.
- Create robust monitoring systems to measure model performance, detect model drift, monitor data quality, and ensure production reliability.
- Develop automation tools and self-service capabilities to improve the efficiency, scalability, and reliability of MLOps processes.
- Collaborate with Data Scientists and Software Engineers to streamline the ML lifecycle from experimentation through enterprise production deployment.
- Apply software engineering best practices to AI/ML systems including testing, observability, resiliency, security, versioning, and infrastructure-as-code.
- Identify gaps and improvement opportunities within the organization's ML platform ecosystem and architect scalable solutions to address them.
- Support enterprise AI governance, compliance, auditability, and model risk management requirements.
- Ensure platform scalability, reliability, security, and operational excellence across AI/ML systems.
Generative AI Engineering Responsibilities
- Lead the architecture, design, and deployment of enterprise Generative AI solutions leveraging LLMs, foundation models, and agentic AI systems.
- Design and implement Retrieval-Augmented Generation (RAG) pipelines using vector databases, embeddings, semantic search, reranking, and retrieval optimization strategies.
- Build scalable LLM orchestration frameworks using technologies such as LangChain, LlamaIndex, Semantic Kernel, or equivalent frameworks.
- Develop advanced prompt engineering strategies, prompt chaining, context management, and agent workflows to improve LLM accuracy and reliability.
- Evaluate and implement fine-tuning, parameter-efficient tuning, and prompt-based optimization approaches for domain-specific use cases.
- Build AI evaluation and benchmarking frameworks to measure hallucination rates, response quality, grounding accuracy, toxicity, bias, latency, and business performance metrics.
- Implement AI safety guardrails, governance controls, content filtering, and responsible AI practices for enterprise healthcare environments.
- Design scalable GenAI APIs and microservices supporting high-throughput enterprise AI applications.
- Optimize GenAI systems for cost, latency, throughput, and inference performance across cloud and hybrid environments.
- Integrate enterprise data sources, healthcare systems, and knowledge repositories into secure GenAI workflows.
- Research and evaluate emerging GenAI technologies, open-source frameworks, and foundation models to drive innovation and continuous improvement.
- Develop architecture diagrams, technical roadmaps, implementation strategies, and executive-level documentation for enterprise AI initiatives.
- Collaborate with cybersecurity, compliance, and infrastructure teams to ensure secure and compliant deployment of GenAI solutions involving PHI and sensitive healthcare data.
- Contribute to the development of AI platform standards, reusable GenAI accelerators, templates, and engineering best practices.
Requirements
- 5+ years of experience building and deploying production software, ML systems, or AI platforms.
- 1+ years of hands-on experience building production Generative AI or LLM-based applications.
- Strong programming skills in Python and experience with software engineering best practices.
- Experience with major deep learning and LLM frameworks such as PyTorch, Hugging Face Transformers, TensorFlow, or equivalent.
- Hands-on experience implementing RAG architectures, vector search, embeddings, prompt engineering, and LLM orchestration frameworks.
- Experience with vector databases such as Pinecone, Weaviate, Chroma, FAISS, Milvus, or equivalent technologies.
- Experience deploying AI/ML systems in cloud environments including AWS, Azure, or GCP.
- Strong understanding of APIs, distributed systems, microservices, and scalable backend architectures.
- Experience with Kubernetes, containerization, orchestration, and cloud-native infrastructure.
- Experience implementing CI/CD pipelines, infrastructure automation, and MLOps best practices.
- Experience building monitoring, observability, and alerting solutions for ML and AI systems.
- Strong understanding of AI/ML lifecycle management, governance, model versioning, and production operations.
- Experience designing secure, scalable, production-ready AI platforms and services.
- Strong communication and collaboration skills with the ability to work across technical and business teams.
- Previous experience implementing Generative AI and MLOps solutions within healthcare environments.
- Experience working with EPIC or healthcare interoperability platforms.
- Understanding of HIPAA, PHI handling, healthcare compliance, and responsible AI practices.
- Experience with AI governance frameworks, LLM evaluation methodologies, and AI safety tooling.
- Experience with GPU infrastructure optimization and scalable inference architectures.
- Familiarity with multi-agent AI systems and autonomous workflows.
- Experience with event-driven architectures, streaming pipelines, and real-time inference systems.
- Exposure to model fine-tuning techniques including LoRA, PEFT, RLHF, or domain adaptation strategies.
- Experience with enterprise AI platform architecture and internal developer platforms.
- Prior experience mentoring engineers and leading technical initiatives.
- 5+ years of relevant experience with a degree (Required), or 7+ years of relevant experience without a degree (Required).
- Experience in lieu of Bachelor's Degree.
Certification/Licensure
- No specific certification or licensure requirements.
- 5 to 7 years of relevant experience.
Benefits & conditions
We provide market-competitive compensation packages, inclusive of base pay, incentives, and benefits. The base pay rate for Full Time employment is: $91,416.00 - $152,380.80. Additional compensation may be available for this role such as shift differentials, standby/on-call, overtime, premiums, extra shift incentives, or bonus opportunities.
Keywords: MLOps, Gen AI, LLM, AWS, Azure, GCP, AI/ML, Python, PyTorch, Hugging Face Transformers, TensorFlow, RAG, EPIC, HIPAA, AI Governance