ML Infrastructure Architect

OpenKyber LLC

30 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Remote

Tech stack

Amazon Web Services (AWS)

Azure

Cloud Computing

Databases

Continuous Integration

DevOps

Python

NumPy

TensorFlow

Google Cloud Platform

PyTorch

Large Language Models

Prompt Engineering

Deep Learning

CloudFormation

Pandas

Containerization

Scikit-learn

Kubernetes

Information Technology

Low Latency

Machine Learning Operations

Terraform

Docker

Job description

Key Responsibilities

Model Development: Design, train, and optimize ML models using frameworks like PyTorch or TensorFlow.

GenAI Implementation: Lead the integration of LLMs, including fine-tuning, prompt engineering, and building RAG (Retrieval-Augmented Generation) pipelines.

Infrastructure & Orchestration: Architect and maintain end-to-end ML pipelines (CI/CD for ML) using Docker, Kubernetes, and tools like MLflow or Kubeflow.

Cloud Deployment: Deploy and manage production workloads on cloud platforms (AWS, Google Cloud Platform, or Azure) with a focus on cost-efficiency and low latency.

Monitoring & Governance: Implement robust monitoring for model drift, data quality, and performance metrics to ensure 24/7 reliability.

Collaboration: Work closely with Data Scientists to productize research and with DevOps to align with enterprise security and infrastructure standards.

Requirements

Do you have experience in Pandas?

Do you have a Bachelor's degree?

Skill Matrix to be filled by Candidates: for each mandatory skill, state Years of Experience, Year Last Used, and Rating Out of 10.

End-to-End MLOps Automation

GenAI Orchestration

LLMOps

Advanced Model Optimization & Inference

Technical Requirements

Experience: 4+ years of hands-on experience in ML Engineering or MLOps roles.

Core Stack: Expert-level proficiency in Python and standard ML libraries (Scikit-learn, Pandas, NumPy).

Deep Learning: Strong experience with Transformers, CNNs, or RNNs.

DevOps for ML: Mastery of containerization (Docker) and orchestration (K8s). Experience with Infrastructure as Code (Terraform/CloudFormation) is a major plus.

GenAI Tools: Familiarity with LangChain, LlamaIndex, or Vector Databases (Pinecone, Milvus, Weaviate).

Education: B.S./M.S. in Computer Science, Mathematics, or a related quantitative field.