Machine Learning (MLOps) Engineer

Cox powered by Atrium
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$140,000 - $190,000 a year

Tech stack

API
Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Azure
Cloud Computing
Information Engineering
DevOps
Distributed Systems
GitHub
Python
Machine Learning
TensorFlow
Prometheus
Software Deployment
Workflow Management Systems
Datadog
Data Logging
Google Cloud Platform
Cloud Platform System
PyTorch
Snowflake
Grafana
Multi-Agent Systems
Spark
Reliability of Systems
Generative AI
CloudFormation
GitLab CI
Scikit-learn
Kubernetes
Infrastructure Automation Frameworks
Information Technology
Performance Monitor
Kafka
Machine Learning Operations
Hardware Infrastructure
Virtual Agents
Terraform
Docker
ELK
Jenkins
Microservices

Job description

Our client is seeking a highly skilled Machine Learning (MLOps) Engineer to support the deployment, automation, monitoring, and scalability of enterprise machine learning systems. This role will partner closely with Data Scientists, Software Engineers, DevOps teams, and business stakeholders to operationalize ML models in production environments. The ideal candidate has strong experience building CI/CD pipelines for ML workflows, managing cloud-native infrastructure, and supporting end-to-end machine learning lifecycle management.

  • Design, build, and maintain scalable MLOps platforms and infrastructure for machine learning model deployment and monitoring.
  • Develop and automate CI/CD pipelines for ML training, testing, validation, and production deployment.
  • Collaborate with Data Scientists and Engineering teams to productionize machine learning models and workflows.
  • Implement model versioning, experiment tracking, feature stores, and automated retraining pipelines.
  • Monitor model performance, drift, system reliability, and operational health across production environments.
  • Manage cloud infrastructure and containerized applications using Kubernetes, Docker, and Infrastructure-as-Code tools.
  • Optimize ML workflows for scalability, performance, security, and cost efficiency.
  • Support governance, compliance, and reproducibility standards for enterprise AI systems.
  • Troubleshoot infrastructure, deployment, and model performance issues across distributed systems.
  • Contribute to platform engineering best practices, automation strategies, and operational documentation.

Requirements

  • 4+ years of experience in Machine Learning Engineering, MLOps, DevOps, or Platform Engineering roles.
  • Strong experience deploying and managing ML models in production environments.
  • Hands-on expertise with Python and ML frameworks such as TensorFlow, PyTorch, or Scikit-learn.
  • Experience building CI/CD pipelines using tools such as GitHub Actions, Jenkins, GitLab CI, or ArgoCD.
  • Proficiency with Docker, Kubernetes, and container orchestration platforms.
  • Strong cloud experience with AWS, Azure, or Google Cloud Platform.
  • Experience with ML lifecycle and orchestration tools such as MLflow, Kubeflow, SageMaker, Vertex AI, or Airflow.
  • Familiarity with Infrastructure-as-Code tools, including Terraform or CloudFormation.
  • Strong understanding of distributed systems, APIs, microservices, and production monitoring.
  • Experience with logging and observability tools such as Prometheus, Grafana, Datadog, or ELK Stack.
  • Strong communication and cross-functional collaboration skills.

Preferred Experience/Skills for the Machine Learning (MLOps) Engineer:

  • Experience supporting Generative AI, LLMOps, or Agentic AI platforms.
  • Familiarity with vector databases, RAG pipelines, and AI orchestration frameworks.
  • Experience working in highly regulated environments such as finance, healthcare, or enterprise SaaS.
  • Knowledge of data engineering technologies such as Spark, Kafka, or Snowflake.
  • Exposure to GPU infrastructure and model optimization techniques.
  • Experience implementing security and governance controls for AI/ML systems.
  • Kubernetes or cloud platform certifications.

Education Requirements:

  • Bachelor's degree in Computer Science, Engineering, Data Science, Information Technology, or a related technical field is required.
  • Master's degree is preferred.

Benefits & conditions

New York, NY 10261
$140,000 - $190,000 a year

  • Health insurance
  • 401(k) matching
  • Paid time off
  • Vision insurance
  • Dental insurance
  • Paid holidays

  • Competitive salary and annual performance bonus.
  • Comprehensive medical, dental, and vision coverage.
  • 401(k) with company match.
  • Flexible PTO and paid holidays.
  • Hybrid work flexibility.
  • Professional development and certification reimbursement.
  • Employee wellness programs.
  • Access to cutting-edge AI/ML technologies and cloud platforms.
  • Collaborative and growth-oriented engineering culture.
