Senior Machine Learning Ops Engineer

Sheetz
Pittsburgh, United States of America
5 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$ 159K

Job location

Remote
Pittsburgh, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Azure
Big Data
Cloud Computing
Computer Engineering
Continuous Integration
DevOps
Monitoring of Systems
Machine Learning
Metadata
TensorFlow
Software Engineering
Software Requirements Analysis
Google Cloud Platform
Cloud Platform System
PyTorch
Facebook Flow
Containerization
Kubernetes
Information Technology
Production Code
Performance Monitor
Machine Learning Operations
Terraform
Software Version Control
Data Pipelines
Docker

Job description

A Senior Machine Learning Ops Engineer at Sheetz ensures that AI models move seamlessly from "working on a laptop" to running reliably across our stores, applications, and systems at scale. This role powers capabilities like smarter inventory management, enhanced customer experiences, and faster decision-making that keeps pace with the way Sheetz operates. The MLOps Engineer designs, builds, and maintains the pipelines, deployment processes, and monitoring systems that allow models to run continuously and perform consistently. Just as Sheetz kitchens operate around the clock to serve customers, this role keeps our AI systems running 24/7, using data as the ingredients and algorithms as the recipes that drive our technology.

This role qualifies for a remote work arrangement within our 7-state footprint (PA, OH, MI, WV, VA, MD, NC).

OVERVIEW

Lead the design, deployment, and optimization of robust ML infrastructure and scalable pipelines that operationalize machine learning models at scale. Drive the adoption of ML Ops best practices across teams, ensure reproducibility and governance, and champion automation, reliability, and scalability throughout the ML lifecycle. Utilize advanced experience with orchestration frameworks, CI/CD workflows, cloud platforms, and model observability, and partner cross-functionally with Data Science, Engineering, and DevOps teams to productionize ML capabilities and continuously enhance the organization's ML maturity.

RESPONSIBILITIES (other duties may be assigned)

  1. Lead the end-to-end development and optimization of ML pipelines, including training, validation, deployment, monitoring, and retraining workflows at scale.

  2. Guide the use of and implement infrastructure for tools such as MLflow, TensorFlow, PyTorch, Docker, and Kubernetes to support scalable production workflows for model deployment and lifecycle management.

  3. Design and monitor tools for performance monitoring, drift detection, and automated alerting.

  4. Develop CI/CD pipelines to enable safe, rapid model iteration, deployment, and retraining across environments.

  5. Write, review, and maintain high-quality, production-ready code, ensuring robust, reproducible, and secure ML systems.

  6. Apply advanced software engineering and ML Ops best practices to operationalize machine learning solutions efficiently and reliably.

  7. Collaborate with cross-functional teams to align ML solutions with business needs and system requirements and guide integration efforts to embed ML into production applications.

  8. Maintain thorough documentation, version control, metadata tracking, and lineage to support reproducibility and compliance of ML models.

  9. Recommend and implement improvements to ML infrastructure, frameworks, and operational standards, elevating the organization's ML maturity and capabilities.

  10. Mentor and coach junior engineers, providing guidance on technical challenges, workflow design, and career development.

Requirements

(Equivalent combinations of education, licenses, certifications and/or experience may be considered)

  • Bachelor's degree in Computer Science, Management Information Systems, Computer Engineering, or related discipline required
  • Minimum of 5 years of hands-on experience in designing, developing, and operationalizing machine learning solutions, with a strong focus on ML Ops practices and infrastructure, required
  • Previous experience working with large databases - both structured and unstructured - to build data pipelines and self-service dashboards for business users required
  • Previous experience in managing machine learning pipelines, lifecycle management, and deployment at scale, including training, validation, serving, and monitoring, required
  • Previous experience with CI/CD pipelines for ML workflows and containerization tools such as Docker and Kubernetes preferred
  • Previous experience with secure and scalable cloud environments (e.g., AWS, GCP, Azure) and infrastructure-as-code and platform-as-a-service (PaaS) offerings preferred

Licenses/Certifications

  • Cloud Platforms (AWS, GCP, Azure) preferred
  • MLOps tools and frameworks (e.g., MLflow, Kubeflow, TFX) preferred
  • DevOps certifications (e.g., Docker, Kubernetes, Terraform, CI/CD Tools) preferred

Tools & Equipment

  • General Office Equipment

Benefits & conditions

This position offers a base salary range of $95,351 - $158,922 per year, depending on experience and qualifications, plus bonus based on company performance.

One of the MANY work perkz at Sheetz is quarterly employee bonuses based on company performance! And there's more - A LOT more… like competitive salaries, PTO and parental leave, 401k match and employee stock ownership, limitless professional development and growth opportunities, tuition reimbursement, full medical, vision and dental coverage, and snack discounts!
