Senior Machine Learning Operations Engineer

BetMGM LLC

Jackson Township, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

$135,000 - $170,000

Job location

Jackson Township, United States of America

Tech stack

API

Artificial Intelligence

Amazon Web Services (AWS)

ARM

Continuous Integration

Information Engineering

Cursor

Software Debugging

Distributed Systems

Fault Tolerance

Identity and Access Management

Python

Machine Learning

Performance Tuning

Role-Based Access Control

Azure

Software Systems

Data Streaming

Systems Integration

Management of Software Versions

Feature Engineering

GitHub Copilot

Large Language Models

Snowflake

Kubernetes

Information Technology

Apache Flink

Kafka

Machine Learning Operations

API Gateway

Terraform

Virtual Private Clouds

Docker

Job description

The Senior MLOps Engineer treats ML systems as software systems and owns the path from a trained model to a production endpoint that meets its latency, cost, and reliability budgets - across both batch scoring (SageMaker Batch Transform, Snowflake Cortex / Snowpark ML, dbt-orchestrated scoring) and real-time inference (SageMaker real-time endpoints, Lambda + Bedrock, sub-second feature serving). The Senior Engineer builds the platform that data scientists and ML engineers ship on: feature store with guaranteed online/offline parity, model registry, CI/CD for ML, drift and quality monitoring, champion/challenger and shadow deployment scaffolding. This requires a software-engineering-first mindset - distributed systems, observability, and on-call instincts are the foundation; ML literacy makes them effective for this role. GenAI integration experience is a plus, not a requirement.

ML Production Platform

Stand up and operate BetMGM's ML platform on AWS (SageMaker Training, Model Registry, Pipelines, Endpoints, Batch Transform) and Snowflake (Snowpark ML, Cortex), with Terraform-managed infrastructure.
Build self-service scaffolds that let data scientists ship a model end-to-end without a ticket queue - cookie-cutter project templates with CI, drift monitoring, alerting, IaC, and Snowflake connectivity pre-baked.

Batch and Real-Time Inference

Design and operate batch scoring pipelines - SageMaker Batch Transform, dbt-orchestrated scoring against Snowflake, Snowpark ML - with explicit freshness and cost SLAs.
Design and operate real-time inference paths - SageMaker real-time endpoints, Lambda + Bedrock for GenAI, API Gateway - with stated latency budgets (typically sub-100ms) and graceful degradation under load.
Own the feature store (SageMaker Feature Store, Tecton, or Feast) with guaranteed online/offline parity - training-serving skew is treated as an incident, not a tradeoff.

CI/CD and Deployment Patterns

Build CI/CD for ML - model registry, automated retraining triggers, model versioning, lineage from feature to training run to deployed model to live prediction.
Implement champion/challenger, shadow deployments, and canary releases as platform primitives so individual model teams do not reinvent them per project.

Monitoring, Drift & Reliability

Stand up drift detection, data quality, and model performance monitoring (Evidently, Arize, or SageMaker Model Monitor - pick one and standardize) with paging that routes to humans who can fix it.
Own MLOps incident response - production model failures are SEV events with postmortems.

Cost and Performance

Right-size endpoints, batch caching, request batching, and autoscaling. State cost-per-prediction targets up front and meet them.

GenAI Integration (Plus, Not Required)

Integrate LLM APIs (Bedrock, Anthropic, OpenAI) into production paths - RAG pipelines, agent eval frameworks, prompt versioning, cost and latency observability.
Partner with the Helix team on AI personalization workloads as they ramp toward March Madness 2027.

AI in the Engineering Loop

Direct AI coding agents (Claude Code, Cursor, GitHub Copilot, dbt Copilot) as a force multiplier across infrastructure code, eval suites, and model-serving glue - designing work for agents to do, not just accepting their suggestions.

Collaboration

Partner with the data engineering team on shared standards (Terraform modules, CI/CD patterns, observability, lineage).
Work alongside data scientists and analytics partners to land the right interfaces between research and production - opinionated about the boundary.
Coordinate with Entain India and contractor ML partners as workloads consolidate onto the BetMGM-owned platform.

Requirements

BS or MS in Computer Science, Math, Statistics, Machine Learning, or other STEM field - or equivalent practical experience. Practical experience wins ties; a PhD is neither required nor a tiebreaker.

Must-Haves

5+ years shipping software in production - Python, Docker, Kubernetes or ECS, CI/CD, distributed systems debugging - including time on-call.
3+ years operating ML in production - you have owned a model in prod that served real traffic, with stated latency and cost budgets and a runbook you wrote.
AWS depth across the SageMaker surface (Training, Endpoints, Batch Transform, Model Registry, Pipelines) plus the supporting cast (IAM, Lambda, ECS, S3, Secrets Manager, VPC).
Snowflake fluency - Snowpark ML, Cortex, dbt-orchestrated batch scoring, RBAC for ML workloads.
IaC for ML - Terraform + SageMaker Pipelines or equivalent. No manual console deployments to production.
Feature store experience - SageMaker Feature Store, Tecton, or Feast - with explicit ownership of online/offline parity.
Champion/challenger, shadow, and canary deployment patterns as production muscle, not blog-post familiarity.
Drift and model monitoring - Evidently, Arize, WhyLabs, or SageMaker Model Monitor - wired to a paging path.
Software-engineering-first mindset - you treat ML systems as systems, not notebooks.

Nice-to-Haves

GenAI in production - Bedrock, Anthropic, or OpenAI APIs integrated into live systems; RAG pipelines; vector DBs (Snowflake Cortex Search, pgvector, Pinecone); evaluation frameworks (Langfuse or in-house).
Snowflake-native ML - Snowpark Container Services, Cortex AISQL, Cortex Agents - for workloads that do not need to leave the warehouse.
Streaming feature engineering - Kafka, Flink, or Snowpipe Streaming - for sub-second features.
Fine-tuning experience - LoRA, QLoRA, instruction tuning, eval-driven iteration - with an honest read on when fine-tuning beats prompting.
A track record of shipping more with AI in the engineering loop than without.
Regulated-industry experience (gaming, fintech, healthcare) - comfort with model risk, audit, and lineage requirements.

The annual salary range for this position is $135,000 to $170,000. Factors which may affect starting pay within this range may include geography/market, skills, education, experience and other qualifications of the successful candidate. This position is also eligible for participation in a performance-based bonus plan.

Applicants must possess legal authorization to work for our company in the U.S. without the need for immigration sponsorship. At this time, this role is not eligible for immigration-related employment authorization sponsorship including H-1B, O-1, E-3, TN, OPT, etc.

Benefits & conditions

BetMGM (part of MGM Resorts International). New Jersey, hybrid work. $135,000 - $170,000 a year, full-time.

Professional development assistance
Health insurance
401(k) matching
Paid time off
Vision insurance
Dental insurance
Flexible spending account

We're committed to giving you, as a valued team member, the resources and support you need to thrive. Our benefits and perks include:
Medical, Dental, Vision, Life, and Disability Insurance
401(k) with company match
Pre-tax spending accounts including health care FSA and commuter savings
Flexible paid time off
Professional development reimbursement and ongoing skills training opportunities
Employee resource groups
Swag, ticket giveaways, and more!

At BetMGM, we recognize that every individual plays a meaningful role in our success. That's why we're committed to building a respectful, inclusive workplace. It's the strategy behind every win. By meeting people where they are, we create a culture of belonging where everyone can thrive and a workplace that reflects our values, our people, and our drive to win.

About the company

Ready to make your career legendary? Join us as we bring the magic of Vegas to our players. The BetMGM team has over 1,400 talented members, revolutionizing sports betting and online gaming in the United States and Canada. We're a brand with technology at our hearts and the most driven and focused talent in the business.

As an online gaming company, BetMGM is required to comply with state gaming regulations, which include licensing obligations. Applicable employees must be licensed by at least one jurisdictional agency, although certain positions require licensing by multiple agencies. Failure to become licensed or maintain licensure with each agency as required for the role may result in termination of employment. Please note that the licensing process includes comprehensive background checks, which may include a review of criminal records, financial history, and personal background verification. In addition, candidates must comply with and support BetMGM's responsible gambling policies, procedures, and initiatives.

About BetMGM

BetMGM is revolutionizing sports betting and online gaming in the United States and Canada. We are a partnership between two powerhouse organizations: MGM Resorts International and Entain Group. You know our name through our exciting portfolio of brands including BetMGM Casino, BetMGM Sportsbook, Borgata Online, Party Casino and Party Poker. We aim to bring our ideas into action and find ways to deliver the best quality in gaming platforms.

BetMGM LLC is an Equal Opportunity Employer. We provide equal employment opportunities to all qualified individuals, regardless of race, religion, gender, gender identity, age, marital status, national origin, sexual orientation, citizenship status, veteran status, disability, or any other legally protected status.
As an organization, we are unwavering in our commitment to maintaining a discrimination-free work environment, and fostering a culture of inclusivity, belonging and equal opportunity for all employees and applicants.
