MLOps Engineer - AI/ML Systems & Deployment (TS/SCI Preferred)

Rackner, Inc.
Dayton, United States of America
2 months ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English

Job location

Dayton, United States of America

Tech stack

Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Computer Vision
Azure
Cloud Computing
Computer Programming
Data Cleansing
Data Files
Distributed Systems
Revision Control Systems
Python
Machine Learning
Metadata Standards
Prometheus
Runbook
Management of Software Versions
Feature Engineering
Delivery Pipeline
Large Language Models
Grafana
SC Clearance
Kubernetes
Operational Systems
Machine Learning Operations
Software Version Control
Docker

Job description

Rackner is seeking an MLOps Engineer to support the deployment and lifecycle management of AI/ML systems within a secure, mission-focused environment.

This role is responsible for operationalizing machine learning capabilities: moving models from experimentation into reliable, deployable, and auditable systems.

You will work across:

machine learning
cloud-native infrastructure
distributed systems

...to ensure AI/ML systems are production-ready in environments where reliability, performance, and security are critical.

Responsibilities

Build and maintain production ML pipelines using tools such as Kubeflow, Airflow, or Argo
Deploy ML models into secure and constrained environments (including on-prem, air-gapped, or hybrid systems)
Implement model versioning, reproducibility, and lifecycle management (MLflow, ClearML)
Develop and operate containerized ML workloads using Docker and Kubernetes
Design and support model serving architectures (batch and real-time inference)
Monitor system and model performance using Prometheus, Grafana, OpenTelemetry
Support data preparation, feature engineering, and dataset versioning (lakeFS or similar)
Create technical documentation, runbooks, and operational standards
Collaborate with cross-functional teams to ensure successful integration into operational systems

Requirements

U.S. Citizenship (required for clearance eligibility)
Experience deploying ML systems into production environments
Strong programming skills in Python
Experience with Kubernetes and containerized systems (Docker)

Hands-on experience with:
  ML pipeline tools (Kubeflow, Airflow, Argo)
  Model tracking/versioning tools (MLflow, ClearML)

Understanding of distributed systems and scalable architectures
Experience with cloud platforms (AWS, Azure, or GCP)

Preferred qualifications

Active TS/SCI clearance
Experience with LLMs, transformer-based models, or computer vision systems
Familiarity with model serving frameworks and inference optimization
Experience working in regulated, defense, or mission-critical environments
Exposure to data versioning tools (lakeFS) and metadata standards
Experience supporting systems in air-gapped or secure environments

Clearance Requirements

Active TS/SCI clearance strongly preferred
Candidates with an active Secret clearance may be considered and supported for upgrade
Candidates without an active clearance must be:
  U.S. citizens
  eligible to obtain and maintain a clearance
  able to work in a CAC-enabled or secure environment

Benefits & conditions

Work on AI/ML systems that are deployed and used in real-world environments
Build systems that prioritize reliability, reproducibility, and operational impact
Gain experience operating within secure, high-trust environments
Collaborate on modern MLOps, DevSecOps, and cloud-native architectures

About Rackner

Rackner is a software consultancy that builds cloud-native solutions for startups, enterprises, and the public sector. We specialize in:

cloud-native development
DevSecOps
AI/ML systems
distributed architecture

Our approach is cloud-first, cost-effective, and outcome-driven, delivering scalable and resilient systems.

401(k) with 100% match up to 6%
Comprehensive Medical, Dental, Vision coverage
Life Insurance + Short & Long-Term Disability
Generous PTO
Weekly pay schedule
Home office & equipment support
Certification and training reimbursement

Apply

If you're an engineer who wants to move from building models to owning production systems, we'd like to connect: https://grnh.se/71n3dndw5us

MLOps, Machine Learning Operations, Kubernetes, Docker, Kubeflow, MLflow, Airflow, Argo Workflows, Python, AI/ML, Model Deployment, Model Serving, DevSecOps, Cloud, TS/SCI, Clearance
