SME Machine Learning Ops Engineer
Role details
Job location
Tech stack
Job description
- Lead the integration and deployment of trained AI/ML models into production environments (e.g., cloud, edge devices) using MLOps best practices
- Develop and optimize model training & inference pipelines for real-time execution, and efficiently handle large-scale data processing
- Work with data science teams to structure automated ML model health monitoring and refresh capabilities
- Implement continuous integration, delivery and training (CI/CD/CT) workflows with commercial and open-source modeling platforms/services
- Coordinate with Data Science and Engineering teams to build scalable feature stores for optimal model training & execution workflows
- Research, evaluate and recommend new tools, applications and software packages for MLOps engineering that can be adopted and approved for use in the CBP environment
- Collaborate with cross-functional teams (e.g., Software Engineering, Data Science) to integrate and test multiple candidate AI/ML models and applications for operational assessment
Requirements
Adtech seeks a motivated, career- and customer-oriented SME Machine Learning Ops Engineer. This is currently a hybrid position with two days onsite in Ashburn, VA and three days remote.
In this role, you will collaborate within a cross-functional team to develop new Artificial Intelligence/Machine Learning (AI/ML) based solutions and integrate them into operational pipelines to deliver mission impact for U.S. Customs and Border Protection (CBP). The ideal candidate will have deep expertise and experience with predictive modeling lifecycles, hands-on experience with machine learning tools and frameworks, and a pragmatic, customer-centric approach to applying ML models to solve complex problems.
- HS Diploma/GED and 20+ years of experience, AS/AA and 18+ years, BS/BA and 12+ years, MS/MA/MBA and 9+ years, or PhD/Doctorate and 7+ years
- Expertise with MLOps tools and frameworks such as MLflow, Kubeflow and Airflow, and with implementing monitoring/drift detection capabilities (e.g., Alibi, Grafana)
- Experience with ML platforms, such as AWS SageMaker, Databricks or DataRobot
- Experience automating workflow orchestration to handle both batch and real-time streaming data processing for model inference
- Hands-on experience productionizing models, including optimizing for inference speed, containerization (e.g., Docker), and deployment on multi-cloud platforms (e.g., AWS, Azure, Google Cloud Platform)
- Proficiency in Python, Scala and Java with a strong understanding of high-performance computing and GPU acceleration
- Hands-on experience with Big Data tools (e.g., Spark, Hadoop, Kafka)
- Experience with MLOps principles and tools for automated model training, testing, deployment and monitoring
- Strong communication skills with the ability to collaborate effectively across Data Science, Data Engineering, and DevSecOps teams
- Experience with data engineering Extract, Transform and Load (ETL) workflows across various relational/non-relational databases (e.g., Oracle, Postgres, MongoDB) and cloud endpoint services (e.g., Lambda, GraphQL)
- Experience in using deep learning frameworks (PyTorch, TensorFlow, Keras) and computer vision libraries (OpenCV, SimpleITK, ITK, VTK)
- Experience with biometric or image recognition algorithms and associated predictive analytics pipelines
- Experience with GPU-based infrastructure and performance optimization