Machine Learning Operations Engineer
Job description
We are looking for an excellent ML Ops Engineer to join our research and development team.
Key Responsibilities
This is an opportunity to join the ML Operations team, which supports the ML Development team in building leading-edge motion capture products by provisioning and maintaining a modern ML Operations stack. This stack covers data acquisition pipelines, data management and ML model training infrastructure (SW and on-prem HW). We use on-prem, self-managed systems and also leverage AWS infrastructure. You will have opportunities to guide the technical direction of the ML Ops team, suggest new areas of development and potentially lead your own project.
Requirements
You will have relevant academic (research Masters level) and/or industry experience.
Essential Skills
Excellent knowledge and experience of managing an on-premise Kubernetes cluster.
Excellent knowledge of Kubeflow and similar systems, e.g. MLflow.
Good programming ability in Python and familiarity with Linux systems, including scripting and system configuration.
Experience using AWS, e.g. Cognito, S3, EC2 and Lambda.
Experience with ML toolkits, e.g. PyTorch and Lightning, along with a solid understanding of how these fit into ML Ops pipelines and tools.
Ability to design and implement ML Ops solutions covering many different technologies.
Desirable Skills
Background in DevOps with exposure to CI systems, e.g. Jenkins.
Familiarity with infrastructure as code, e.g. Ansible.
Experience, aptitude, and a desire to work with human motion capture, sport, animation tools and techniques.
Familiarity with C++.