Machine Learning Software Engineer
Job description
We are looking for a Senior AI Infrastructure/Machine Learning Operations Engineer to design, implement, and maintain the infrastructure that supports the training and evaluation of our AI-based real-time speech enhancement systems, as well as the software abstractions for efficient embedded deployment. Your work will enable our AI engineers to focus on research by providing robust tools, scalable pipelines, and seamless integration with cloud platforms. You will collaborate closely with experts in machine learning, embedded systems, and software engineering to ensure efficient training, evaluation, and deployment of AI models for our hearing glasses.
- Design and maintain scalable infrastructure for training, evaluating, and deploying AI models.
- Develop modular and reusable code abstractions to streamline experimentation and support rapid prototyping of new ideas.
- Review contributions from AI and embedded AI engineers to ensure code quality, adherence to coding standards, and maintainability of the codebase.
- Build and optimize pipelines for data preprocessing, model training, hyperparameter tuning, and model evaluation.
- Implement tools for experiment tracking, model and data versioning, and performance benchmarking using local and cloud resources.
- Implement cloud-based solutions for storage, compute resources, and distributed training (e.g., Azure, OVHCloud).
- Set up MLOps practices, including CI/CD pipelines for automated model testing, deployment, and monitoring, some of which run on custom local runners with embedded devices.
- Collaborate with AI engineers and embedded AI engineers to deploy pre-trained models into production environments.
- Monitor infrastructure performance and implement improvements to ensure reliability and scalability.
- Stay up to date with advancements in machine learning infrastructure tools and technologies.
Requirements
- BS or MS in Computer Science, Engineering, or a related field, or equivalent experience.
- 8+ years of engineering experience, including 4+ years building scalable machine learning infrastructure and MLOps pipelines.
- Proficiency in Bash scripting and strong experience working with Linux environments.
- Proficiency in Python (for ML workflows) and familiarity with scripting tools for automation.
- Hands-on experience with cloud platforms like Azure, AWS, or OVHCloud, and experience with Docker for cloud deployments.
- Familiarity with experiment configuration and tracking tools such as Hydra, TensorBoard, and Weights & Biases, and with model and data version control tools such as DVC and/or MLflow.
- Strong understanding of PyTorch and PyTorch Lightning for model training and optimization.
- Experience with version control (Git), code reviews, and CI/CD pipelines.
- Experience optimizing training workflows for compact models (e.g., pruning, quantization) is a bonus.
- Background in audio-related machine learning tasks and/or signal processing is a bonus.
- Familiarity with embedded systems and their constraints is a bonus.