Senior AI Deployment Engineer
Job description
As a Senior AI Deployment Engineer, you will work alongside a team of AI scientists, AI and xR application engineers, software engineers, and clinical experts to design, develop, and deploy computer vision, augmented reality, mixed reality, and multi-modal AI models and GenAI features into existing and new medical device products. This is a unique, high-visibility opportunity for talented individuals who want to dive deep into cutting-edge AI and GenAI optimization for cloud and edge deployments.
- Develop and deploy Artificial Intelligence (AI) powered software on cloud and edge devices (iPhone, iPad, Vision Pro, Android devices, NVIDIA devices).
- Optimize AI/ML models and pipelines for real-time inference on edge devices.
- Build containerized AI services (e.g. Docker) and orchestrate deployments.
- Deploy and maintain real-time microservices for AI applications including GenAI apps.
- Work with the MLOps and AI security platform team to support continuous integration, testing, and monitoring of AI models.
- Design deployment evaluation frameworks and develop unit tests for software components in compliance with regulatory requirements.
- Generate and review the necessary documents with project teams.
- Perform software verification and/or validation testing.
- Perform code reviews as an independent reviewer, following coding standards and best practices.
Requirements
- Bachelor's degree in Software Engineering, Computer Science, or a related discipline with 2+ years of relevant work experience, OR a Master's or PhD degree in a relevant discipline.
- 4+ years of Python and C++ development experience.
- 3+ years of experience developing and deploying AI/ML models into production environments.
- Proficiency in containerization tools (Docker, Docker Compose).
- Experience with Git-based CI/CD automation pipelines.
- Strong cloud deployment experience (Azure, GCP, or AWS).
- Proven ability to optimize AI models for real-time inference on edge devices to meet latency and performance requirements.
- Knowledge of model optimization techniques and tools (quantization, pruning, ONNX Runtime, TensorRT, OpenVINO, Core ML, etc.).
- Experience with voice, LLM, and generative AI (vision, multimodal, multi-agent) microservices.
- Experience with multi-agent AI frameworks for orchestrating complex workflows.
- Experience with Kubernetes for orchestrating AI microservices.
- Familiarity with monitoring and logging (Prometheus, Grafana, Azure Monitor, etc.).
- Experience with medical devices and product development standards in a regulated environment (ISO 13485, IEC 62304, ISO 14971).