Lead Software Engineer - Python
Job description
Are you passionate about building innovative technology that powers AI and machine learning across a global organization? As part of our team, you'll help shape the future of model deployment at scale, collaborating with talented engineers and data scientists. You'll have the opportunity to work on impactful projects, grow your skills, and contribute to a platform that drives real business outcomes. We value creativity, collaboration, and a commitment to excellence.
As a Software Engineer in the Firmwide AI/ML Deployment Platform team, you will design and develop cloud-native solutions that support model deployment across the organization. You will work closely with data scientists and engineers to deliver features that streamline production workflows. Your contributions will help scale our platform to meet the needs of diverse internal clients, ensuring reliability and innovation. You will be part of a collaborative environment focused on continuous improvement and technical excellence.
- Build and deploy infrastructure solutions for seamless integration of control plane and client accounts
- Develop and implement APIs for platform functionalities such as automated retraining, scheduling, endpoint deployments, and autoscaling
- Design robust features to support a growing internal customer base, including multi-region and disaster recovery capabilities
- Architect and implement model monitoring solutions, with emphasis on LLM monitoring and automated issue correction
- Engage with clients to identify strategic solutions and provide deployment and debugging support
- Assist in implementing platform capabilities aligned with product requirements
- Deploy infrastructure and develop managed environments for platform operations
Requirements
- Knowledge of AWS services and cloud-based infrastructure
- Experience building resilient software platforms
- Proficiency in architecting software solutions at scale
- Ability to design solutions with strategic insight
- Proficiency in Python
Preferred Qualifications, Capabilities, and Skills
- Familiarity with monitoring tools, especially for AI/ML model monitoring
- Proficiency in Golang
- Experience with AWS SageMaker for model training and deployment
- Familiarity with Kubernetes and managing deployments to EKS
- Knowledge of networking concepts such as Virtual Private Clouds and DNS
- Experience working with LLMs
- Experience with Terraform or other Infrastructure as Code tools
- Experience in API development and design