Software Engineer, Applied ML - 2025 New
Job description
Join Brave's mission to revolutionize web browsing through AI. We're looking for an experienced ML Engineer to build next-generation features that serve nearly 100 million users worldwide. You'll work with state-of-the-art language models, collaborating across teams to ship innovative AI capabilities that make the browser smarter and more capable, all while maintaining our privacy-first principles.

- Evaluate, integrate, and deploy state-of-the-art language models for Leo and other browser AI capabilities, including both cloud-based and on-device deployment scenarios
- Design, optimize, and maintain ML inference pipelines for browser-integrated AI features, with focus on reducing deployment costs and improving model performance
- Develop and train custom ML models for browser-specific use cases such as content classification and search optimization using techniques like LoRA and DPO, including distributed training setups
- Generate synthetic data for training data augmentation and model evaluation
- Collaborate with browser engineering teams to seamlessly integrate AI capabilities into core product features while maintaining performance and privacy standards
- Collaborate with product and design teams to define, prototype, and ship new AI-powered features including text-to-speech, image generation, and enhanced tool calling capabilities
- Implement and optimize model serving infrastructure using frameworks like vLLM, ONNX Runtime, and Nvidia Triton to achieve production-scale performance requirements
- Collaborate with DevOps teams on MLOps infrastructure including model monitoring, load testing, caching optimization, and automated CI/CD pipelines for model deployments
- Contribute to privacy-preserving ML approaches and on-device model implementations that align with Brave's privacy-first mission
Requirements
- 2 to 5 years of experience optimizing and deploying ML models in production environments
- Strong software engineering background with production experience
- Extensive experience with PyTorch or other modern ML frameworks
- Experience training custom models from scratch
- Experience with model optimization and inference frameworks (e.g., vLLM, ONNX Runtime, Nvidia Triton)
- Familiarity with MLOps practices and Kubernetes, and the ability to collaborate with DevOps teams on model monitoring, load testing, and CI/CD pipelines
- Experience shipping ML-powered features or systems (consumer applications preferred)

- Master's degree in Computer Science, Machine Learning, or related field
- Familiarity with LLM serving frameworks (vLLM, TGI, Ray Serve) and GPU optimization
- Experience with embeddings, vector databases, semantic search implementations, model training workflows, and data pipeline development
- Experience integrating LLMs with tool calling/MCP
- Knowledge of privacy-preserving ML techniques and on-device model deployment
- Previous work on cost optimization and performance tuning of ML systems at scale