Software Engineer, Applied ML - 2025 New

Brave Software
Charing Cross, United Kingdom
23 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Remote
Charing Cross, United Kingdom

Tech stack

Training Data
Artificial Intelligence
Databases
DevOps
Web Browsers
Load Testing
Machine Learning
Performance Tuning
Software Engineering
PyTorch
Large Language Models
Kubernetes
Information Technology
Machine Learning Operations
Speech Synthesis
Data Pipelines

Job description

Join Brave's mission to revolutionize web browsing through AI. We're looking for an experienced ML Engineer to build next-generation features that serve nearly 100 million users worldwide. You'll work with state-of-the-art language models, collaborating across teams to ship innovative AI capabilities that make the browser smarter and more capable, all while maintaining our privacy-first principles.

  • Evaluate, integrate, and deploy state-of-the-art language models for Leo and other browser AI capabilities, including both cloud-based and on-device deployment scenarios
  • Design, optimize, and maintain ML inference pipelines for browser-integrated AI features, with focus on reducing deployment costs and improving model performance
  • Develop and train custom ML models for browser-specific use cases such as content classification and search optimization using techniques like LoRA and DPO, including distributed training setups
  • Generate synthetic data for training data augmentation and model evaluation
  • Collaborate with browser engineering teams to seamlessly integrate AI capabilities into core product features while maintaining performance and privacy standards
  • Collaborate with product and design teams to define, prototype, and ship new AI-powered features including text-to-speech, image generation, and enhanced tool calling capabilities
  • Implement and optimize model serving infrastructure using frameworks like vLLM, ONNX Runtime, and Nvidia Triton to achieve production-scale performance requirements
  • Collaborate with DevOps teams on MLOps infrastructure including model monitoring, load testing, caching optimization, and automated CI/CD pipelines for model deployments
  • Contribute to privacy-preserving ML approaches and on-device model implementations that align with Brave's privacy-first mission

Requirements

  • 2 to 5 years of experience optimizing and deploying ML models in production environments
  • Strong software engineering background with production experience
  • Extensive experience with PyTorch or other modern ML frameworks
  • Experience training custom models from scratch
  • Experience with model optimization and inference frameworks (e.g., vLLM, ONNX Runtime, Nvidia Triton)
  • Familiarity with MLOps practices and Kubernetes, and the ability to collaborate with DevOps teams on model monitoring, load testing, and CI/CD pipelines
  • Experience shipping ML-powered features or systems (consumer applications preferred)

  • Master's degree in Computer Science, Machine Learning, or related field
  • Familiarity with LLM serving frameworks (vLLM, TGI, Ray Serve) and GPU optimization
  • Experience with embeddings, vector databases, semantic search implementations, model training workflows, and data pipeline development
  • Experience integrating LLMs with tool calling/MCP
  • Knowledge of privacy-preserving ML techniques and on-device model deployment
  • Previous work on cost optimization and performance tuning of ML systems at scale
