Software Engineer, Applied ML - 2025 New

Brave Software

Charing Cross, United Kingdom

23 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

Remote

Charing Cross, United Kingdom

Tech stack

Training Data

Artificial Intelligence

Databases

DevOps

Web Browsers

Load Testing

Machine Learning

Performance Tuning

Software Engineering

PyTorch

Large Language Models

Kubernetes

Information Technology

Machine Learning Operations

Speech Synthesis

Data Pipelines

Job description

Join Brave's mission to revolutionize web browsing through AI. We're looking for an experienced ML Engineer to build next-generation features that serve nearly 100 million users worldwide. You'll work with state-of-the-art language models, collaborating across teams to ship innovative AI capabilities that make the browser smarter and more capable, all while maintaining our privacy-first principles.

Evaluate, integrate, and deploy state-of-the-art language models for Leo and other browser AI capabilities, including both cloud-based and on-device deployment scenarios
Design, optimize, and maintain ML inference pipelines for browser-integrated AI features, with focus on reducing deployment costs and improving model performance
Develop and train custom ML models for browser-specific use cases such as content classification and search optimization using techniques like LoRA and DPO, including distributed training setups
Generate synthetic data for training data augmentation and model evaluation
Collaborate with browser engineering teams to seamlessly integrate AI capabilities into core product features while maintaining performance and privacy standards
Collaborate with product and design teams to define, prototype, and ship new AI-powered features including text-to-speech, image generation, and enhanced tool calling capabilities
Implement and optimize model serving infrastructure using frameworks like vLLM, ONNX Runtime, and Nvidia Triton to achieve production-scale performance requirements
Collaborate with DevOps teams on MLOps infrastructure including model monitoring, load testing, caching optimization, and automated CI/CD pipelines for model deployments
Contribute to privacy-preserving ML approaches and on-device model implementations that align with Brave's privacy-first mission

Requirements

2 to 5 years of experience optimizing and deploying ML models in production environments
Strong software engineering background with production experience
Extensive experience with PyTorch or other modern ML frameworks
Experience training custom models from scratch
Experience with model optimization and inference frameworks (e.g., vLLM, ONNX Runtime, Nvidia Triton)
Familiarity with MLOps practices and Kubernetes, and the ability to collaborate with DevOps teams on model monitoring, load testing, and CI/CD pipelines
Experience shipping ML-powered features or systems (consumer applications preferred)

Master's degree in Computer Science, Machine Learning, or related field
Familiarity with LLM serving frameworks (vLLM, TGI, Ray Serve) and GPU optimization
Experience with embeddings, vector databases, semantic search implementations, model training workflows, and data pipeline development
Experience integrating LLMs with tool calling/MCP
Knowledge of privacy-preserving ML techniques and on-device model deployment
Previous work on cost optimization and performance tuning of ML systems at scale