Staff Software Engineer, Gemini App, Horizontal Quality, DeepMind

Google LLC

Mountain View, United States of America

7 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Compensation

Up to $300K (base salary range $207,000-$300,000)

Job location

Mountain View, United States of America

Tech stack

Bioinformatics

Python

Machine Learning

Software Engineering

SQL Databases

Large Language Models

Model Validation

Generative AI

Information Technology

Machine Learning Operations

Data Pipelines

Job description

Design, build, and maintain a highly scalable evaluation framework specifically tailored to measure the product-level quality of the Gemini App, moving beyond standard model-level benchmarks.
Create, synthesize, and refine meaningful metrics that accurately capture the user experience.
Develop a holistic view of quality by combining newly engineered online metrics with deep offline evaluation data.
Build a transparent, definitive ranking system for product-level quality. Use this system to benchmark the Gemini App against industry standards and clearly identify our competitive strengths and weaknesses.
Act as a critical partner to Product Management and leadership by translating complex evaluation data into clear, strategic signals. Offer technical leadership on high-impact projects.

Information collected and processed as part of your Google Careers profile, and any job applications you choose to submit, is subject to Google's Applicant and Candidate Privacy Policy (./privacy-policy).

Google is proud to be an equal opportunity and affirmative action employer. We are committed to building a workforce that is representative of the users we serve, creating a culture of belonging, and providing an equal employment opportunity regardless of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition (including breastfeeding), expecting or parents-to-be, criminal histories consistent with legal requirements, or any other basis protected by law. See also Google's EEO Policy (https://www.google.com/about/careers/applications/eeo/), Know your rights: workplace discrimination is illegal (https://careers.google.com/jobs/dist/legal/EEOC_KnowYourRights_10_20.pdf), Belonging at Google (https://about.google/belonging/), and How we hire (https://careers.google.com/how-we-hire/).

If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form (https://goo.gl/forms/aBt6Pu71i1kzpLHe2).

Google is a global company and, in order to facilitate efficient collaboration and communication globally, English proficiency is a requirement for all roles unless stated otherwise in the job posting.

To all recruitment agencies: Google does not accept agency resumes. Please do not forward resumes to our jobs alias, Google employees, or any other organization location. Google is not responsible for any fees related to unsolicited resumes.

Requirements

Bachelor's degree or equivalent practical experience.
8 years of experience in software development.
5 years of experience leading technical strategy and architecting large-scale ML infrastructure (e.g., designing serving layers, model evaluation frameworks, or data processing pipelines).
5 years of experience testing and launching software products.
3 years of experience with Generative AI, Large Language Models (LLMs), Machine Learning, and related frameworks.
Master's degree or PhD in Engineering, Computer Science, or a related technical field.
3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.
Experience in building and scaling evaluation pipelines (e.g., RLHF, auto-evals, or side-by-side human evaluations) to measure helpfulness and accuracy.
Proficiency in advanced prompting techniques and understanding how model fine-tuning, RL, or RAG impacts final response quality.
Ability to use SQL, Python, or internal data tools to analyze user behavioral data and "pain points" to identify where the model is failing.

About the company

Our mission is to elevate the Gemini experience by perfectly aligning foundational model behaviors with high-quality data. We drive conversational excellence through thoughtful persona shaping, robust safety enforcement, and clear information architecture. By combining these efforts with rich online and offline signals, we deliver a product that is highly performant and effortlessly intuitive.

Artificial intelligence will be one of humanity's most transformative inventions. At Google DeepMind, we are a pioneering AI lab with exceptional interdisciplinary teams focused on advancing AI development to solve complex global challenges and accelerate high-quality product innovation for billions of users. We use our technologies for widespread public benefit and scientific discovery, ensuring safety and ethics are always our highest priority. We are pushing the boundaries across multiple domains. Our global teams offer different learning opportunities and varied career pathways for those driven to achieve exceptional results through collective effort.

The US base salary range for this full-time position is $207,000-$300,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google (https://careers.google.com/benefits/).