Applied Scientist - LLM, Alexa Conversational Modelling Intelligence

Amazon.com, Inc.

Berlin, Germany

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Berlin, Germany

Tech stack

Java

JavaScript

Artificial Intelligence

Amazon Web Services (AWS)

C Sharp (Programming Language)

Data Files

Distributed Computing Environment

Perl

Python

Machine Learning

Ruby

Software Engineering

Reinforcement Learning

Data Processing

Large Language Models

Information Technology

Programming Languages

Job description

As an Applied Scientist II in the Alexa Conversational Modelling Intelligence team within Alexa AI, you will drive model post-training for Large Language Models that power Alexa+. You'll adopt and adapt state-of-the-art techniques - including supervised fine-tuning, reinforcement learning, preference optimization, and knowledge distillation - running rigorous experiments and translating findings into production-ready solutions that directly improve the customer experience for millions of users worldwide.

You will own the full model development cycle from data curation through training, evaluation, and deployment. Your day-to-day will involve developing evaluation methods and metrics, diagnosing model defects, optimizing model training pipelines, and iterating on recipes to move concrete quality and efficiency benchmarks. You'll write clean, reproducible code, contribute to shared tooling, and collaborate closely with scientists and engineers to bring models from experimentation to scale.

You are technically curious, experiment-driven, and motivated by real customer impact. You are an expert in LLM post-training. You will also advance the state of the art by publishing at top-tier NLP/ML conferences (ACL, EMNLP, NeurIPS, ICML, ICLR) - contributing to the broader research community while grounding your work in measurable outcomes.

Key job responsibilities

Own the full model development cycle - from data curation through training, evaluation, and deployment.
Develop and apply post-training techniques: supervised fine-tuning, reinforcement learning, preference optimization, and knowledge distillation.
Build evaluation methods and metrics, and diagnose model defects to target the highest-impact improvements.
Optimize model training pipelines and iterate on recipes to move concrete quality and efficiency benchmarks.
Write high-quality documentation on methods and experiment outcomes, and communicate findings clearly to stakeholders.

A day in the life

Post-training is one of the most active frontiers in LLMs right now. The field has moved from scaling pretraining to getting more out of models afterward through RL, reasoning recipes, and preference optimization. You'll work on these techniques directly, on a product used by millions of customers every day. A typical day: review overnight training runs and dashboards, dig into model defects to form hypotheses, then curate data and iterate on a recipe, improving shared tooling along the way. You'll sync with scientists and engineers to unblock the path to production, and write up your findings for stakeholders. It's fast-moving - a good idea can reach millions of customers within weeks.

About the team

The Alexa Conversational Modelling Intelligence team builds industry-leading LLM-based conversational technologies that customers love. Our mission is to push the envelope in LLMs for Alexa to deliver the best-possible customer experience. As an Applied Scientist, you'll contribute directly to that mission through model development and experimentation.

Requirements

PhD in computer science, machine learning, engineering, or related fields

Knowledge of at least one programming language such as Java, C#, JavaScript, Python, Ruby or Perl
Experience in designing experiments and statistical analysis of results
Hands-on experience building, training, and evaluating LLMs.

Preferred Qualifications

Publications at top-tier conferences, such as CVPR, ICCV, ECCV or NeurIPS
Experience working with large, complex data sets
Experience working effectively with science, data processing, and software engineering teams
Strong written and verbal communication skills for working with technical and non-technical audiences, including senior leadership
Experience building and deploying LLM solutions in production or at scale.
Hands-on experience with Large Language Model training and fine-tuning via pre-training, SFT, and/or RLHF/preference optimization.
Experience with LLM evaluation - building benchmarks, LLM-as-a-judge, or defect/quality analysis.
Familiarity with modern training/inference infrastructure (e.g., distributed training, RL frameworks, model serving).

About the company

Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify and build. Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice (https://www.amazon.jobs/en/privacy_page) to know more about how we collect, use and transfer the personal data of our candidates.
