AI Engineer - Reinforcement Learning

BLUE SERVICE

Paris, France

3 months ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Paris, France

Tech stack

Artificial Intelligence

Learning Management Systems

Python

Reinforcement Learning

PyTorch

Large Language Models

Machine Learning Operations

Data Pipelines

Automation Anywhere

Job description

The AI Studio's mission is to find the fastest possible path to an autonomous supply chain. We're developing AI agents, learning systems, training models, and more to overcome the biggest challenges remaining in the global supply chain. In short, we are having a lot of fun.

Your mission in this role

We're looking for an ambitious AI Engineer specialising in Reinforcement Learning to work on environments, evaluations, data pipelines, and tooling for robust training systems. You'll help shape how we approach reward modeling, environment design, and agent training. If you're energised by pushing the boundaries of what's possible, this is your chance.

Responsibilities:

Design and implement RL environments for supply chain decision-making
Develop reward functions that capture what "good" looks like for our agents
Create evaluation frameworks to measure agent performance and catch failure modes
Build data pipelines for training and human feedback collection
Document what works (and what doesn't) so we can compound our learnings
Stay on top of industry trends and cutting-edge use cases

Requirements

Have trained or fine-tuned LLMs
Are excited about AI-assisted tools and getting the most out of them
Build & customize your own AI workflows
Have experience working with AI agents and RL environments in production
Are proficient in Python and PyTorch
Can balance research exploration with shipping working code
Have hands-on experience with RL techniques (reward shaping, policy optimization, RLHF)
Thrive in fast-moving environments where priorities shift
Care about craft in your work
Are curious about why things work, not just that they work

Bonus points if:

You have experience with human-in-the-loop ML systems
You've built evaluation frameworks for open-ended tasks
You're familiar with supply chain, logistics, or operations domains
You have a side project that shows you can't stop tinkering
