Deep Reinforcement Learning Engineer (Principal)
Friday Systems
Municipality of Madrid, Spain
2 days ago
Role details
Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Job location
Municipality of Madrid, Spain
Tech stack
Artificial Intelligence
Algorithm Design
Amazon Web Services (AWS)
Cloud Computing
Profiling
Software Quality
Linux
Python
Reinforcement Learning
PyTorch
Deep Learning
FastAPI
Docker
Job description
Friday Systems builds AI that allows industrial robots to adapt to dynamic warehouse environments. We focus on high-throughput palletizing and related tasks where classical approaches break down. Our stack is built around Deep Reinforcement Learning with modern sequence models.
Own the DRL stack end-to-end: formulation, algorithm design, large-scale training, evaluation, and deployment. You'll work directly with the CTO to turn cutting-edge DRL into production throughput at customer sites.
YOU WILL
- Design & ship DRL algorithms (PPO / SAC / DDQN and variants, based on encoders / cross-attention / pointer networks) for complex control & combinatorial optimization.
- Tackle stability & sample-efficiency: GAE, normalization, entropy / KL control, distributional / value-loss tuning, curriculum learning and reward shaping, …
- Launch multi-GPU training, parallel rollouts, efficient replay / storage, and reproducible experiment tooling.
- Productionize: clean PyTorch code, profiling, Dockerized services (FastAPI), AWS deployments, experiment tracking, dashboards.
- Collaborate with the C-level team to ensure product excellence and alignment with business strategy. Forge strong relationships with clients, translating their needs into technology solutions tailored to them.
- Build and nurture a high-performing team by attracting top talent. Provide mentorship and leadership to foster a culture of quality and innovation.
Requirements
- Track record shipping RL beyond academic demos: you've led at least one end-to-end RL system from idea to production or a state-of-the-art benchmark in the last 3-5 years.
- Extensive Deep Learning, Reinforcement Learning & PyTorch expertise: you can implement several DRL algorithms from scratch, diagnose the root cause of performance drops, and make informed decisions about next steps.
- Systems know-how: Python, Linux, Docker, multi-GPU, cloud (AWS).
- Math maturity: MDPs / Bellman operators, policy gradients, trust-region / KL, GAE / λ-returns, stability / regularization in on-policy vs off-policy regimes.
- Ownership: you're comfortable being the primary owner for experiments, code quality, and results in a small team.
- Location / time zone: EU-based (CET±2) and able to travel occasionally to customer warehouses.