AI Agent Infrastructure Engineers

Mercor
3 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Compensation: £72K

Job location

Tech stack

API
Artificial Intelligence
Amazon Web Services (AWS)
Azure
C++
Cloud Computing
Distributed Systems
Fault Tolerance
Python
Software Engineering
Data Streaming
Reinforcement Learning
Large Language Models
Multi-Agent Systems
Caching
Reliability of Systems
Containerization
Kubernetes
Information Technology
Docker
Network Optimization
Microservices

Job description

  • Design, build, and optimize infrastructure for training, deploying, and scaling AI agents across distributed systems.
  • Develop robust backend services, APIs, and orchestration frameworks that support multi-agent workflows and high-performance compute environments.
  • Collaborate closely with research and product teams to integrate model-serving pipelines, memory systems, and reasoning components.
  • Implement monitoring, observability, and failover mechanisms to ensure high system reliability and fault tolerance.
  • Evaluate and refine infrastructure performance, identifying bottlenecks and improving efficiency across data, compute, and model layers.
  • Participate in synchronous collaboration sessions (4-hour windows, 2-3 times per week) to review architecture decisions, troubleshoot distributed systems, and iterate on design improvements.

Requirements

  • Strong background in Computer Science, Software Engineering, or Systems Design, with a focus on large-scale distributed infrastructure.
  • Experience with cloud computing (AWS, GCP, or Azure) and containerization/orchestration tools such as Docker and Kubernetes.
  • Proficiency in backend programming languages such as Go, Rust, Python, or C++.
  • Familiarity with LLM inference pipelines, multi-agent architectures, or reinforcement learning environments is a strong plus.
  • Knowledge of network optimization, data streaming, and caching architectures preferred.
  • Excellent collaboration and communication skills.
  • Ability to commit 20-30 hours per week, including required synchronous collaboration sessions.

About the company

Why Join

  • Work directly with a world-class AI research lab building the infrastructure behind tomorrow's intelligent agent ecosystems.
  • Influence the foundations of AI scalability, reliability, and deployment, enabling complex agents to operate in real-world environments.
  • Enjoy schedule flexibility: select your own 4-hour collaboration windows and manage your 20-30 hour work week.
  • Be engaged as an hourly contractor through Mercor, giving you autonomy while contributing to mission-critical AI infrastructure projects.
  • Collaborate with top systems engineers, researchers, and AI developers working at the intersection of distributed systems and advanced intelligence.
  • Join a global network of technical experts shaping how the next generation of AI agents reason, interact, and evolve at scale.
