AI Agent Infrastructure Engineers
Mercor
3 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Compensation: £72K
Job location:
Tech stack
API
Artificial Intelligence
Amazon Web Services (AWS)
Azure
C++
Cloud Computing
Distributed Systems
Fault Tolerance
Python
Software Engineering
Data Streaming
Reinforcement Learning
Large Language Models
Multi-Agent Systems
Caching
Reliability of Systems
Containerization
Kubernetes
Information Technology
Docker
Network Optimization
Microservices
Job description
- Design, build, and optimize infrastructure for training, deploying, and scaling AI agents across distributed systems.
- Develop robust backend services, APIs, and orchestration frameworks that support multi-agent workflows and high-performance compute environments.
- Collaborate closely with research and product teams to integrate model-serving pipelines, memory systems, and reasoning components.
- Implement monitoring, observability, and failover mechanisms to ensure high system reliability and fault tolerance.
- Evaluate and refine infrastructure performance, identifying bottlenecks and improving efficiency across data, compute, and model layers.
- Participate in synchronous collaboration sessions (4-hour windows, 2-3 times per week) to review architecture decisions, troubleshoot distributed systems, and iterate on design improvements.
Requirements
- Strong background in Computer Science, Software Engineering, or Systems Design, with a focus on large-scale distributed infrastructure.
- Experience with cloud computing (AWS, GCP, or Azure) and containerization/orchestration tools such as Docker and Kubernetes.
- Proficiency in backend programming languages such as Go, Rust, Python, or C++.
- Familiarity with LLM inference pipelines, multi-agent architectures, or reinforcement learning environments is a strong plus.
- Knowledge of network optimization, data streaming, and caching architectures is preferred.
- Excellent collaboration and communication skills.
- Ability to commit 20-30 hours per week, including required synchronous collaboration sessions.
About the company
Why Join
* Work directly with a world-class AI research lab building the infrastructure behind tomorrow's intelligent agent ecosystems.
* Influence the foundations of AI scalability, reliability, and deployment, enabling complex agents to operate in real-world environments.
* Enjoy schedule flexibility: choose your own 4-hour collaboration windows and manage your 20-30 hour work week.
* Be engaged as an hourly contractor through Mercor, giving you autonomy while contributing to mission-critical AI infrastructure projects.
* Collaborate with top systems engineers, researchers, and AI developers working at the intersection of distributed systems and advanced intelligence.
* Join a global network of technical experts shaping how the next generation of AI agents reason, interact, and evolve at scale.