Software Engineer, Machine Learning Services (MLS)

UiPath
Kilsby, United Kingdom
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
Kilsby, United Kingdom

Tech stack

Abstraction Layers
Amazon Web Services (AWS)
Azure
C++
Cloud Storage
Nvidia CUDA
Serialization
Data Structures
Distributed Systems
Python
Machine Learning
Message Broker
Performance Tuning
Queueing Systems
Multithreading
Concurrency
GPU Programming
Containerization
Kubernetes
Information Technology
Machine Learning Operations
Asynchronous Programming
API Gateway
Docker

Job description

  • Design, build, and operate the core MLS platform. This includes our Rust-based API gateway, Python ML compute workers, and the distributed job queue that orchestrates it all.
  • Solve hard concurrency, performance, and distributed systems problems to ensure our platform is bulletproof for high-volume production workloads.
  • Work directly with product and ML science teams to understand their needs and build the scalable infrastructure required to bring their models to life, from massive GenAI models to fine-tuned, specialized classifiers.
  • Develop our custom-built, content-addressable storage abstraction layer over cloud object stores (GCS, S3, Azure Blob), complete with its own garbage collection and sharding logic.
  • Enhance our asynchronous job-queueing system, built from the ground up on the storage layer using compare-and-swap primitives for atomicity. No off-the-shelf message broker could handle our specific needs.
  • Dive deep into the entire stack, from Kubernetes and container orchestration, through gRPC-based service communication, to the performance tuning of ONNX-based inference on GPU-accelerated hardware.
  • Write clean, efficient, and rigorously tested code. We value simplicity, correctness, and peer review.

Requirements

  • A solid track record (5+ years) of engineering and architecting large-scale, distributed commercial services. Your experience speaks for itself.
  • Deep proficiency in a systems-level language (Rust, C++, Go). Willingness and curiosity to become an expert in Rust are essential, as it's the foundation of our core services. Strong Python skills are also critical.
  • Real-world experience with cloud ecosystems (Azure, AWS, or GCP) and containerization (Docker, Kubernetes). You should understand how production systems are deployed, monitored, and scaled.
  • A firm grasp of concurrency, multithreading, and asynchronous programming. You know the difference between a mutex and a channel, and you know when (and when not) to use them.
  • A pragmatic understanding of computer science fundamentals. We care more about your ability to solve real-world problems with data structures and algorithms than your ability to recite them from a textbook.
  • An opinion on what makes good code and good architecture, and the ability to articulate it. You should be comfortable challenging assumptions (including our own) and contributing to a culture of continuous improvement.
  • You're a builder and a problem-solver at heart.

Nice to Haves (but we can teach you):

  • You've already worked with Rust in a production environment.
  • Experience with MLOps, particularly the challenges of managing the lifecycle of models in a multi-tenant, high-availability system.
  • Familiarity with building ML inference services, model serialization (e.g., ONNX), and GPU programming (CUDA).
  • You've built or worked on custom storage or job-queueing systems before and have the scars to prove it.

About the company

Shaping the future of agentic automation

As the world moves into an agentic future, the UiPath Platform™ enables AI agents, robots, people, and models to work together harmoniously to revolutionize industries and enhance human potential.
