Senior/Staff Backend Engineer - Distributed Systems
Job description
Our hiring partner is looking for a Backend Engineer to build the systems that orchestrate GPU clusters for AI workloads. You'll develop APIs that manage GPU allocation, memory, compute scheduling, and multi-tenant isolation: challenges unique to AI infrastructure that go far beyond standard backend engineering. On their backend team, you'll tackle questions like: How can high-cost GPU resources be shared efficiently among users? How do we handle memory constraints for large AI models? How do we maintain quality of service when workloads compete for compute? This is an opportunity to build infrastructure where a single API call can allocate thousands of dollars of compute per hour, and where your optimizations directly influence whether AI startups can train their models affordably.
What you'll do
- Design APIs that simplify complex GPU operations for developers
- Build scheduling algorithms that maximize GPU utilization while meeting SLAs
- Develop systems to manage the full GPU lifecycle: provisioning, allocation, scheduling, and release
- Implement usage tracking and billing for GPU-hours, memory, and compute utilization
- Create monitoring solutions for GPU-specific metrics, health checks, and automated recovery
- Build multi-tenant systems with resource isolation, quota management, and fair scheduling
- Optimize cold starts for model serving and efficient model loading
- Collaborate with frontend engineers to expose complex infrastructure through intuitive interfaces
- Leverage AI-assisted coding tools (GitHub Copilot, Claude Code, Cursor IDE, etc.) to enhance productivity and code quality
Requirements
- Have 5+ years of backend engineering experience in distributed systems
- Are proficient in Go, Python, or similar backend languages
- Have experience with resource scheduling, orchestration, and API design (REST, GraphQL, gRPC)
- Understand hardware constraints and system optimization
- Have Linux systems and containerization experience (Docker, Kubernetes)
- Are comfortable working with expensive resources where efficiency impacts costs
- Are excited to solve novel problems in AI infrastructure (beyond CRUD apps)
- Bring a startup mindset: comfortable with ambiguity and rapid iteration
Bonus qualifications
- Experience with GPU or HPC cluster management
- Familiarity with ML/AI workload patterns and requirements
- Experience with high-value resource allocation systems
- Background in performance optimization for compute-intensive workloads
- Knowledge of GPU virtualization and sharing technologies
- Experience building billing or metering systems
- Hybrid role: 3 days in office, 2 days WFH; must be located in Palo Alto
- Applicants must be authorized to work in the United States without visa sponsorship
Benefits & conditions
- They offer a competitive salary and equity based on experience and skill set