Senior/Staff Backend Engineer - Distributed Systems

VANHACK TECHNOLOGIES INC.
Palo Alto, United States of America
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
Palo Alto, United States of America

Tech stack

API
Artificial Intelligence
Program Optimization
Software Quality
Cursor (AI Code Editor)
Linux
Distributed Systems
GitHub
Python
Performance Tuning
Backend
Usage Tracking
Containerization
Kubernetes
GraphQL
Front End Software Development
API Design
Scheduling Algorithms
Docker
CRUD

Job description

Our hiring partner is looking for a Backend Engineer to build the systems that orchestrate GPU clusters for AI workloads. You'll develop APIs that manage GPU allocation, memory, compute scheduling, and multi-tenant isolation: challenges unique to AI infrastructure that go far beyond standard backend engineering. On their backend team, you'll tackle questions like: How can high-cost GPU resources be shared efficiently among users? How do we handle memory constraints for large AI models? How do we maintain quality of service when workloads compete for compute? This is an opportunity to build infrastructure where a single API call can allocate thousands of dollars of compute per hour, and where your optimizations directly influence whether AI startups can train their models affordably.

What you'll do

  • Design APIs that simplify complex GPU operations for developers
  • Build scheduling algorithms that maximize GPU utilization while meeting SLAs
  • Develop systems to manage the full GPU lifecycle: provisioning, allocation, scheduling, and release
  • Implement usage tracking and billing for GPU-hours, memory, and compute utilization
  • Create monitoring solutions for GPU-specific metrics, health checks, and automated recovery
  • Build multi-tenant systems with resource isolation, quota management, and fair scheduling
  • Optimize cold starts for model serving and efficient model loading
  • Collaborate with frontend engineers to expose complex infrastructure through intuitive interfaces
  • Leverage AI-assisted coding tools (GitHub Copilot, Claude Code, Cursor IDE, etc.) to enhance productivity and code quality
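To give a flavor of the scheduling and multi-tenancy work described above, here is a minimal illustrative sketch of quota-aware, fair-share GPU allocation. All names (`Tenant`, `Request`, `schedule`) are hypothetical and invented for this example; they do not describe the hiring partner's actual system.

```python
from dataclasses import dataclass

@dataclass
class Tenant:
    name: str
    quota_gpus: int                # max GPUs this tenant may hold at once
    allocated: int = 0             # GPUs currently held
    gpu_seconds_used: float = 0.0  # historical usage, drives fairness

@dataclass
class Request:
    tenant: str
    gpus: int

def schedule(free_gpus, tenants, requests):
    """Grant pending requests, serving tenants with the least historical
    usage first (fair share) while enforcing per-tenant quotas."""
    granted = []
    # Light users are served before heavy ones.
    pending = sorted(requests,
                     key=lambda r: tenants[r.tenant].gpu_seconds_used)
    for req in pending:
        t = tenants[req.tenant]
        # Grant only if capacity remains and the quota is respected.
        if req.gpus <= free_gpus and t.allocated + req.gpus <= t.quota_gpus:
            t.allocated += req.gpus
            free_gpus -= req.gpus
            granted.append(req)
    return granted, free_gpus
```

A production scheduler would also handle preemption, SLAs, bin-packing across nodes, and release of GPUs back to the pool, but the core tension (utilization vs. fairness vs. quotas) is visible even at this scale.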

Requirements

  • Have 5+ years of backend engineering experience in distributed systems
  • Are proficient in Go, Python, or similar backend languages
  • Have experience with resource scheduling, orchestration, and API design (REST, GraphQL, gRPC)
  • Understand hardware constraints and system optimization
  • Have Linux systems and containerization experience (Docker, Kubernetes)
  • Are comfortable working with expensive resources where efficiency impacts costs
  • Are excited to solve novel problems in AI infrastructure (beyond CRUD apps)
  • Bring a startup mindset: comfortable with ambiguity and rapid iteration

Bonus qualifications

  • Experience with GPU or HPC cluster management
  • Familiarity with ML/AI workload patterns and requirements
  • Experience with high-value resource allocation systems
  • Background in performance optimization for compute-intensive workloads
  • Knowledge of GPU virtualization and sharing technologies
  • Experience building billing or metering systems
  • Hybrid role: 3 days in office, 2 days WFH; must be located in Palo Alto
  • Applicants must be authorized to work in the United States without visa sponsorship

Benefits & conditions

  • They offer competitive salary and equity based on experience and skillset

About the company

Our hiring partner is on a mission to make AI compute ubiquitous, seamless, and limitless. They're building a cloud where AI just works: anywhere, anytime. "AI Power. Everywhere." Join the team designing the infrastructure for an AI-first world.

Apply for this position