Max Hausner & Yves Fauser

gRPC Load Balancing Deep Dive

Is your gRPC load balancer creating server hotspots? Learn why long-lived connections undermine autoscaling and how a simple Kubernetes configuration can solve it.

#1 about 4 minutes

An overview of gRPC fundamentals and its trade-offs

gRPC is a high-performance RPC framework that uses Protobuf for compact serialization, but its browser support and tooling are less mature than REST's.

#2 about 4 minutes

How gRPC streaming and HTTP/2 affect load balancing

gRPC supports various streaming patterns over persistent HTTP/2 connections, which can cause traffic hotspots with traditional Layer 4 load balancing.
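
The hotspot effect can be illustrated with a small simulation. This is a minimal sketch, not taken from the talk: the backend names, client names, and request counts are hypothetical, and both balancers are modeled as plain round robin. The only difference is whether the choice is made once per connection (Layer 4) or once per request (Layer 7).

```python
from itertools import cycle

# Hypothetical backends and clients (illustrative values, not from the talk).
BACKENDS = ["pod-a", "pod-b", "pod-c"]
# Each client holds ONE persistent HTTP/2 connection and sends this many requests.
CLIENT_REQUESTS = {"client-1": 1000, "client-2": 10, "client-3": 10, "client-4": 10}


def l4_distribution(clients, backends):
    """Pick a backend once per connection; all requests on it follow."""
    picker = cycle(backends)
    load = {b: 0 for b in backends}
    for n_requests in clients.values():
        backend = next(picker)          # decision made at connect time only
        load[backend] += n_requests     # every request reuses that connection
    return load


def l7_distribution(clients, backends):
    """Pick a backend for every individual request."""
    picker = cycle(backends)
    load = {b: 0 for b in backends}
    for n_requests in clients.values():
        for _ in range(n_requests):
            load[next(picker)] += 1     # decision made per request
    return load


if __name__ == "__main__":
    print("L4:", l4_distribution(CLIENT_REQUESTS, BACKENDS))
    print("L7:", l7_distribution(CLIENT_REQUESTS, BACKENDS))
```

With these assumed numbers, the connection-level balancer sends the heavy client's entire traffic to a single backend, while the request-level balancer spreads the same traffic almost evenly.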

#3 about 3 minutes

Client-side versus infrastructure-based load balancing strategies

Choose client-side load balancing for low-latency internal services and infrastructure-based load balancing for external APIs that require a clear demarcation point.

#4 about 7 minutes

Exploring different types of load balancing algorithms

A review of basic, load-based, and hash-based algorithms reveals that options like "least outstanding requests" can outperform simple round robin for uneven loads.
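
To make the comparison concrete, here is a small sketch that contrasts round robin with "least outstanding requests" under an uneven workload. The workload is an assumption chosen for illustration (one request arrives per time unit, alternating between slow and fast calls), and `peak_outstanding` is a hypothetical helper, not an API of any load balancer.

```python
def peak_outstanding(durations, n_backends, policy):
    """Return the highest number of in-flight requests seen on any backend.

    One request arrives per time unit; durations[t] is how long it runs.
    """
    in_flight = [[] for _ in range(n_backends)]  # finish times per backend
    peak = 0
    for t, duration in enumerate(durations):
        # Drop requests that have finished by time t.
        in_flight = [[f for f in reqs if f > t] for reqs in in_flight]
        if policy == "round_robin":
            target = t % n_backends
        elif policy == "least_outstanding":
            # Fewest in-flight requests wins; ties go to the lowest index.
            target = min(range(n_backends), key=lambda b: len(in_flight[b]))
        else:
            raise ValueError(f"unknown policy: {policy}")
        in_flight[target].append(t + duration)
        peak = max(peak, len(in_flight[target]))
    return peak


# Assumed workload: slow (10 time units) and fast (1 time unit) calls alternate.
DURATIONS = [10 if i % 2 == 0 else 1 for i in range(12)]

if __name__ == "__main__":
    for policy in ("round_robin", "least_outstanding"):
        print(policy, peak_outstanding(DURATIONS, 2, policy))
```

In this toy setup round robin happens to route every slow call to the same backend, so its queue grows, whereas the load-aware policy keeps the peak queue depth lower. Real traffic is less regular, so the size of the gap will vary.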

#5 about 2 minutes

Why autoscaling gRPC services can be challenging

Long-lived streaming connections can prevent traffic from being distributed to newly scaled instances, making traditional CPU-based autoscaling ineffective.
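
The scale-out problem can be sketched in a few lines. This is an illustrative model under stated assumptions: pod names and stream counts are hypothetical, and `assign_new_connections` is a made-up helper that places each new connection on the pod with the fewest open streams. Existing streams are never moved, which mirrors how a long-lived connection stays on the instance it was opened against.

```python
def assign_new_connections(pods, streams_per_pod, n_new):
    """Place each NEW connection on the pod with the fewest open streams.

    Existing streams are left where they are; nothing is rebalanced.
    """
    for _ in range(n_new):
        target = min(pods, key=lambda p: streams_per_pod[p])
        streams_per_pod[target] += 1
    return streams_per_pod


# Two pods take eight long-lived streams (hypothetical numbers).
pods = ["pod-a", "pod-b"]
streams = {p: 0 for p in pods}
assign_new_connections(pods, streams, 8)

# The autoscaler adds two pods, but the open streams stay where they are.
pods += ["pod-c", "pod-d"]
streams.update({"pod-c": 0, "pod-d": 0})
after_scale_out = dict(streams)

# Only connections opened after the scale-out can reach the new pods.
after_new_clients = dict(assign_new_connections(pods, streams, 2))

if __name__ == "__main__":
    print("after scale-out:  ", after_scale_out)
    print("after new clients:", after_new_clients)
```

In this model the two original pods keep all eight streams after the scale-out, and the new pods only pick up load once fresh connections arrive.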

#6 about 4 minutes

Tools for functional and performance testing of gRPC

Use grpcurl for functional API testing against your proto files, and ghz for performance and load testing of your gRPC services.

#7 about 3 minutes

Case study: Separating unary and streaming calls

A practical example shows how separating unary and streaming gRPC calls into different Kubernetes services and target groups solves uneven load distribution.

#8 about 1 minute

Key takeaways for effective gRPC load balancing

To load balance gRPC effectively, account for long-lived sessions, understand your clients' traffic patterns, and choose L7-based algorithms where possible.
