Making Data Warehouses fast. A developer's story.

A developer went from 1.5-second BigQuery responses to under 500ms with 100 concurrent users. Here's the open-source tool they used to do it.

#1 about 3 minutes

High latency in applications built on data warehouses creates a poor user experience and presents a significant challenge for developers.

#2 about 5 minutes

Data warehouses use OLAP for complex, low-volume queries on large datasets, contrasting with OLTP's high-volume, simple transactions.

#3 about 3 minutes

Network delays and data scan times both add to the latency users perceive, making sub-second responses a critical goal.

#4 about 7 minutes

BigQuery's cache only works for identical queries and its concurrency is capped per project, impacting real-world application performance.
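
To illustrate why an exact-match cache helps little in a real application, here is a minimal sketch of a result cache keyed on the literal query text. The class `ExactMatchCache` and the function `fake_execute` are hypothetical names made up for this example; this is a toy model of the behavior described above, not BigQuery's implementation.

```python
import hashlib


class ExactMatchCache:
    """Toy result cache keyed on the exact query text (illustrative only)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, sql):
        # Any change in the text, even one filter value, yields a new key.
        return hashlib.sha256(sql.encode("utf-8")).hexdigest()

    def run(self, sql, execute):
        key = self._key(sql)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = execute(sql)
        self._store[key] = result
        return result


def fake_execute(sql):
    # Hypothetical stand-in for a round trip to the warehouse.
    return f"rows for: {sql}"


cache = ExactMatchCache()
cache.run("SELECT COUNT(*) FROM orders WHERE status = 'shipped'", fake_execute)
cache.run("SELECT COUNT(*) FROM orders WHERE status = 'shipped'", fake_execute)
cache.run("SELECT COUNT(*) FROM orders WHERE status = 'pending'", fake_execute)
print(cache.hits, cache.misses)  # prints: 1 2
```

Only the repeated query is served from the cache. Users who pick different filters each produce a different query text, so most of their requests still go to the warehouse.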

#5 about 4 minutes

Load testing reveals that BigQuery maintains a consistent query latency of around two seconds regardless of user concurrency up to its hard limit.
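
A load test of this kind can be sketched with a thread pool that fires requests at increasing concurrency and records per-request latency. This is an assumed harness, not the tool used in the talk: `simulated_query` is a hypothetical placeholder that just sleeps, and it would need to be replaced with a real call to the warehouse to measure anything meaningful.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def simulated_query():
    # Hypothetical placeholder for a real warehouse query.
    time.sleep(0.05)


def timed_call(fn):
    # Return the wall-clock duration of a single call, in seconds.
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def load_test(fn, concurrency, requests_per_user=3):
    # Run concurrency * requests_per_user calls with `concurrency` workers.
    total = concurrency * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed_call(fn), range(total)))
    return {
        "concurrency": concurrency,
        "requests": total,
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }


for users in (1, 5, 10):
    print(load_test(simulated_query, users))
```

Comparing the median latency across concurrency levels shows whether response time stays flat, as reported above, or degrades as users are added.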

#6 about 2 minutes

Cube provides a semantic layer over data warehouses, enabling caching, pre-aggregations, and access control to build fast data apps.

#7 about 3 minutes

A local Cube instance can be configured using Docker Compose to connect to BigQuery and automatically generate data schemas.

#8 about 5 minutes

Pre-aggregations act as materialized views that store condensed query results, reducing a query's response time from seconds to milliseconds.
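
The idea behind a pre-aggregation can be shown with a plain rollup table. The sketch below uses an in-memory SQLite database as a stand-in for the warehouse; the table names `orders` and `orders_daily_rollup` and the helper `daily_revenue` are made up for illustration. Cube defines pre-aggregations in its own data model, which is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, day TEXT, amount REAL)")

# Nine sample orders spread evenly over three days, 10.0 each.
rows = [(i, f"2024-01-{(i % 3) + 1:02d}", 10.0) for i in range(9)]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# Build the rollup once: one condensed row per day instead of one per order.
conn.execute(
    """
    CREATE TABLE orders_daily_rollup AS
    SELECT day, COUNT(*) AS order_count, SUM(amount) AS revenue
    FROM orders
    GROUP BY day
    """
)


def daily_revenue(day):
    # Serve the query from the small rollup table, not the raw orders table.
    return conn.execute(
        "SELECT order_count, revenue FROM orders_daily_rollup WHERE day = ?",
        (day,),
    ).fetchone()


print(daily_revenue("2024-01-01"))  # prints: (3, 30.0)
```

The rollup holds three rows where the raw table holds nine. On a real dataset the reduction is far larger, which is where the speedup comes from. The trade-off is that the rollup must be refreshed when the underlying data changes.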

#9 about 3 minutes

Benchmarks show that using Cube's pre-aggregation layer results in a nearly five-fold performance increase over querying BigQuery directly.

#10 about 8 minutes

The discussion covers when to implement a caching layer, how Cube improves performance, and its utility for medium-sized databases.
