Philipp Krenn
Make Your Data FABulous
#1 · about 7 minutes
Understanding the CAP theorem for distributed systems
The CAP theorem states that a distributed data store can only provide two of three guarantees: consistency, availability, and partition tolerance.
#2 · about 3 minutes
Introducing the FAB theory for datastore tradeoffs
The FAB theory proposes another set of tradeoffs for data stores, where you can only pick two of three attributes: fast, accurate, or big.
#3 · about 7 minutes
How terms aggregation trades accuracy for speed
Elasticsearch's terms aggregation may return inaccurate counts by default because each shard only considers its top local results to improve performance.
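The undercounting described above can be reproduced with a small simulation. This is a minimal sketch in plain Python, not Elasticsearch code: the two in-memory "shards" and the helper functions are hypothetical, and the `size` and `shard_size` arguments are only modeled on the terms aggregation parameters of the same names.

```python
from collections import Counter

def shard_top_terms(docs, shard_size):
    """Return only the shard's local top `shard_size` terms with their counts."""
    return Counter(docs).most_common(shard_size)

def merged_terms(shards, size, shard_size):
    """Merge the local top terms of every shard, as a coordinating node would."""
    merged = Counter()
    for docs in shards:
        for term, count in shard_top_terms(docs, shard_size):
            merged[term] += count
    return merged.most_common(size)

def exact_terms(shards, size):
    """Count every term on every shard (accurate, but more work per shard)."""
    total = Counter()
    for docs in shards:
        total.update(docs)
    return total.most_common(size)

# Hypothetical data: each list holds the term values stored on one shard.
shards = [
    ["a"] * 5 + ["b"] * 4 + ["c"] * 3,  # local top 2: a, b (c is cut off)
    ["a"] * 6 + ["c"] * 5 + ["b"] * 3,  # local top 2: a, c (b is cut off)
]

approx = merged_terms(shards, size=2, shard_size=2)
exact = exact_terms(shards, size=2)

print("approximate:", approx)
print("exact:      ", exact)
```

In this toy data set the term "c" is dropped from the first shard's local top list, so the merged count for "c" is lower than the true count. Raising `shard_size` in the sketch narrows the gap at the cost of more work per shard.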
#4 · about 8 minutes
Inconsistent relevance scores in distributed full-text search
Full-text search relevance scores using TF-IDF can be inconsistent because inverse document frequency is calculated per-shard, not globally.
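The effect of per-shard statistics can be illustrated with a toy scorer. This is a sketch under stated assumptions: the `idf` formula below is a simplified stand-in rather than the exact formula Elasticsearch or Lucene uses, and the shard contents are invented for the example.

```python
import math

def idf(num_docs, doc_freq):
    """Simplified inverse document frequency: rarer terms score higher."""
    return math.log(1 + num_docs / doc_freq)

def doc_freq(docs, term):
    """Number of documents that contain the term at least once."""
    return sum(1 for doc in docs if term in doc)

def score(doc, term, corpus):
    """Term frequency in the document times IDF computed over `corpus`."""
    return doc.count(term) * idf(len(corpus), doc_freq(corpus, term))

# Hypothetical shards: "elastic" is common on shard A and rare on shard B.
shard_a = [["elastic", "search"], ["elastic", "stack"], ["elastic", "cloud"]]
shard_b = [["elastic", "search"], ["log", "files"], ["metrics", "data"]]

doc = ["elastic", "search"]  # the same document text exists on both shards
term = "elastic"

score_a = score(doc, term, shard_a)                 # IDF from shard A only
score_b = score(doc, term, shard_b)                 # IDF from shard B only
score_global = score(doc, term, shard_a + shard_b)  # IDF over all documents

print(f"shard A: {score_a:.3f}, shard B: {score_b:.3f}, global: {score_global:.3f}")
```

Identical documents receive different scores depending on which shard holds them, because each shard sees a different document frequency for the term. Only the globally computed score is the same for both copies.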
#5 · about 2 minutes
Using a single shard to ensure data accuracy
Forcing an index to use a single shard guarantees accurate aggregations and relevance scores by eliminating distributed calculations, but sacrifices horizontal scaling.
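A single-shard index is requested through the index settings at creation time. The sketch below only builds and prints the request body; it does not contact a cluster. The index name `my-index` and the helper function are hypothetical, while `number_of_shards` and `number_of_replicas` are the standard index settings.

```python
import json

INDEX_NAME = "my-index"  # hypothetical index name, used for illustration only

def single_shard_settings(replicas=1):
    """Build an index-creation body that keeps all data in one primary shard."""
    return {
        "settings": {
            "index": {
                "number_of_shards": 1,           # no distributed calculations
                "number_of_replicas": replicas,  # replicas are copies, not partitions
            }
        }
    }

body = json.dumps(single_shard_settings(), indent=2)

# The body would be sent as the payload of an index-creation request.
print(f"PUT /{INDEX_NAME}")
print(body)
```

Replicas can still be added for availability and read throughput, since each replica is a full copy of the single shard. What is given up is splitting the data across nodes.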
#6 · about 1 minute
Why you must consciously choose your data tradeoffs
It is crucial to understand and explicitly choose the tradeoffs in your data systems, such as those described by the CAP theorem and the FAB theory, to avoid unexpected behavior.