
Erik Bamberg
Nov 17, 2023
What comes after ChatGPT? Vector Databases - the Simple and powerful future of ML?

#1 about 3 minutes
Understanding the limitations of large language models
Large language models like ChatGPT face challenges with token limits and incorporating private data, which restricts their use on large documents or custom knowledge bases.
#2 about 3 minutes
Why vector databases are attracting major investment
Unlike relational or NoSQL databases, vector databases are designed to store and semantically search unstructured data, filling a critical gap in the data landscape.
#3 about 4 minutes
The challenge of searching unstructured data
Manually tagging unstructured data like images and documents is inconsistent and subjective, making it an inefficient way to enable search.
#4 about 5 minutes
How vector embeddings capture semantic meaning
Machine learning models convert unstructured data into numerical representations called embeddings, where semantically similar items are positioned closely in a high-dimensional space.
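To make "positioned closely" concrete, here is a minimal sketch that compares embeddings with cosine similarity. The three-dimensional vectors are hand-written assumptions for illustration only; a real embedding model would produce them, usually with hundreds or thousands of dimensions.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-written for illustration.
# They are NOT the output of any real model.
EMBEDDINGS = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.88, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


royal = cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["queen"])
fruit = cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["apple"])

# With these toy vectors the related pair scores higher than the unrelated one.
print(f"king vs queen: {royal:.3f}")
print(f"king vs apple: {fruit:.3f}")
```

The similarity score depends only on the direction of the vectors, not their length, which is one reason cosine similarity is a common choice for comparing embeddings.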
#5 about 5 minutes
Visualizing relationships in a vector space
A demonstration with the TensorFlow Embedding Projector shows how words like "king" and "queen" cluster together, making their semantic proximity visible.
#6 about 6 minutes
Performing fast similarity search with vectors
Vector databases measure the distance between embeddings with metrics such as cosine similarity or Euclidean distance, and employ indexing techniques like Approximate Nearest Neighbor (ANN) for high-speed search.
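The difference between exact and approximate search can be sketched as follows. The exact version compares the query with every stored vector; the approximate version uses random-hyperplane hashing, a simple form of locality-sensitive hashing chosen here only because it is short. It is not the index used by any particular product, and the class and function names are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_nearest(query, vectors, k=3):
    """Brute force: rank every stored vector by distance to the query."""
    ranked = sorted(range(len(vectors)), key=lambda i: euclidean(query, vectors[i]))
    return ranked[:k]


class HyperplaneIndex:
    """Toy ANN index: vectors on the same side of every random plane share a bucket."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = {}
        self.vectors = []

    def _key(self, v):
        # One boolean per plane: which side of the plane the vector lies on.
        return tuple(sum(p * x for p, x in zip(plane, v)) >= 0 for plane in self.planes)

    def add(self, v):
        self.vectors.append(v)
        self.buckets.setdefault(self._key(v), []).append(len(self.vectors) - 1)

    def search(self, query, k=3):
        # Only the query's bucket is scanned, so true neighbors can be missed.
        candidates = self.buckets.get(self._key(query), [])
        ranked = sorted(candidates, key=lambda i: euclidean(query, self.vectors[i]))
        return ranked[:k]


def random_unit_vector(dim, rng):
    """Random direction, scaled to length 1 so angle and distance rankings agree."""
    v = [rng.gauss(0, 1) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


rng = random.Random(42)
data = [random_unit_vector(8, rng) for _ in range(1000)]

index = HyperplaneIndex(dim=8)
for vector in data:
    index.add(vector)

query = data[5]
print("exact :", exact_nearest(query, data))
print("approx:", index.search(query))
```

The approximate search inspects only one bucket instead of all 1000 vectors, trading some recall for speed. Production systems make the same trade with more sophisticated index structures.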
#7 about 4 minutes
An overview of the vector database market
A look at popular vector databases like Pinecone, Weaviate, and Milvus, including their features, hosting models, and integrations with platforms like Hugging Face.
#8 about 4 minutes
Building applications like intrusion and face detection
Vector databases can power real-world applications such as intrusion detection systems and face similarity matching without needing constant model retraining.
#9 about 6 minutes
Augmenting ChatGPT with a long-term memory
The Retrieval-Augmented Generation (RAG) pattern uses a vector database to find relevant data chunks, providing LLMs with the right context to answer questions accurately.
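The retrieval step of this pattern can be sketched in a few lines. Everything here is an assumption made for illustration: `embed` is a bag-of-words stand-in for a real embedding model, `TinyVectorStore` is an in-memory list rather than a vector database, the document text is invented, and no LLM is actually called. The sketch only shows how the prompt is assembled from retrieved chunks.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk(text):
    """Split a document into sentence-sized chunks."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (embedding, text) pairs in a list."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question, store, k=2):
    """Place the most relevant chunks in front of the question."""
    context = "\n".join(f"- {c}" for c in store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Invented private document, used only to exercise the sketch.
DOCUMENT = (
    "The office coffee machine is cleaned every Friday afternoon. "
    "Vacation requests must be submitted two weeks in advance through the HR portal. "
    "The server room door code changes on the first Monday of each month."
)

store = TinyVectorStore()
for piece in chunk(DOCUMENT):
    store.add(piece)

question = "When are vacation requests submitted?"
prompt = build_prompt(question, store)
print(prompt)  # This string would be sent to the LLM.
```

Because only the top-ranked chunks are placed in the prompt, the document as a whole can be far larger than the model's token limit.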
#10 about 16 minutes
Exploring more applications for vector search
Vector search enables a wide range of applications including recommendation systems, document deduplication, time-series analysis, and advanced product search.