How We Built a Machine Learning-Based Recommendation System (And Survived to Tell the Tale)

How do you find the perfect substitute for an out-of-stock item? Learn how we adapted a natural language model to solve this critical e-commerce challenge.

#1 about 5 minutes

Defining the business need for product recommendations

A recommendation system for substitute products is needed across multiple touchpoints to prevent lost sales from out-of-stock items.

#2 about 2 minutes

Analyzing the limitations of the existing recommender

The previous system, based on the Jaccard coefficient, produced low-quality recommendations, particularly for new or unpopular items.
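
As a rough illustration of why a Jaccard-based recommender struggles with new or unpopular items, the sketch below scores two products by the overlap of the sessions they appear in. The choice of sessions as the underlying sets, the product names, and the session IDs are assumptions made for the example; the talk summary does not say which sets the original system compared.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient: size of the intersection over size of the union."""
    union = a | b
    if not union:
        return 0.0  # two empty sets: define similarity as 0
    return len(a & b) / len(union)


# Hypothetical data: the set of session IDs in which each product appeared.
sessions_by_product = {
    "milk_1l": {"s1", "s2", "s3", "s4"},
    "oat_milk_1l": {"s2", "s3", "s5"},
    "new_almond_milk": set(),  # just listed, no interaction history yet
}

popular_score = jaccard(
    sessions_by_product["milk_1l"], sessions_by_product["oat_milk_1l"]
)
cold_start_score = jaccard(
    sessions_by_product["milk_1l"], sessions_by_product["new_almond_milk"]
)
print(popular_score, cold_start_score)
```

A product with no session history scores zero against everything, so it can neither receive nor appear in recommendations, however similar it actually is.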

#3 about 5 minutes

Using the Prod2Vec algorithm for recommendations

The Prod2Vec algorithm, adapted from Word2Vec, learns product relationships by analyzing co-occurrence within user session context windows.
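
The context-window idea can be sketched as follows: each session is treated like a sentence, each product like a word, and every product is paired with its neighbours inside the window. This is only the pair-generation step of a skip-gram-style setup, not the training itself, and the function name, window size, and product IDs are illustrative assumptions.

```python
def context_pairs(session: list[str], window: int = 2) -> list[tuple[str, str]]:
    """Return (target, context) pairs for every product in one session."""
    pairs = []
    for i, target in enumerate(session):
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a product is not its own context
                pairs.append((target, session[j]))
    return pairs


# Hypothetical session: products viewed in order by one user.
session = ["pasta", "tomato_sauce", "parmesan"]
for target, context in context_pairs(session, window=1):
    print(target, "->", context)
```

Products that repeatedly share a window end up with nearby embeddings once a skip-gram model is trained on such pairs.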

#4 about 2 minutes

Improving predictions with Meta-Prod2Vec and metadata

Incorporating product metadata like category and brand into the model (Meta-Prod2Vec) significantly improves recommendation quality for long-tail items.
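
One simplified way to picture the metadata extension is to add extra training pairs that tie each product to its own metadata and each metadata value to the neighbouring products. The sketch below does only that; it is an assumption-laden simplification and not necessarily how the speakers implemented it, and the metadata token format (`category:...`, `brand:...`) is invented for the example.

```python
def meta_pairs(
    session: list[str],
    metadata: dict[str, list[str]],
    window: int = 1,
) -> list[tuple[str, str]]:
    """Item-to-item pairs plus item-to-metadata and metadata-to-item pairs."""
    pairs = []
    for i, item in enumerate(session):
        tags = metadata.get(item, [])
        for tag in tags:
            pairs.append((item, tag))  # item predicts its own metadata
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            pairs.append((item, session[j]))  # plain Prod2Vec pair
            for tag in tags:
                pairs.append((tag, session[j]))  # metadata predicts neighbour
    return pairs


# Hypothetical metadata; a brand-new product still shares tokens with old ones.
metadata = {"oat_milk_1l": ["category:plant_milk", "brand:acme"]}
print(meta_pairs(["oat_milk_1l", "granola"], metadata))
```

Because metadata tokens are shared across many products, a long-tail item with few sessions still gets a training signal through its category and brand.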

#5 about 2 minutes

Implementing the end-to-end MLOps pipeline

The production system uses dbt for data transformation, a Vertex AI pipeline for model training, and Elasticsearch for efficient vector similarity search.
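
For the serving side, a minimal sketch of what a vector lookup could look like is shown below. It only builds request bodies in the shape of the Elasticsearch 8.x `dense_vector` mapping and approximate kNN search; the field names (`embedding`, `product_id`), the default parameters, and the choice of cosine similarity are assumptions, and the summary does not state which Elasticsearch version or query style was actually used.

```python
def index_mapping(dims: int, field: str = "embedding") -> dict:
    """Index mapping with one indexed dense_vector field (names are hypothetical)."""
    return {
        "mappings": {
            "properties": {
                "product_id": {"type": "keyword"},
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def knn_query(
    query_vector: list[float],
    k: int = 10,
    num_candidates: int = 100,
    field: str = "embedding",
) -> dict:
    """Search body asking for the k nearest product embeddings."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        "_source": ["product_id"],
    }


body = knn_query([0.1, 0.2, 0.3], k=5)
print(body)
```

The query vector would be the embedding of the out-of-stock product, and the hits would be its candidate substitutes.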

#6 about 3 minutes

Evaluating model performance with offline and online metrics

Offline metrics like NDCG confirmed model quality, while mirror traffic analysis showed a 45% increase in product recommendation coverage.
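
For readers unfamiliar with the two metrics, here is a small sketch of how they are commonly computed. NDCG follows the standard definition with a log2 position discount; the coverage definition (share of catalogue products that receive at least one recommendation) is an assumption, since the summary does not define it, and all data is made up.

```python
import math


def dcg(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg(relevances: list[float], k: int) -> float:
    """DCG normalised by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def coverage(recommendations: dict[str, list[str]], catalog: list[str]) -> float:
    """Fraction of catalogue products with a non-empty recommendation list."""
    if not catalog:
        return 0.0
    covered = sum(1 for product in catalog if recommendations.get(product))
    return covered / len(catalog)


# Hypothetical relevance labels for one ranked list (1 = relevant substitute).
print(ndcg([1, 0, 1], k=3))
print(coverage({"a": ["b"], "b": []}, ["a", "b", "c", "d"]))
```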

#7 about 3 minutes

Visualizing product relationships with the Embedding Projector

Using TensorFlow's Embedding Projector tool reveals how the model groups similar products into distinct clusters in a high-dimensional space.
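
To load embeddings into the Embedding Projector, the vectors and their labels are typically exported as two tab-separated files: one row of numbers per product, and a matching metadata file with one label per row (a single-column metadata file has no header). The sketch below produces both as strings; the function name and the toy embeddings are assumptions.

```python
def to_projector_tsv(embeddings: dict[str, list[float]]) -> tuple[str, str]:
    """Return (vectors_tsv, metadata_tsv) with rows in the same order."""
    vector_lines = []
    label_lines = []
    for product_id, vector in embeddings.items():
        vector_lines.append("\t".join(f"{x:.6f}" for x in vector))
        label_lines.append(product_id)
    return "\n".join(vector_lines) + "\n", "\n".join(label_lines) + "\n"


# Hypothetical 3-dimensional embeddings for two products.
vectors_tsv, metadata_tsv = to_projector_tsv(
    {"oat_milk_1l": [0.12, -0.40, 0.88], "almond_milk_1l": [0.10, -0.35, 0.91]}
)
print(vectors_tsv)
print(metadata_tsv)
```

Writing the two strings to `vectors.tsv` and `metadata.tsv` gives files that can be uploaded through the projector's load dialog.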

#8 about 3 minutes

Adopting pragmatic baselines and automated data analysis

Key project takeaways include using simple business-logic baselines for benchmarking and automating exploratory data analysis within the ML pipeline itself.
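
A business-logic baseline can be as simple as "suggest the best sellers from the same category". The sketch below is one hypothetical version of such a rule, useful as a benchmark that any learned model should beat; the rule itself, the function name, and the data are assumptions and are not taken from the talk.

```python
def category_bestsellers(
    product_id: str,
    category_of: dict[str, str],
    units_sold: dict[str, int],
    k: int = 3,
) -> list[str]:
    """Top-k best-selling products sharing the category of product_id."""
    category = category_of[product_id]
    candidates = [
        p for p, c in category_of.items() if c == category and p != product_id
    ]
    # Sort by sales (descending), then by ID so ties are deterministic.
    return sorted(candidates, key=lambda p: (-units_sold.get(p, 0), p))[:k]


# Hypothetical catalogue and sales figures.
category_of = {"a": "milk", "b": "milk", "c": "milk", "d": "bread"}
units_sold = {"b": 5, "c": 9, "d": 100}
baseline = category_bestsellers("a", category_of, units_sold)
print(baseline)
```

Scoring this baseline with the same offline metrics as the model shows how much of the model's quality comes from learning rather than from category structure alone.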

#9 about 1 minute