How We Built a Machine Learning-Based Recommendation System (And Survived to Tell the Tale)
How do you find the perfect substitute for an out-of-stock item? Learn how we adapted a natural language model to solve this critical e-commerce challenge.
#1 about 5 minutes
Defining the business need for product recommendations
A recommendation system for substitute products is needed across multiple touchpoints to prevent lost sales from out-of-stock items.
#2 about 2 minutes
Analyzing the limitations of the existing recommender
The previous system, based on the Jaccard coefficient, produced low-quality recommendations, particularly for new or unpopular items.
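To see why a Jaccard-based recommender struggles with new or unpopular items, here is a minimal sketch. It assumes each product is represented by the set of sessions it appeared in; the talk summary does not say how the sets were built, and the product names and session IDs are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical data: IDs of the sessions in which each product was seen.
sessions_by_product = {
    "popular_a": {1, 2, 3, 4, 5, 6},
    "popular_b": {1, 2, 3, 4, 7, 8},
    "new_item": {9},  # a new product seen in a single session
}

popular_score = jaccard(sessions_by_product["popular_a"], sessions_by_product["popular_b"])
new_item_score = jaccard(sessions_by_product["new_item"], sessions_by_product["popular_a"])
```

With this data, the two popular products share four of eight distinct sessions, while the new item shares none with anything. A product with little or no interaction history therefore gets scores at or near zero against every candidate, which fits the low-quality recommendations described above.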
#3 about 5 minutes
Using the Prod2Vec algorithm for recommendations
The Prod2Vec algorithm, adapted from Word2Vec, learns product relationships by analyzing co-occurrence within user session context windows.
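The co-occurrence idea can be sketched as generating (target, context) training pairs from a session, the same way skip-gram Word2Vec does for words in a sentence. The function name, the window size, and the product IDs below are illustrative assumptions, not details from the talk.

```python
from typing import Iterator


def context_pairs(session: list[str], window: int) -> Iterator[tuple[str, str]]:
    """Yield (target, context) product pairs for every product in a session.

    A product's context is every other product at most `window` positions
    before or after it in the same session.
    """
    for i, target in enumerate(session):
        start = max(0, i - window)
        end = min(len(session), i + window + 1)
        for j in range(start, end):
            if j != i:
                yield target, session[j]


# Hypothetical session: products a user viewed, in order.
pairs = list(context_pairs(["a", "b", "c"], window=1))
```

Pairs like these would then be fed to a Word2Vec-style trainer, so that products appearing in similar contexts end up with nearby embedding vectors.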
#4 about 2 minutes
Improving predictions with Meta-Prod2Vec and metadata
Incorporating product metadata like category and brand into the model (Meta-Prod2Vec) significantly improves recommendation quality for long-tail items.
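One way to picture the metadata extension is as extra training pairs that tie products to metadata tokens, in addition to the usual product-to-product pairs. The sketch below is a simplification under stated assumptions: the token format (`cat:shoes`), the function name, and the equal weighting of all pair types are hypothetical, and it is not the team's implementation.

```python
from typing import Iterator


def meta_pairs(
    session: list[str],
    metadata: dict[str, list[str]],
    window: int,
) -> Iterator[tuple[str, str]]:
    """Yield (input, output) token pairs mixing products and metadata tokens."""
    for i, item in enumerate(session):
        item_meta = metadata[item]
        # A product's own metadata predicts the product itself.
        for m in item_meta:
            yield m, item
        start = max(0, i - window)
        end = min(len(session), i + window + 1)
        for j in range(start, end):
            if j == i:
                continue
            ctx = session[j]
            ctx_meta = metadata[ctx]
            yield item, ctx            # product -> context product (plain Prod2Vec)
            for cm in ctx_meta:
                yield item, cm         # product -> context metadata
            for m in item_meta:
                yield m, ctx           # metadata -> context product
                for cm in ctx_meta:
                    yield m, cm        # metadata -> context metadata


# Hypothetical catalog: two products that share a category token.
catalog = {"p1": ["cat:shoes"], "p2": ["cat:shoes"]}
pairs = list(meta_pairs(["p1", "p2"], catalog, window=1))
```

Because a rarely seen product shares category and brand tokens with popular products, it still receives a training signal through those tokens, which is one intuition for the long-tail improvement.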
#5 about 2 minutes
Implementing the end-to-end MLOps pipeline
The production system uses dbt for data transformation, a Vertex AI pipeline for model training, and Elasticsearch for efficient vector similarity search.
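Conceptually, the vector similarity search returns the products whose embeddings lie closest to the query product's embedding. The brute-force sketch below shows that computation using cosine similarity. It does not use Elasticsearch, which relies on its own indexing to do this efficiently, and the choice of cosine similarity, the embeddings, and the function names are all assumptions for illustration.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_substitutes(
    query_id: str,
    embeddings: dict[str, list[float]],
    k: int,
) -> list[tuple[str, float]]:
    """Return the k products most similar to `query_id`, best first."""
    query = embeddings[query_id]
    scored = [
        (pid, cosine(query, vec))
        for pid, vec in embeddings.items()
        if pid != query_id  # never recommend the out-of-stock item itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 2-dimensional embeddings; real ones have many more dimensions.
toy_embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
best = top_k_substitutes("a", toy_embeddings, k=1)
```

A linear scan like this costs time proportional to the catalog size per query, which is why a production system delegates the lookup to a search engine with vector support.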
#6 about 3 minutes
Evaluating model performance with offline and online metrics
Offline metrics like NDCG confirmed model quality, while mirror traffic analysis showed a 45% increase in product recommendation coverage.
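For readers unfamiliar with NDCG, the sketch below computes it for a single ranked list. It uses the linear-gain form of DCG; another common variant uses a gain of 2^rel - 1, and the summary does not say which one the team used. The relevance labels are hypothetical.

```python
import math


def dcg(relevances: list[float]) -> float:
    """Discounted cumulative gain: each relevance is divided by log2(position + 1),
    with positions counted from 1."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg_at_k(relevances: list[float], k: int) -> float:
    """DCG of the top-k results divided by the DCG of the ideal ordering.

    `relevances` holds the relevance label of each recommended item, in the
    order the model ranked them. Returns 0.0 when no item is relevant.
    """
    ideal = sorted(relevances, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    if ideal_dcg == 0:
        return 0.0
    return dcg(relevances[:k]) / ideal_dcg


# Hypothetical labels: the only relevant item was ranked second instead of first.
score = ndcg_at_k([0, 1], k=2)
```

A score of 1.0 means the most relevant items were ranked first; placing a relevant item lower in the list reduces the score because of the logarithmic discount.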
#7 about 3 minutes
Visualizing product relationships with embedding projector
Using TensorFlow's Embedding Projector tool reveals how the model groups similar products into distinct clusters in a high-dimensional space.
#8 about 3 minutes
Adopting pragmatic baselines and automated data analysis
Key project takeaways include using simple business-logic baselines for benchmarking and automating exploratory data analysis within the ML pipeline itself.
#9 about 1 minute
Understanding the project team and final timeline
The project was completed in nine months by a cross-functional team of data engineers, data scientists, and software developers.