Deliver personalized, accurate search experiences using machine learning and Vector Search.
Use cases: Generative AI
Industries: Retail
Products: MongoDB Atlas, MongoDB Atlas Search, MongoDB Atlas Vector Search, MongoDB Change Streams, MongoDB Connector for Spark
Partners: Databricks
Solution Overview
This solution integrates MongoDB's flexible data model with Databricks' advanced analytics to create AI-augmented search capabilities for online retail platforms, offering a more intuitive and efficient shopping experience.
By using AI, machine learning, and MongoDB Atlas Vector Search, you can personalize your customers' shopping experience and suggest products that align with their preferences and search history. This enhances customer satisfaction and drives sales.
In this solution, MongoDB Atlas offers a scalable database environment that efficiently manages large and diverse e-commerce datasets. MongoDB Atlas Vector Search processes complex search queries to ensure that customers find what they're looking for, even with partial search terms. Finally, Databricks offers powerful machine learning and real-time analytics capabilities, which enhance the accuracy of Vector Search. You can apply this framework to other industries such as financial services, healthcare, and insurance.
Reference Architectures
The following two diagrams display the solution architecture. The first diagram shows the overall solution architecture, and the second shows the Vector Search portion of the solution in more detail. These architectures work together in this solution.
Figure 1. Architecture of an AI-enhanced search engine with different MongoDB Atlas components, Databricks notebooks, and data workflows.
Figure 2. Architecture of a vector search solution showcasing how the data flows through the different integrated components of MongoDB Atlas and Databricks.
Data Model Approach
When working with this solution, developers should use the Polymorphic Pattern when storing their data. This pattern allows for efficient queries even when documents within a collection share a similar, but not identical, structure.
In this solution, each product document has common fields such as _id, price, or brandName. Documents can also have different fields that are specific to the product category, such as colour1, ageGroup, or season. Because of MongoDB's flexible document data model, you can design your schema to support both uniformity and customization when representing different product types within the same collection.
The following code block provides an example of a document that represents a product item:
{
  "_id": {
    "$oid": "64934d5a4fb07ede3b0dc0d3"
  },
  "colour1": "NA",
  "ageGroup": "Adults-Women",
  "link": "http://assets.myntassets.com/v1/images/style/properties/41b9db06cab6a17fef365787e7b885ba_images.jpg",
  "brandName": "Baggit",
  "fashionType": "Fashion",
  "price": {
    "$numberDouble": "375.0"
  },
  "atp": {
    "$numberInt": "1"
  },
  "title": "Baggit Women Chotu Taj White Belt",
  "gender": "Women",
  "mfg_brand_name": "Baggit",
  "subCategory": "Belts",
  "masterCategory": "Accessories",
  "score": {
    "$numberDouble": "0.0"
  },
  "season": "Summer",
  "articleType": "Belts",
  "baseColour": "White",
  "id": "33464",
  "discountedPrice": {
    "$numberDouble": "324.0"
  },
  "productDisplayName": "Baggit Women Chotu Taj White Belt",
  "count": {
    "$numberInt": "10"
  },
  "pred_price": {
    "$numberDouble": "0.8616750344336797"
  },
  "price_elasticity": {
    "$numberDouble": "0.0"
  },
  "discount": {
    "$numberDouble": "14.0"
  }
}
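The product document is shown in MongoDB Extended JSON, where wrappers such as $oid, $numberDouble, and $numberInt carry BSON type information. If you need plain Python values, for example when inspecting an exported document outside a driver, a minimal standard-library sketch might look like the following. The helper name decode_extended_json is hypothetical, it handles only the wrappers that appear in this document, and the sample is a shortened excerpt of the product document.

```python
import json


def decode_extended_json(value):
    """Recursively replace Extended JSON type wrappers with plain Python values.

    Hypothetical helper: covers only $oid, $numberDouble, $numberInt,
    and $numberLong, which is enough for the product document shown here.
    """
    if isinstance(value, dict):
        if len(value) == 1:
            (key, inner), = value.items()
            if key == "$numberDouble":
                return float(inner)
            if key in ("$numberInt", "$numberLong"):
                return int(inner)
            if key == "$oid":
                # Keep the ObjectId as its 24-character hex string.
                return inner
        return {k: decode_extended_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [decode_extended_json(v) for v in value]
    return value


# Shortened excerpt of the product document.
raw = """
{
  "_id": {"$oid": "64934d5a4fb07ede3b0dc0d3"},
  "brandName": "Baggit",
  "price": {"$numberDouble": "375.0"},
  "atp": {"$numberInt": "1"},
  "count": {"$numberInt": "10"},
  "discountedPrice": {"$numberDouble": "324.0"}
}
"""

product = decode_extended_json(json.loads(raw))
print(product["price"], product["count"])
```

If PyMongo is installed, its bson.json_util module parses Extended JSON more completely and is likely the better choice in application code.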
Build the Solution
Get started with Atlas Vector Search
Visit the Atlas Vector Search Quick Start guide and create your first index in minutes.
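As a rough sketch of what the index and a query could look like, the following Python builds a vector index definition and a $vectorSearch aggregation pipeline as plain dictionaries. The index name vector_index, the field name embedding, the 384 dimensions, and the cosine similarity are all assumptions for illustration; this page does not state which embedding model or field names the solution uses. The helper names are hypothetical, and the projected title and price fields come from the sample product document.

```python
def build_vector_index_definition(path="embedding", dimensions=384):
    """Return an Atlas Vector Search index definition (assumed field and size)."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dimensions,
                "similarity": "cosine",
            }
        ]
    }


def build_vector_search_pipeline(query_vector, index_name="vector_index",
                                 path="embedding", limit=5, num_candidates=100):
    """Return an aggregation pipeline that runs an approximate nearest-neighbor search."""
    if num_candidates < limit:
        # Atlas requires numCandidates to be at least as large as limit.
        raise ValueError("num_candidates must be >= limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "title": 1,
                "price": 1,
                # Named searchScore to avoid clashing with the document's own score field.
                "searchScore": {"$meta": "vectorSearchScore"},
            }
        },
    ]


index_definition = build_vector_index_definition()
# Placeholder query vector; in practice this comes from your embedding model.
pipeline = build_vector_search_pipeline([0.1] * 384, limit=3)
```

With a driver such as PyMongo, you would pass the pipeline to the collection's aggregate method. The query vector must have the same number of dimensions as the indexed field.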
Deploy your application
To deploy your application locally, follow the README instructions in this GitHub repository.
Create Databricks jobs and workflows
To learn how to create Databricks jobs and workflows with JSON, see this Databricks documentation.
Use Databricks notebooks for analysis
See this GitHub folder for notebooks.
Key Learnings
Transform raw data: You can use triggers and functions to push raw data from MongoDB Atlas into Databricks. You can also leverage the MongoDB Connector for Spark to shape your data for different machine learning algorithms.
Process real-time data: You can process real-time data to get actionable insights, such as product scoring, product promotions, and recommendation engines.
Use a flexible schema: You can use MongoDB's flexible document model to apply the polymorphic pattern. This allows you to store documents with shared and unique fields within the same collection.
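To make the split between shared and category-specific fields concrete, here is a small Python sketch that inspects a handful of documents and reports which fields appear in every document and which appear in only some. The function name split_fields is hypothetical, and the second product is invented sample data for illustration; only the first mirrors fields from the product document in this solution.

```python
def split_fields(documents):
    """Return (common, optional) field names across a list of documents.

    common: fields present in every document.
    optional: fields present in at least one document, but not all.
    """
    key_sets = [set(doc) for doc in documents]
    if not key_sets:
        return set(), set()
    common = set.intersection(*key_sets)
    optional = set.union(*key_sets) - common
    return common, optional


products = [
    # Fields taken from the sample product document.
    {"_id": 1, "price": 375.0, "brandName": "Baggit",
     "season": "Summer", "ageGroup": "Adults-Women"},
    # Hypothetical second product with a different category-specific field.
    {"_id": 2, "price": 120.0, "brandName": "ExampleBrand",
     "colour1": "Blue"},
]

common, optional = split_fields(products)
print(sorted(common))
print(sorted(optional))
```

Queries and indexes can target the common fields across the whole collection, while category-specific fields are read only for the documents that have them.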
Authors
Francesco Baldissera, MongoDB
Ashwin Gangadhar, MongoDB
Vittal Pai, MongoDB