Gen AI-Powered Inventory Classification

Use Generative AI and MongoDB Atlas Vector Search to incorporate unstructured data into inventory classification for better decision-making.

Use cases: Artificial Intelligence, Catalog

Industries: Manufacturing & Mobility, Retail, Healthcare

Products and tools: MongoDB Atlas Database, MongoDB Atlas Vector Search, MongoDB Node.js Driver

Partners: AWS, Amazon Bedrock, Anthropic, Cohere, Vercel

Solution Overview

Global automotive operations face compounding disruptions. Volatile geopolitics and the return of tariffs have delayed model-year transitions and created severe inventory shortages. As of June 2025, next-model-year vehicles made up only 3% of US inventory. To navigate this constrained supply and protect margins, you need tools that go beyond simple financial metrics.

Traditionally, organizations rely on ABC analysis to segment inventory. This method prioritizes items solely based on dollar usage, where "Category A" drives the most revenue and "Category C" the least. While simple, this approach ignores critical variables such as lead time, durability, or obsolescence.

Figure 1. ABC analysis for inventory classification.
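
As a point of reference, the ABC method described above can be sketched in a few lines of Python. This is a minimal illustration, not the demo application's code: the function name `abc_classify` and the 80% and 95% cumulative cutoffs are assumptions chosen for the example, since the article does not state specific thresholds.

```python
from typing import Dict


def abc_classify(
    dollar_usage: Dict[str, float],
    a_cutoff: float = 0.80,  # assumed cutoff, not taken from the article
    b_cutoff: float = 0.95,  # assumed cutoff, not taken from the article
) -> Dict[str, str]:
    """Assign A/B/C classes by cumulative share of total dollar usage."""
    total = sum(dollar_usage.values())
    if total <= 0:
        raise ValueError("total dollar usage must be positive")

    classes: Dict[str, str] = {}
    cumulative = 0.0
    # Rank items from highest to lowest dollar usage.
    ranked = sorted(dollar_usage.items(), key=lambda kv: kv[1], reverse=True)
    for item, usage in ranked:
        # Classify on the share accumulated before this item,
        # so the top-ranked item always lands in Category A.
        share_before = cumulative / total
        if share_before < a_cutoff:
            classes[item] = "A"
        elif share_before < b_cutoff:
            classes[item] = "B"
        else:
            classes[item] = "C"
        cumulative += usage
    return classes


# Illustrative figures only.
usage = {"brake_pad": 6000.0, "oil_filter": 2500.0, "wiper": 1200.0, "bulb": 300.0}
print(abc_classify(usage))
# {'brake_pad': 'A', 'oil_filter': 'A', 'wiper': 'B', 'bulb': 'C'}
```

Note that the only input is dollar usage, which is exactly the limitation the rest of this solution addresses.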

Multi-Criteria Inventory Classification (MCIC) improves on this approach by adding quantitative data points, but it still has a blind spot: unstructured data. Unstructured data, such as customer reviews, maintenance logs, and social sentiment, accounts for an estimated 80% of global data, yet traditional models cannot process it. This solution bridges that gap. By combining Generative AI and MongoDB Atlas Vector Search, you transform qualitative feedback into actionable scoring features.

Figure 2. Transforming unstructured data into features for machine learning models.

This solution enables you to move from reactive inventory tracking to predictive, customer-centric decision-making. MongoDB empowers the next generation of AI-driven inventory classification through a four-step methodology:

1. Create and store vector embeddings from your unstructured data.
2. Design and store evaluation criteria relevant to your business goals.
3. Create an agentic application to perform data transformation based on those criteria.
4. Rerun the inventory classification model with the new features added.

Figure 3. Methodology and requirements for Gen AI-powered inventory classification.

Beyond automating the manual evaluation of qualitative criteria, this solution operationalizes your entire data strategy. By unifying vector embeddings, metadata, and operational data in a single platform, you eliminate the latency of disjointed batch analytics pipelines. You can process new SKUs the moment they arrive, enabling you to manage massive product catalogs with real-time precision and scale.

Reference Architectures

This architecture operationalizes the four-step methodology through a dynamic, agentic workflow in which MongoDB Atlas serves as the data backbone.

Create and store vector embeddings

Ingest unstructured data, such as product reviews, supplier notes, or support transcripts, into MongoDB Atlas. You use an embedding model (such as those from Voyage AI) to vectorize this text. You then store the resulting embeddings directly alongside the original source text in your MongoDB documents. This unified approach reduces infrastructure complexity and enables you to run low-latency semantic searches through a single API.

Figure 4. Product reviews can be stored as vector embeddings in MongoDB Atlas.

Design and store evaluation criteria

Define classification rules based on your specific business objectives, including cost reduction, minimizing stockouts, or enhancing customer experience. Previously, mapping these goals to data demanded extensive manual effort and deep expert knowledge. Now, an AI agent automates and scales this process.

The agent analyzes your available data and context to propose the optimal parameters and data source combinations to meet your objectives. You store these dynamic definitions in MongoDB as flexible JSON documents. This enables you to apply consistent, informed decision-making across massive inventories and adapt instantly to changing business requirements.

Figure 5. Unstructured and structured data are used by the AI agent to create criteria for feature generation.

Transform data with an agentic application

In this step, a second AI agent calculates the actual scores for your inventory. The agent iterates through your product catalog and uses MongoDB Atlas Vector Search to retrieve specific customer reviews relevant to the criteria defined in Step 2.

The agent analyzes this retrieval set, calculates a numerical feature score, and updates the original product document with this new data. This capability enriches your dataset with qualitative insights that are now mathematically comparable to your quantitative metrics.

Figure 6. An AI agent enriches product features with vectorized review data to generate new features.
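
The control flow of this step can be sketched as follows. This is a hedged outline rather than the demo's implementation: the retrieval and scoring functions are injected as plain callables (in a real system they would wrap Atlas Vector Search and an LLM call), and the `features.<criteria>` field name is a hypothetical choice for where the score is written.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Update = Tuple[Dict[str, Any], Dict[str, Any]]


def build_feature_updates(
    products: Iterable[Dict[str, Any]],
    criteria_name: str,
    retrieve_reviews: Callable[[str], List[Any]],
    score_reviews: Callable[[Dict[str, Any], List[Any]], float],
) -> List[Update]:
    """Return (filter, update) pairs that write one feature score per product."""
    # Hypothetical field layout: scores live under a "features" subdocument.
    field = "features." + criteria_name.lower().replace(" ", "_")
    updates: List[Update] = []
    for product in products:
        reviews = retrieve_reviews(product["_id"])
        if not reviews:
            continue  # no evidence retrieved, so leave the product unscored
        score = score_reviews(product, reviews)
        updates.append(({"_id": product["_id"]}, {"$set": {field: score}}))
    return updates


# Stand-ins for the retrieval and LLM scoring steps (illustration only).
def fake_retrieve(product_id: str) -> List[str]:
    return ["still look new after 20k miles"] if product_id == "part_9921_brake_pad" else []


def fake_score(product: Dict[str, Any], reviews: List[str]) -> float:
    return 0.9


catalog = [{"_id": "part_9921_brake_pad"}, {"_id": "part_0001_unreviewed"}]
for filter_doc, update_doc in build_feature_updates(catalog, "Durability", fake_retrieve, fake_score):
    print(filter_doc, update_doc)
```

With PyMongo, each pair could be passed to `collection.update_one(filter_doc, update_doc)`. Skipping products with no retrieved reviews is a design choice made here so that missing evidence is not confused with a low score.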

Rerun the inventory classification model

Incorporate these new features into your MCIC model. Domain experts can assign weights to these new AI-generated signals to balance them against traditional financial metrics. Rerun the classification algorithm to segment your inventory into informed categories that reflect both economic value and real-world customer sentiment.

Figure 7. Domain experts can rerun classification after balancing weights.
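
One common way to combine weighted criteria is a normalized weighted sum followed by a rank-based split, sketched below. The article does not specify which MCIC algorithm the demo uses, so treat this as an assumption: the function name `mcic_classify`, the min-max normalization, the 20%/30% class shares, and the sample numbers are all illustrative, and every criterion is assumed to be "higher means more important".

```python
import math
from typing import Dict


def mcic_classify(
    items: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    a_share: float = 0.20,  # assumed share of items placed in class A
    b_share: float = 0.30,  # assumed share of items placed in class B
) -> Dict[str, str]:
    """Classify items by a weighted sum of min-max normalized criteria."""
    total_weight = sum(weights.values())
    if not items or total_weight <= 0:
        raise ValueError("need at least one item and a positive total weight")

    names = list(items)
    normalized: Dict[str, Dict[str, float]] = {}
    for criterion in weights:
        values = [items[n][criterion] for n in names]
        low, span = min(values), max(values) - min(values)
        # Scale each criterion to [0, 1] so different units are comparable.
        normalized[criterion] = {
            n: (items[n][criterion] - low) / span if span else 0.0 for n in names
        }

    composite = {
        n: sum(w * normalized[c][n] for c, w in weights.items()) / total_weight
        for n in names
    }
    ranked = sorted(names, key=lambda n: composite[n], reverse=True)
    a_end = math.ceil(len(ranked) * a_share)
    b_end = math.ceil(len(ranked) * (a_share + b_share))
    return {
        n: "A" if i < a_end else "B" if i < b_end else "C"
        for i, n in enumerate(ranked)
    }


# Made-up data: durability stands in for an AI-generated feature score.
inventory = {
    "brake_pad": {"annualDollarUsage": 9000, "durability": 0.9},
    "oil_filter": {"annualDollarUsage": 5000, "durability": 0.2},
    "wiper": {"annualDollarUsage": 1000, "durability": 1.0},
    "bulb": {"annualDollarUsage": 3000, "durability": 0.1},
}
print(mcic_classify(inventory, {"annualDollarUsage": 1.0}))
print(mcic_classify(inventory, {"annualDollarUsage": 0.5, "durability": 0.5}))
```

In this made-up data set, the wiper moves from Category C to Category B once the durability score carries half the weight, which is the kind of shift the reclassification step is meant to surface.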

Data Model Approach

The MongoDB document model unifies diverse data types without rigid schema constraints. This capability simplifies how you represent complex data at scale. The following examples illustrate the data structures required for this agentic workflow.

Quantitative metrics

Typically, inventory transactions, such as orders, contain the raw data required to calculate standard MCIC metrics, such as Annual Dollar Usage, Average Unit Cost, Total Annual Usage, and Lead Time.

In relational systems, order data is often fragmented across multiple tables for headers, line items, and logistics. To optimize for read performance and simplify your application logic, you can store this data in a single orders collection.

Using the Extended Reference Pattern, you embed product information within the items list of the order. This approach enables you to retrieve the full context of a transaction in a single database operation.

{
  "_id": "order_55021",
  "status": "delivered",
  "purchaseTimestamp": { "$date": "2024-05-08T16:05:31.000Z" },
  "items": [
    {
      "price": 85.00,
      "productId": "part_9921_brake_pad",
      "productName": "Ceramic Brake Pads - Front Pair"
    }
  ],
  "reviews": [
    {
      "reviewId": "rev_7721",
      "score": 5,
      "commentTitle": "Great fit",
      "commentMessage": "Arrived on time and fit perfectly on my 2020 Sedan."
    }
  ]
}
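
Given order documents shaped like the one above, the standard MCIC metrics could be derived with an aggregation pipeline along these lines. This is a sketch under stated assumptions: each entry in `items` is treated as one unit sold (the sample document has no quantity field), `price` is treated as the unit cost, only `delivered` orders are counted, and the output field names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def annual_dollar_usage_pipeline(year: int) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that summarizes one year of orders per product."""
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return [
        # Keep delivered orders placed within the requested year (assumption).
        {"$match": {
            "status": "delivered",
            "purchaseTimestamp": {"$gte": start, "$lt": end},
        }},
        # Emit one document per embedded line item.
        {"$unwind": "$items"},
        # Summarize per product; each line item counts as one unit (assumption).
        {"$group": {
            "_id": "$items.productId",
            "annualDollarUsage": {"$sum": "$items.price"},
            "totalAnnualUsage": {"$sum": 1},
            "averageUnitCost": {"$avg": "$items.price"},
        }},
        {"$sort": {"annualDollarUsage": -1}},
    ]


pipeline = annual_dollar_usage_pipeline(2024)
print([list(stage)[0] for stage in pipeline])
# With PyMongo, this could run as: db.orders.aggregate(pipeline)
```

Because the product details are embedded in each order, the pipeline needs no `$lookup` stage to produce these metrics.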

Metrics derived from unstructured sources

Valuable inventory signals often exist in unstructured text such as maintenance logs, support tickets, or customer feedback. In this example, you can use reviews to perform semantic analysis by generating vector embeddings for the title and message content.

Store vector embeddings (emb) alongside the original text fields to enable hybrid searches using MongoDB Atlas Vector Search. Additionally, metadata such as the review score enables you to combine semantic queries (for example, finding reviews about "reliability") with structured filters (for example, "score": 5).

{
  "_id": "rev_99812",
  "productId": "part_9921_brake_pad",
  "score": 5,
  "title": "Excellent durability",
  "message": "I've put 20k miles on these pads and they still look new. Much better than OEM.",
  "emb": [0.02, -0.15, 0.44, 0.12, ... ]
}
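
A query that combines a semantic search with a structured filter might look like the following sketch, which builds an Atlas Vector Search index definition and a `$vectorSearch` pipeline for review documents like the one above. The index name, the 1024-dimension setting, the cosine similarity choice, and the helper name `review_search_pipeline` are assumptions for illustration; the embedding dimension in particular depends on the model you use.

```python
from typing import Any, Dict, List, Optional

# Hypothetical index definition for the reviews collection.
# numDimensions must match your embedding model; 1024 is only an example.
REVIEW_VECTOR_INDEX: Dict[str, Any] = {
    "fields": [
        {"type": "vector", "path": "emb", "numDimensions": 1024, "similarity": "cosine"},
        {"type": "filter", "path": "productId"},
        {"type": "filter", "path": "score"},
    ]
}


def review_search_pipeline(
    query_vector: List[float],
    product_id: str,
    min_score: Optional[int] = None,
    limit: int = 10,
    index_name: str = "reviews_vector_index",  # hypothetical index name
) -> List[Dict[str, Any]]:
    """Build a pipeline that finds reviews of one product closest to a query vector."""
    clauses: List[Dict[str, Any]] = [{"productId": {"$eq": product_id}}]
    if min_score is not None:
        clauses.append({"score": {"$gte": min_score}})
    search_filter = clauses[0] if len(clauses) == 1 else {"$and": clauses}
    return [
        {"$vectorSearch": {
            "index": index_name,
            "path": "emb",
            "queryVector": query_vector,
            # Oversample candidates for recall, capped at the documented maximum.
            "numCandidates": min(limit * 10, 10000),
            "limit": limit,
            "filter": search_filter,
        }},
        {"$project": {
            "title": 1,
            "message": 1,
            "score": 1,
            "similarity": {"$meta": "vectorSearchScore"},
        }},
    ]


# The query vector would come from embedding a phrase such as "reliability".
pipeline = review_search_pipeline([0.02, -0.15, 0.44], "part_9921_brake_pad", min_score=5)
print(pipeline[0]["$vectorSearch"]["filter"])
```

Fields used in the `filter` expression need to be declared as `filter` fields in the index definition, which is why `productId` and `score` appear there alongside the vector field.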

Criteria Definition

A crucial step in this solution is defining flexible, data-driven classification criteria. Instead of relying on hard-coded rules, save criteria as "knowledge objects" in a criteria collection. An AI agent generates these definitions based on your business objectives (for example, "Durability") and the available data.

This document structure includes weights, explicit scoring scales, and data sources, which gives the agent a schema for evaluating products consistently across your inventory.

{
  "criteriaName": "Durability",
  "criteriaDefinition": "Measures how customers perceive the product’s durability relative to their expectations.",
  "elements": [
    {
      "name": "Expected Durability",
      "weight": 0.30,
      "description": "The level of durability customers believe the product should have based on price and category."
    },
    {
      "name": "Perceived Durability",
      "weight": 0.40,
      "description": "How customers describe the actual durability, build quality, and sturdiness after usage."
    }
  ],
  "scoringScale": [
    {
      "description": "Highly durable item that meets or exceeds expectations, strongly positive sentiment",
      "score": 1
    },
    {
      "description": "Low durability item that fails to meet expectations, negative sentiment",
      "score": 0.01
    }
  ],
  "dataSources": ["inventory", "reviews"]
}
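
To show how such a criteria document could drive a calculation, the sketch below combines per-element scores into a single criterion score. How the agent actually produces and combines element scores is not described in the article, so this is an assumption: element scores are taken as given, clamped to the range defined by `scoringScale`, and averaged using the element weights. The weights are renormalized because the two elements shown above sum to 0.70 rather than 1.

```python
from typing import Any, Dict

# Trimmed copy of the criteria document above (descriptions omitted).
durability_criteria: Dict[str, Any] = {
    "criteriaName": "Durability",
    "elements": [
        {"name": "Expected Durability", "weight": 0.30},
        {"name": "Perceived Durability", "weight": 0.40},
    ],
    "scoringScale": [{"score": 1}, {"score": 0.01}],
}


def criterion_score(criteria: Dict[str, Any], element_scores: Dict[str, float]) -> float:
    """Weighted average of element scores, clamped to the criteria's scoring scale."""
    scale = [level["score"] for level in criteria["scoringScale"]]
    low, high = min(scale), max(scale)
    total_weight = sum(element["weight"] for element in criteria["elements"])
    if total_weight <= 0:
        raise ValueError("element weights must sum to a positive value")

    weighted = 0.0
    for element in criteria["elements"]:
        raw = element_scores[element["name"]]
        clamped = max(low, min(high, raw))  # keep scores inside the declared scale
        weighted += element["weight"] * clamped
    # Renormalize so the result stays on the scoring scale (assumption).
    return weighted / total_weight


# Hypothetical element scores, as an agent might return them.
scores = {"Expected Durability": 0.8, "Perceived Durability": 0.9}
print(round(criterion_score(durability_criteria, scores), 3))
# 0.857
```

Keeping the weights and scale in the stored document, rather than in code, means a change to the criteria takes effect without redeploying the scoring logic.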

Build the Solution

To demonstrate this methodology in action, the team built a simple application that executes the concepts presented in the preceding steps. This demo operationalizes the agentic workflow, enabling you to experience the transition from traditional MCIC to AI-augmented classification.

You can access the full source code and documentation within the GitHub Repository.

Figure 8. Demo application high-level architecture.

Follow these steps to set up the application and explore the agentic workflow:

Initialize the database: Create a MongoDB Atlas cluster and seed the database with the provided inventory and review data.
Set up the environment: Clone the repository, configure your .env.local file with MongoDB Atlas and Amazon Bedrock credentials, and start the app using npm run dev.
Run traditional analysis: Select a standard quantitative criterion, such as Annual Dollar Usage, in the left panel and click Run Analysis to establish a baseline classification.
Define new criteria: Click Add new criteria and describe a business objective such as “Identify products with high customer loyalty”. The agent proposes a structured definition and data sources.
Generate scores: Click Generate. The agent iterates through your inventory, analyzes unstructured data, and assigns scores to each product.
Refine classification: Incorporate the new criteria into your selection, adjust the weights, and click Run Analysis again to see how qualitative insights shift your inventory categories.

See the demo application in action in the following figure.

Figure 9. Inventory classification using generative AI.

Key Learnings

This solution demonstrates how to modernize inventory classification by combining generative AI with the flexibility of the document model. As you implement this architecture, consider these core benefits:

Unlock hidden inventory value: Traditional financial metrics miss critical insights contained in text. By vectorizing unstructured data such as customer reviews and maintenance logs, you transform qualitative feedback into quantitative features that improve classification accuracy.
Automate criteria generation: Manual rule-setting is slow and rigid. An agentic workflow enables you to dynamically generate and score evaluation criteria based on high-level business objectives. This scales expert decision-making across massive product catalogs.
Simplify data architecture: Disjointed systems create latency. By storing operational data, metadata, and vector embeddings in a single MongoDB document model, you eliminate complex Extract, Transform, and Load (ETL) pipelines and enable real-time analysis of new inventory items.
Enhance decision quality: Financial metrics alone lead to gaps in insight. Integrating customer sentiment and product reliability scores creates a holistic view of inventory value, enabling you to prioritize high-impact items that traditional ABC analysis ignores.

Authors

Humza Akhtar, MongoDB
Rami Pinto Prieto, MongoDB
Daniel Jamir, MongoDB
