According to Deloitte, 79% of enterprises are building AI agents, yet only 11% have gotten them into production. Those statistics match what I hear when talking to enterprise CIOs.
The problem is that while an AI agent prototype can be created quickly and demos well, the path to making it production-grade has real potholes. And they may not be the potholes you would expect.
A common failure mode is to blame the models, but that pursuit leads to a dead end.
Talk to deep practitioners and they point to the data instead. Garbage in, garbage out. In the enterprise, the data that matters lives inside the firewall: something the model has never seen or been trained on.
Today at MongoDB.local London, we shipped capabilities into the MongoDB data platform that are built directly on that thesis, and we’re just getting started.
Why the 89% are stuck
Agents don’t fail because of the LLM. They fail because they can’t retrieve the right context, can’t remember a conversation, and can’t do either reliably at scale.
Think of a new hire in customer support. You interviewed them, so you know they’re smart. They join, and on day one, they flounder. Why? Because they don’t have the context of your business: they haven’t read your knowledge bases, haven’t handled previous customer conversations, haven’t seen the internal discussions. All their learning has been, understandably, outside the firewall. Their brain, their LLM, has been trained on publicly available data.
If you want to put them to work effectively, you have to give them the context, the memory, the knowledge.
But to be effective, they need more than just your company’s memory. They also need to be sharp enough to quickly and accurately sift through the mountains of company data and pull the right information into context for the problem they are trying to solve.
These steps require being able to synthesize and richly structure all that information into memories (which a document model is perfectly suited for) and retrieve the right context for the right question at the right time (where search, vector search, embedders, and rerankers shine).
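To make that concrete, here is a minimal sketch, using the MongoDB Node.js driver, of what one synthesized memory can look like as a document. The database, collection, and field names are illustrative assumptions, not a prescribed schema.

```typescript
// A synthesized "memory": one support interaction distilled into a single
// document. Every name and field here is illustrative, not a fixed schema.
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_URI!);
const memories = client.db("support").collection("memories");

await memories.insertOne({
  customerId: "cust-8841",
  channel: "chat",
  summary:
    "Repeated login failures after a password reset; resolved by clearing stale sessions.",
  entities: ["password reset", "stale sessions"],
  outcome: "resolved",
  createdAt: new Date(),
  // An embedding of the summary (e.g., from a Voyage AI model) is what
  // later lets vector search pull this memory back into context.
  summaryEmbedding: [/* model-generated vector */],
});
```

Because the document model holds the structured fields and the embedding side by side, the same record can serve both exact lookups and semantic recall.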
That’s the difference between a prototype that was hand-fed just enough context to demo a password reset, and a production-grade support agent that can field any question, drawing on every available knowledge base and past customer service discussion.
This is unintuitive for a few reasons. Model leaderboards are loud and public, but the embedding and reranker leaderboards that govern memory and context live deep in technical communities most product leaders never see. Models get the headlines. Suboptimal retrieval shows up in the bug reports: agents making wrong decisions on inaccurate or stale data, or worse, systems hitting performance walls that stop them scaling beyond a proof of concept.
When teams hit those failures in production, the instinct is to upgrade the model. We’ve watched a lot of teams take that path. It rarely works, because the failure usually wasn’t the model. It was the data underneath.
Memory: An agent that forgets isn’t an agent
Take memory first. Zomato made its customers a promise: no matter how complex your issue, the agent already knows your history. It doesn't matter if you start in the app, switch to chat, or escalate to a live agent. The context follows you.
The challenge was that Zomato's customer support operation was scattered across 10 disparate platforms, with no unified view of the customer. The agent started from zero. Every time.
MongoDB gave Zomato a single persistent memory layer across its entire customer experience platform. The result: 15 million conversations a month, with support costs cut from $20 million to $9 million. The agent always knows what happened before it speaks.
Zomato’s team did that work in a world where most enterprise agent stacks couldn’t depend on memory at all.
Today’s launches change that for everyone. The LangGraph.js Long-Term Memory Store, generally available today, brings persistent, cross-conversation memory to the millions of developers building agents in JavaScript and TypeScript with LangGraph: memory across sessions, across users, and across time, with Atlas as the unified backend for checkpointing and semantic recall, powered by Voyage AI embeddings.
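As a rough sketch of what that wiring looks like, here is a LangGraph.js agent whose conversation state is checkpointed to MongoDB via the existing MongoDBSaver from @langchain/langgraph-checkpoint-mongodb. The one-node graph, connection string, and thread id are placeholders, and the new long-term memory store has its own API beyond what is shown here.

```typescript
// Sketch: a LangGraph.js graph whose conversation state persists in MongoDB,
// so the same thread can be resumed across sessions and processes.
import { MongoClient } from "mongodb";
import { StateGraph, MessagesAnnotation, START } from "@langchain/langgraph";
import { MongoDBSaver } from "@langchain/langgraph-checkpoint-mongodb";
import { AIMessage } from "@langchain/core/messages";

const client = new MongoClient(process.env.MONGODB_URI!);

// A trivial one-node graph; a real agent would call an LLM and tools here.
const workflow = new StateGraph(MessagesAnnotation)
  .addNode("respond", async (state) => ({
    messages: [new AIMessage(`I have ${state.messages.length} message(s) of history.`)],
  }))
  .addEdge(START, "respond");

// Checkpoints are written to MongoDB instead of process memory.
const graph = workflow.compile({ checkpointer: new MongoDBSaver({ client }) });

// Reusing the same thread_id in a later session resumes the stored history.
await graph.invoke(
  { messages: [{ role: "user", content: "Where did we leave off?" }] },
  { configurable: { thread_id: "customer-42" } }
);
```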
This sits alongside our first-class integrations with the agent frameworks our customers build on: CrewAI, Mastra, Spring AI, Semantic Kernel, and LangGraph in Python and now JavaScript. Whichever framework your teams choose, MongoDB is the memory and state layer underneath it.
Retrieval: The infrastructure that was slowing you down
Memory is half of the data-layer story. Retrieval is the other half, and it has historically been the most expensive half to get right. Best-in-class retrieval has meant tedious pipeline engineering before a single user benefits. Pick a model. Wire an external API. Generate embeddings. Sync them as your data changes. Monitor the pipeline. Maintain it. That can consume a full quarter of engineering work running in parallel to your actual product. For most enterprises, it’s the gating step on every AI initiative, not the model.
Delivery Hero is the world's leading local delivery platform, operating in over 70 countries with a mission to get groceries and essentials to customers' doors in under 30 minutes. When a delivery driver picking an order finds an item out of stock, the app surfaces substitute recommendations for the customer to approve, with a hard requirement: the entire round-trip from query to customer cannot exceed one second. With around 10% of inventory being fast-moving perishable produce, across a catalog of more than 100 million items, this scenario plays out constantly.
Their original tool pre-computed recommendations in batch runs every 24 hours. By the time a rider flagged an item as unavailable, the embeddings behind those recommendations were already stale, meaning substitutes could themselves be out of stock, discontinued, or simply wrong for that customer. Delivery Hero rebuilt on MongoDB Vector Search, keeping embeddings updated continuously alongside live inventory and customer preference data, all within the same platform. Today, when a rider finds an item unavailable, the system surfaces up to 20 relevant substitutes in under a second, because the data behind those recommendations never lags reality.
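For a sense of what such a query can look like, here is a minimal sketch of a substitute lookup using the Atlas Vector Search $vectorSearch aggregation stage. The index, collection, and field names are hypothetical, as is the assumption that item embeddings and an inStock flag live on the same documents.

```typescript
// Sketch: find up to 20 semantically similar, currently purchasable
// substitutes for an out-of-stock item. All names here are illustrative.
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_URI!);
const items = client.db("catalog").collection("items");

// Placeholder: in practice, the embedding of the unavailable item.
const outOfStockItemEmbedding: number[] = [];

const substitutes = await items
  .aggregate([
    {
      $vectorSearch: {
        index: "item_embeddings",   // vector search index name
        path: "embedding",          // field holding each item's embedding
        queryVector: outOfStockItemEmbedding,
        numCandidates: 200,         // candidates considered before ranking
        limit: 20,                  // up to 20 substitutes, as in the example
      },
    },
    // Keep only items that are actually purchasable right now.
    { $match: { inStock: true } },
    { $project: { name: 1, price: 1, score: { $meta: "vectorSearchScore" } } },
  ])
  .toArray();
```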
Until now, generating embeddings meant leaving your database and building a pipeline to an external service. Automated Voyage Embeddings in MongoDB Vector Search, in public preview today, changes that. Embeddings are now a native data transformation inside MongoDB—pick a Voyage AI model, create an index, and as your data changes, embeddings stay in sync automatically. Something that previously took weeks to wire up now takes minutes. There is no pipeline to monitor, no separate service to maintain. The infrastructure that gated semantic search stops being infrastructure that our customers have to build.
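As a sketch of how little is involved, here is what creating such an index might look like from the Node.js driver. createSearchIndex is the driver's existing API; the definition shape, with a source text field and a Voyage AI model name, is my assumption based on the announcement, so check the current documentation for the exact syntax.

```typescript
// Sketch: a vector index whose embeddings MongoDB generates and keeps in
// sync as documents change. The definition shape below is an assumption
// based on the announcement, not confirmed syntax.
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_URI!);
const articles = client.db("kb").collection("articles");

await articles.createSearchIndex({
  name: "auto_embedded_body",
  type: "vectorSearch",
  definition: {
    fields: [
      {
        type: "text",             // the source text field to auto-embed (assumed)
        path: "body",
        model: "voyage-3-large",  // an illustrative Voyage AI model choice
      },
    ],
  },
});
// After this, inserts and updates to "body" are re-embedded automatically;
// there is no external pipeline to run or monitor.
```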
That’s the Voyage AI thesis we’ve held onto since the team joined MongoDB a year ago. Voyage AI’s embedding and reranker models are consistently first on the Retrieval Embedding Benchmark (RTEB). Bringing that quality into Atlas, with a managed pipeline, is what closes the gap between best-in-class retrieval on a benchmark and best-in-class retrieval in a customer’s production environment.
Making a new skillset visible
The builders developing production AI on MongoDB are acquiring a new skillset, and we want to make it more visible and accessible. Three new MongoDB AI Skill Badges launched today: Voyage AI on MongoDB, Vector Search Performance, and Memory for AI Applications. Each maps directly to a capability we shipped this morning. Each is verifiable and shareable. MongoDB University has credentialed hundreds of thousands of builders globally. These badges are how the next set will get there.
What infrastructure for autonomous AI looks like
The 89% of enterprises stuck in pilot won’t move because someone ships a smarter model. They’ll move when they solve their data problem.
The infrastructure for autonomous AI doesn’t look like a smarter LLM. It is a data platform. It has memory that persists across sessions, channels, and frameworks, the way Zomato’s customer context now travels with the customer. It does retrieval that stays current as the data underneath changes and gets more accurate as the models improve, with nothing for your team to rewire, the way Delivery Hero’s substitute recommendations never fall behind live inventory. It provides a single source of truth for every agent your business runs.
That's the data platform we're building. What we shipped today is the evidence, and we're just getting started.
For the architectural foundation underneath today's announcements, read my colleague Ben Cefalo's companion post, AI Is Changing What Customers Need From a Database. MongoDB 8.3 Is Built for It.