Search is ubiquitous in application experiences. Whether we are shopping for groceries or buying a new home, browsing the web to find answers to our burning questions, servicing our customers, looking for our next job, or seeking suggestions for our next vacation, the search bar helps us navigate and discover the most relevant information — all in a way that seemingly interprets our natural language. People now expect these same intuitive search experiences in every application they use, whether at home or at work.
However, building these experiences is hard. In many cases, developers have to ‘bolt-on’ a search engine like Elasticsearch to their database and create a replication mechanism to keep the two systems synchronized. This approach adds significant complexity to the application stack, reducing developer velocity while driving up risk and cost.
Elasticsearch is a distributed search and analytics engine built on top of Apache Lucene and developed by Elastic. It exposes Lucene’s indexing and search functionality through RESTful APIs, and it distributes data across multiple servers by dividing each index into shards. Elasticsearch stores data as JSON documents and is suitable for search use cases against time series data, structured or unstructured text, numerical data, or geospatial data.
Deployment options include self-hosting Elasticsearch, where the user is self-managing their instance, and Elastic’s cloud-hosted variant included in Elastic Cloud (which includes the rest of the ELK stack).
MongoDB Atlas Search makes it easy to build fast, relevant, full-text search on top of your data in the cloud. By embedding an Apache Lucene search engine directly alongside your database, data is automatically synchronized between the two systems, developers work with a single driver and API, there is no separate system to run and pay for, and everything is fully-managed for you. It combines the power of Apache Lucene with the developer productivity, scale, and resilience of the MongoDB Atlas database.
With just a couple of API calls or clicks in the Atlas UI, you instantly expose your data to sophisticated, relevance-based search experiences that boost engagement and improve customer satisfaction. Your data is immediately more discoverable, usable, and valuable, and because everything is fully managed for you in the cloud, the operational burden is removed. Customers have reported 30% to 50% improvements in time to market for new application functionality by adopting Atlas Search.
If a database’s internal search features are not adequate to satisfy the desired user experience, then another option is to bolt on a dedicated search engine, such as Elasticsearch, alongside the database.
This provides the search features demanded by customers, but it imposes additional constraints on developers and ops teams and drives up data duplication and technology sprawl.
A bolt-on specialized search engine alongside your database mandates synchronizing data between the two systems. While users get the rich search experience they expect, this comes at a significant cost. The application stack gets more complex and unwieldy. All of this translates to reduced developer velocity, compromised customer experience, and escalating costs.
How search works with a bolt-on solution:
To surface relevant and up-to-date search results, the database and search engine need to be kept synchronized, duplicating data between systems.
This means engineering teams need to create a synchronization mechanism that replicates data from the database to the search engine. Typically they will create a data pipeline with custom filtering and transformation logic built on top of messaging systems such as Apache Kafka, or using packaged connectors from specialized providers. Whether building or buying, the process takes time and adds ongoing costs. The synchronization mechanism also has to be deployed onto its own nodes, creating additional hardware sprawl.
Once the synchronization mechanism has been deployed, it needs to be monitored and managed, adding more engineering overhead.
It is important that replication to the search engine keeps pace with database writes so that search results do not excessively lag the database and break application SLAs. Monitoring the replication process is necessary to identify and remediate synchronization issues. This becomes especially complex if the search index falls so far behind the database that it has to be resynced from scratch, causing potential application downtime. It is not uncommon to find that 10% of engineering cycles are lost to manually recovering synchronization failures.
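The monitoring burden described above can be illustrated with a minimal, in-memory sketch. It assumes a hypothetical change log on the database side and a hypothetical search index replica that applies changes in batches; the class names, the `max_lag` threshold, and the batch size are all invented for illustration and do not correspond to any specific connector or product.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Hypothetical ordered log of database writes."""
    entries: list = field(default_factory=list)

    def append(self, doc_id, doc):
        self.entries.append((doc_id, doc))
        return len(self.entries)  # sequence number of this write


@dataclass
class SearchIndexReplica:
    """Hypothetical stand-in for the search engine side of the pipeline."""
    docs: dict = field(default_factory=dict)
    applied_seq: int = 0  # number of change-log entries applied so far

    def apply(self, log, batch_size):
        """Apply up to batch_size pending changes from the log."""
        pending = log.entries[self.applied_seq:self.applied_seq + batch_size]
        for doc_id, doc in pending:
            self.docs[doc_id] = doc
        self.applied_seq += len(pending)


def replication_lag(log, replica):
    """Number of writes the search index is behind the database."""
    return len(log.entries) - replica.applied_seq


def sync_status(log, replica, max_lag):
    """Return 'lagging' once the lag exceeds max_lag, otherwise 'ok'."""
    return "lagging" if replication_lag(log, replica) > max_lag else "ok"


log = ChangeLog()
replica = SearchIndexReplica()
for i in range(10):
    log.append(i, {"title": f"item {i}"})

replica.apply(log, batch_size=4)
print(replication_lag(log, replica), sync_status(log, replica, max_lag=5))
replica.apply(log, batch_size=4)
print(replication_lag(log, replica), sync_status(log, replica, max_lag=5))
```

In a real deployment the lag would more likely be measured in time or in log offsets exposed by the messaging system, but the principle is the same: someone has to define the threshold, watch it, and decide what happens when it is breached.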
New application features that necessitate changes to the database’s schema often need both the synchronization logic and the search engine schema to also be updated at the same time. This creates more dependencies that slow down the pace of rolling new features to production.
Atlas Search is built on top of MongoDB, the most popular and widely used modern database in the market. MongoDB has become so popular because engineering teams can build and ship applications faster than other data platforms. You can get started with both MongoDB Atlas and Atlas Search in minutes on a fully managed service that handles operations for you — on any cloud you choose.
By embedding an Apache Lucene search index directly alongside the database, data is automatically synchronized between the two, developers work with a single API, there is no separate system to run and pay for, and everything is fully managed for you, relieving operational burden. The MongoDB developer data platform radically simplifies your data architecture, enabling you to gain a competitive advantage by innovating faster while reducing cost, risk, and complexity.
With a distributed architecture, your database and search engine are resilient and globally scalable. Replication with self-healing recovery keeps your applications highly available while giving you the ability to isolate operational and search workloads on separate nodes within a single cluster. Native sharding provides elastic and application-transparent horizontal scale-out to accommodate your workload’s growth, along with geographic distribution for data residency controls. These controls ensure that data is kept close to users for low latency and to comply with data sovereignty mandated by modern privacy regulations.
Atlas Search is part of MongoDB Atlas, the multi-cloud developer data platform that combines transactional processing, relevance-based search, real-time analytics, mobile edge computing with cloud sync, and a cloud data lake in an elegant and integrated data architecture. Through a flexible document data model and unified query interface, Atlas provides a first-class developer experience to power almost any class of application. At the same time, it meets the most demanding requirements for resilience, scale, and data privacy.
With search engines storing and querying data, some engineering teams may consider eliminating the database altogether and just using the search engine for data persistence. At first glance, this would address many of the constraints discussed above, presenting a single system to develop against and to operationalize, while eliminating the overhead of data synchronization.
But as noted earlier, databases and search engines are different technologies designed to do different things.
Beyond serving application queries, databases are designed around a core set of data persistence and processing capabilities. These demand data integrity, consistency, and durability; balanced performance across reads and writes; concurrency; availability; security; disaster recovery; and more.
With a specialized architecture and indexing focused on fast, relevance-based information retrieval, dedicated search engines have a different set of design goals that compromise many of the capabilities that make databases so essential.
As discussed above, Elasticsearch is a capable search engine technology. However, its core system architecture is built around Lucene indexes in a way that forces compromises in many core database capabilities in order to meet its primary design goal as a scalable search engine.
It is critical in today’s digital economy for developers to build and evolve applications at speed. Introducing a separate search engine, like Elasticsearch, alongside the database means developers now have two separate systems they need to work with, which slows them down.
With this approach, developers have to learn how to work with two entirely different query languages to access the database and the search engine. This increases their learning curve and means frequent context switching when building application functionality, both of which impact their productivity while complicating testing and ongoing maintenance.
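To make the difference concrete, the sketch below builds the same full-text query twice: once as an Elasticsearch `match` request body, and once as a MongoDB aggregation pipeline using the Atlas Search `$search` stage. The field names (`title`, `category`, `price`), the search term, and the index name `default` are assumptions chosen for illustration. The functions only construct the request structures; sending them to a cluster is left out.

```python
def elasticsearch_match_query(field, text):
    """Request body for an Elasticsearch `match` query."""
    return {"query": {"match": {field: {"query": text}}}}


def atlas_search_pipeline(field, text, extra_filter=None, index="default"):
    """Aggregation pipeline that starts with an Atlas Search `$search` stage.

    `$search` must be the first stage; an ordinary `$match` stage using the
    regular MongoDB filter syntax can follow it in the same pipeline.
    """
    pipeline = [
        {"$search": {"index": index, "text": {"query": text, "path": field}}}
    ]
    if extra_filter:
        pipeline.append({"$match": extra_filter})
    pipeline.append({"$limit": 10})
    return pipeline


# Hypothetical database filter, written in the MongoDB query syntax.
database_filter = {"category": "grocery", "price": {"$lt": 20}}

print(elasticsearch_match_query("title", "organic coffee"))
print(atlas_search_pipeline("title", "organic coffee", database_filter))
```

With the bolt-on approach, the database filter and the search request live in two different languages and travel through two different drivers. In the second function, both are stages of one pipeline expressed in the same query API.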
Because this approach requires two different APIs/drivers, application dependencies become much more complex, reducing the pace and frequency of releasing applications to production.
Doubling up with a database and separate search engine such as Elasticsearch also adds time, cost, and complexity to operations and site reliability engineering (SRE) teams.
Now they have an additional system in their technology stack that needs constant care and feeding: It has to be provisioned, secured, monitored, scaled, patched, and backed up with its own tooling and APIs. It also means working across multiple vendors, making issue resolution more complex. Every new project means another dataset living in its own silo, adding to data sprawl and governance overhead.
...for the Developer
The document data model is intuitive and flexible. Documents map directly to the objects in your code so they are much easier and more natural to work with. You can store, index, and search data of any structure and modify your schema at any time as you add new features to your applications.
You work with data as code. The MongoDB Query API and drivers are idiomatic to your programming language. Ad hoc queries, indexing, full-text search, and real-time aggregations provide powerful ways for accessing, grouping, transforming, searching, and analyzing your data to support any class of workload.
...for IT Operations
By embedding an Apache Lucene search index directly alongside the database, data is automatically synchronized between the two. This means engineers and administrators work with a single API, there is no separate system to run and pay for, and everything is fully managed for you, relieving operational burden. The MongoDB developer data platform radically simplifies your data architecture, enabling you to gain a competitive advantage by innovating faster while reducing cost, risk, and complexity.
The search bar is the primary interface for users to navigate the product catalog or content metadata.
Customers using Atlas Search for catalog and content search include Keller Williams – one of the world's largest real estate agents; a global auto-retailer that replaced Elasticsearch for its parts catalog; and CNFT.IO, the first and largest NFT marketplace trading on the Cardano blockchain.
Line-of-business applications supporting internal users or customer self-service portals where search is a supporting function used to enhance information discovery.
Atlas Search customers with these use cases include Current – one of the United States’ fastest growing challenger banks, and a multinational convenience store chain for inventory management and customer self-scan checkout systems.
Users interact with the single view via search as a supporting function. The single view application itself relies on specific Atlas Search capabilities such as fuzzy matching and autocomplete to query disparate data ingested from multiple sources into the single, 360-degree view.
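As a rough illustration of the two capabilities named above, the following sketch builds `$search` stages that use the `text` operator with its `fuzzy` option and the `autocomplete` operator. The field name `customer_name`, the index name `default`, and the sample inputs are hypothetical. The `autocomplete` operator also assumes the field has been indexed with the autocomplete field type in the search index definition.

```python
def fuzzy_text_stage(query, path, max_edits=1, index="default"):
    """`$search` stage that tolerates typos via the `fuzzy` option."""
    if max_edits not in (1, 2):
        raise ValueError("maxEdits must be 1 or 2")
    return {
        "$search": {
            "index": index,
            "text": {
                "query": query,
                "path": path,
                "fuzzy": {"maxEdits": max_edits},
            },
        }
    }


def autocomplete_stage(prefix, path, index="default"):
    """`$search` stage for search-as-you-type suggestions."""
    return {
        "$search": {
            "index": index,
            "autocomplete": {"query": prefix, "path": path},
        }
    }


# Hypothetical single view lookup: a misspelled name and a partial name.
print(fuzzy_text_stage("Jonh Smith", "customer_name"))
print(autocomplete_stage("Joh", "customer_name"))
```

Either stage would be passed as the first element of an aggregation pipeline against the collection that holds the single view documents.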
Atlas Search customers powering single view include a global top 10 insurer and one of Europe’s largest energy providers.
The above examples demonstrate how Atlas Search is designed for application search use cases. By design, it is tightly integrated with the MongoDB Atlas platform. Therefore all data has to first be stored in MongoDB database collections in order to then create the required search indexes against it.
Atlas Search is not currently designed for log analytics typically used in DevOps observability or security and threat hunting applications. Atlas Search is also not suitable for enterprise-wide search systems. In these scenarios, Elasticsearch provides built-in connectors and agents to crawl and extract data from multiple internal source systems, index them, and then make data and analytics searchable with bespoke tools.
For these use cases, it can be better to use MongoDB as one of your data sources alongside your existing Elasticsearch search engine.
MongoDB Atlas offers a forever-free tier for development. Once deployed, simply click a button to add search to your application. With unlimited time to explore, see for yourself how a fully managed search engine integration helps your team build applications faster.
Atlas Search is available with all Atlas clusters — including free clusters — so you can evaluate it at no cost.
Our Getting Started tutorial steps you through the process. Atlas Search documentation provides a complete reference on how to configure, manage, and query search indexes, along with performance recommendations. The MongoDB Developer Hub and MongoDB YouTube channel provide a wealth of articles and tutorials for beginners through to expert users.
If you currently employ Elasticsearch as a bolt-on to your database, we have developed a 5-step methodology to help you migrate away from the headache of managing two independent schemas and data sets.
The guide steps you through how to:
The guide wraps up with examples of customers that have made the switch and provides guidance on how to get started with Atlas Search, along with key services that can help you in your journey.
Full-text search is a technology used to efficiently query, filter, and display matching data from vast corpuses of stored information. Despite its name, almost any type of data can be searched – it’s not just text!
A well implemented full-text search solution delivers fast and relevant search experiences to your application’s users. It boosts their engagement and improves satisfaction by making data more discoverable, usable, and valuable.
The key to full-text search is the inverted index, a specialized structure for indexing and storing data that’s optimized for efficient search queries. Think of an inverted index as a glossary that lists all the unique values that appear in a set of documents. Each value has a list of the documents in which it appears and the value’s position within each document.
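The idea can be sketched in a few lines. This is a toy illustration of an inverted index that records, for each term, the documents it appears in and its positions within them. The tokenizer and sample documents are assumptions made for the example, and a production engine such as Lucene uses far more compact and sophisticated structures.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions of the term in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, term in enumerate(tokenize(text)):
            index[term][doc_id].append(position)
    return index


def lookup(index, term):
    """Return the postings for a term, or an empty dict if it is not indexed."""
    return dict(index.get(term.lower(), {}))


docs = {
    1: "Atlas Search embeds Lucene",
    2: "Lucene powers search engines",
}
index = build_inverted_index(docs)
print(lookup(index, "search"))  # {1: [1], 2: [2]}
print(lookup(index, "Lucene"))  # {1: [3], 2: [0]}
```

A query for a term is a single dictionary lookup rather than a scan of every document, which is why this structure suits open-ended search.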
Databases are ideal when you know exactly what information you are looking for, e.g., show me all customers that have created an Atlas Search index in the past month. When developers know these queries up-front, the appropriate indexes can be defined on the data.
Full-text search is ideal when your users' queries are more open-ended, and they are open to suggestions, e.g., show me the most valuable articles on Atlas Search. In these instances, inverted indexes are much more efficient at returning the most relevant results.
Full-text search engines also offer features that databases either lack or do not do well, e.g., fuzzy search (typo tolerance), autocomplete (typeahead and suggested search terms), faceted navigation (displaying categories of search results), synonyms (to define similar search terms), custom scoring (to tune the relevance of search results), and analyzers (to control how data is indexed).
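Two of these features, fuzzy search and autocomplete, can be illustrated with a small sketch. It uses a plain Levenshtein edit distance and a prefix scan over a hypothetical term list. This is only meant to show what the features do, not how a Lucene-based engine implements them internally.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query, terms, max_edits=1):
    """Terms within max_edits of the query (typo tolerance)."""
    q = query.lower()
    return [t for t in terms if edit_distance(q, t.lower()) <= max_edits]


def autocomplete(prefix, terms):
    """Terms that start with the typed prefix, in alphabetical order."""
    p = prefix.lower()
    return sorted(t for t in terms if t.lower().startswith(p))


terms = ["search", "server", "merge"]  # hypothetical indexed terms
print(fuzzy_match("serch", terms))     # ['search']
print(autocomplete("se", terms))       # ['search', 'server']
```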
The two have different use cases. When it comes to application search scenarios, Elasticsearch is generally used as a 'bolt-on' search engine, while MongoDB is a developer data platform that allows three systems to be compressed into a single solution.
By integrating a database, search engine and sync mechanism into a single unified and fully managed platform, Atlas Search is the fastest and easiest way to build relevance-based search directly into your applications.
Atlas Search is priced in the same way as other data storage and compute resources are calculated for Atlas. There is no separately priced SKU and it consumes the same Atlas Credits.
Details of how Atlas is billed can be found in the Atlas billing documentation.
Additionally, MongoDB Atlas Search is available via the AWS Marketplace, Google Cloud Platform Marketplace, and Microsoft Azure Marketplace. In each case, these help streamline procurement to align with your existing Cloud Service Provider's services.
MongoDB offers legacy text search that supports basic queries on string content. However, text indexes only work for text-based content, and the $text operator can be customized only in limited ways.
In comparison, MongoDB Atlas Search offers:
● Better results: Atlas Search provides more ways to fine-tune the relevance of search results and returns query results faster because it’s based on Apache Lucene, the open source search library that also powers Elasticsearch and Solr.
● Rich feature set: Atlas Search offers support for over 35 languages, multiple data types, fuzzy search, autocomplete, synonyms, custom scoring, index intersection, highlighting, and multiple search indexes per collection.
● Defined roadmap: Atlas Search is constantly being improved with new features that help us increase the range of use cases you can target with Atlas Search.