If you can’t find it, it might as well not exist: that’s the takeaway for any organization providing content or products to its users. Letting users search for and discover data records, or “documents,” within a large collection has become a practical requirement.
Thus, in this highly competitive world, applications must have search functionality so users can find what they are looking for as easily as possible, or they will assume you don’t have it and visit your competitors.
Implementing a great search feature deserves attention to detail. How lenient is the system to typos or voice misinterpretations? Can it provide context-sensitive guided/faceted navigation? Are matches highlighted for clear visibility? And ultimately, are the results relevant to the user’s query?
Beyond end-user keyword search, we also want to analyze our query and system logs so that we can make the right business decisions. Analytics also encompasses security auditing and anomaly detection. Making such analysis easy to perform is therefore another important aspect of building software today.
What is Elasticsearch?
The Elasticsearch search engine provides powerful, scalable search and analytics capabilities for all types of data, including structured and unstructured text, numerics, vectors, and geospatial shapes and coordinates. It is used for full-text search, structured search, analytics, and many forms of data exploration. The code itself lives within an open-source project, and the company behind it provides a fully supported commercial offering for on-premises deployment or SaaS hosting.
Built upon the well-known Apache Lucene search engine library, Elasticsearch can return search results quickly over large volumes of data. Lucene indexes content primarily into an inverted index, a structure that makes matching queries to content fast. An inverted index is built by tokenizing text into terms (usually single words) and storing them in a lexicographically ordered data structure that maps each term to the documents containing it, as in the diagram below.
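To make the idea concrete, here is a minimal Python sketch of an inverted index. It is a toy illustration only, not how Lucene stores or queries its index: the sample documents, the regex-based tokenizer, and the function names `tokenize`, `build_inverted_index`, and `search` are all assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms.

    A deliberately crude stand-in for a real text-analysis chain.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the sorted list of IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    # Insert terms in lexicographic order so iterating the dict mirrors
    # a sorted term dictionary.
    return {term: sorted(postings[term]) for term in sorted(postings)}


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set(index.get(terms[0], []))
    for term in terms[1:]:
        matches &= set(index.get(term, []))
    return sorted(matches)


# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "The quick brown fox",
    2: "The lazy brown dog",
    3: "Quick thinking wins",
}

index = build_inverted_index(docs)
for term, doc_ids in index.items():
    print(term, doc_ids)

print(search(index, "quick brown"))
```

Looking up a query term is a single dictionary access rather than a scan of every document, which is why this structure makes matching fast. A multi-term query here simply intersects the posting lists of its terms.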