What is Generative AI

Throughout 2023, our world has seen a transformative shift with the rise of generative AI (often referred to simply as GenAI). As revolutionary applications for AI generated content have emerged, so too has the sentiment that generative AI will radically impact every industry and sector of society.

Organizations are racing to capture the potential of generative AI. If that includes your organization, then your first order of business is to understand the intricacies of generative AI. In this post, we’ll cover several key questions to help you get your bearings.

After tackling these questions, we’ll look at how MongoDB can help organizations looking to build applications powered by generative AI.

Let’s start with a primer on the basics.

What is generative AI, and how does it work?

Generative AI refers to the branch of artificial intelligence that focuses on the creation of new, unique content — such as text, visual art, music, software code, and more. Unlike predictive (or analytical) AI — which uses machine learning to analyze historical data, identify patterns or trends, and then make predictions — generative AI goes a step beyond simply analyzing and predicting; generative AI creates.

For example, consider a predictive AI tool that is trained on data consisting of millions of paintings and their artists. Given a painting it has never seen before, the predictive AI tool may be able to determine the artist. However, a generative AI system can produce a new painting in the specific style of that artist.

An AI-generated image, the result of feeding the prompt "a painting of a computer in the style of Mondrian" to DALL-E

Generative AI is often designed to mimic human intelligence and creativity, meaning that the content generated is contextually relevant and coherent. The entirely new content resonates with human patterns of thought and expression. It may be visual elements and AI art that are nearly indistinguishable from human-created content. The outputs of a generative AI tool may be text or speech generation. Regardless, the outputs are familiar but original, innovative while authentic.

By creating contextually relevant content through reasoning, generative AI capabilities can be applied to business tasks such as strategic planning and forecasting, problem solving, and what-if analysis.

Types of generative AI models

AI models are sets of AI algorithms that use machine learning to identify patterns from data, allowing them to make predictions or generate novel data that mimics the structure and style of the original data. The AI space is populated by many different types of models, with the most well-known today in generative AI being the foundation model.
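To make "identify patterns, then generate novel data" concrete, here is a deliberately tiny sketch: a bigram model that records which word follows which, then samples new sequences from those observations. The corpus and the function names are illustrative assumptions. Real generative models are vastly larger and work differently in detail, but the learn-then-sample idea is the same.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Record, for each word, the words observed to follow it."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return dict(followers)

def generate(model, start, max_words=8, seed=0):
    """Build a new sequence by repeatedly sampling an observed follower."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    output = [start]
    # Stop at the length limit or when the last word has no known follower.
    while len(output) < max_words and output[-1] in model:
        output.append(rng.choice(model[output[-1]]))
    return " ".join(output)

# Toy training data (hypothetical).
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

Every adjacent word pair in the output was seen in the training text, yet the sentence as a whole may never have appeared there.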

Foundation models are pre-trained on massive amounts of data. The model serves as a “foundation” that can be tuned for specialized tasks. This makes a foundation model incredibly versatile, able to turn its hand to many different tasks. One example of a foundation model is the large language model (LLM). OpenAI's GPT (which stands for "generative pre-trained transformer") is a large language model designed to work with human language. Large language models focus on natural language processing and can perform conversational tasks like question-answering, chatbots, transcription, translation, and more.

Other types of foundation models may focus on non-textual content. These include visual foundation models, such as Flamingo (which interprets images alongside text) or OpenAI’s DALL-E (which generates images), and audio foundation models such as UniAudio or LLark.

What is retrieval-augmented generation (RAG)?

An LLM is limited to the information available up to its last training update, so it doesn’t know about events or developments that have occurred since then. So, how might we leverage large language models in a way that takes into account new data?

One option is to re-train or fine-tune generative models with new data. However, this can be time- and resource-intensive. A better option is retrieval-augmented generation (RAG). RAG allows an LLM to dynamically pull in external, real-time information during the content generation process. With RAG, a generative AI system queries a database of information in real time, thereby producing more accurate, informed, and contextually relevant outputs — even if the required knowledge was not part of the data originally used for training.

However, for RAG to retrieve relevant, semantically similar information from a large corpus of data efficiently, it relies on vector embeddings — numerical representations of data in a high-dimensional space. The optimal way to store and query these embeddings is to use a vector database.
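As a rough illustration of what "semantically similar" means in practice, the sketch below ranks documents by cosine similarity between a query embedding and stored document embeddings. The three-dimensional vectors and document texts are made up for readability. Real embeddings come from an embedding model and typically have hundreds or thousands of dimensions, and a vector database would use an index rather than the brute-force scan shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, documents, k=2):
    """Return the k (score, text) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, doc["embedding"]), doc["text"])
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical documents with hand-written toy embeddings.
documents = [
    {"text": "Refund policy for online orders", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Office holiday schedule", "embedding": [0.0, 0.2, 0.9]},
    {"text": "How to return a damaged item", "embedding": [0.8, 0.3, 0.1]},
]

# Pretend this is the embedding of "Can I get my money back?"
query_vector = [0.85, 0.2, 0.05]

for score, text in top_k(query_vector, documents):
    print(f"{score:.3f}  {text}")
```

The two refund-related documents score far higher than the unrelated one, even though none of them shares exact wording with the query.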

Photo by Alina Grubnyak on Unsplash

RAG broadens an LLM’s ability to stay up-to-date and versatile in generating high-quality content. A quote from this post sums up RAG well:

RAG fills in the gaps in knowledge that the LLM wasn’t trained on, essentially turning the question-answering task into an “open-book quiz,” which is easier and less complex than an open and unbounded question-answering task.

The importance of generative AI in the AI space

The role of generative AI in novel content creation brings transformative potential to all things AI. Generative AI models can have applications across all industries, from entertainment to healthcare. Innovations in AI research and AI technology are continually pushing the envelope of capabilities and applications for generative AI models. Soon, generative AI capabilities will become an essential part of the modern AI toolkit.

Image generation with generative AI is made possible by diffusion models such as Stable Diffusion, typically coupled with a language model that interprets the text prompt. As a result, AI art has become a huge market, with artists using generative AI to create realistic images that are nearly indistinguishable from natural images.

Meanwhile, marketers use generative AI to create 280-character tweets about sales events, and designers use generative AI to create new product designs. Even pharmaceutical companies are using generative AI to assist in drug discovery.

An image of books on bookshelves.

Photo by CHUTTERSNAP on Unsplash

The role of data in generative AI

The effectiveness and versatility of any AI system, generative AI systems included, depend on the quality, quantity, and diversity of the data used to train its models. Let’s look at some key aspects of the relationship between data and the generative AI model.

Training data

Generative AI models are trained on massive datasets. A model designed for text might be trained on billions of articles, while another model designed for images might be trained on millions of pictures. Large language models require vast amounts of machine learning training data if they are to generate coherent and contextually relevant content. As the data becomes more diverse and comprehensive, the model’s ability to understand and generate a wide range of content improves.

Generally speaking, more data translates to better model outputs. With a larger dataset, generative AI models can identify more subtle patterns, resulting in more accurate and nuanced outputs. However, the quality of the data is also extremely important. Oftentimes, a smaller, high-quality dataset can outperform a larger, less relevant one.

Raw and complex data

Raw data, especially if it is complex and unstructured, may require preprocessing in the early stages of the data pipeline, before it can be usable for training. This is also the time when data is validated, to ensure it is properly representative and free from bias. This validation step is crucial for avoiding skewed or biased outputs.
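A minimal sketch of what such preprocessing and validation might look like for text data, assuming each raw record is a dict with hypothetical `text` and `source` fields: normalize whitespace and case, drop empty and duplicate entries, then check how the remaining data is distributed across sources as a crude representativeness check. Real pipelines involve far more than this.

```python
from collections import Counter

def preprocess(records):
    """Normalize text and drop empty or duplicate records."""
    seen = set()
    cleaned = []
    for record in records:
        # Collapse runs of whitespace and lowercase the text.
        text = " ".join(record["text"].split()).lower()
        if not text or text in seen:
            continue
        seen.add(text)
        cleaned.append({"text": text, "source": record["source"]})
    return cleaned

def source_shares(records):
    """Fraction of records per source, to spot over-represented sources."""
    counts = Counter(record["source"] for record in records)
    total = sum(counts.values())
    return {source: count / total for source, count in counts.items()}

# Hypothetical raw records.
raw_records = [
    {"text": "  Great   product ", "source": "reviews"},
    {"text": "great product", "source": "reviews"},
    {"text": "", "source": "forum"},
    {"text": "Shipping was slow", "source": "forum"},
]

cleaned = preprocess(raw_records)
print(cleaned)
print(source_shares(cleaned))
```

A heavily skewed share for one source would be a signal to rebalance or collect more data before training.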

Labeled data versus unlabeled data

Labeled data provides specific information about each data point (for example, a textual description accompanying an image), whereas unlabeled data doesn’t include annotations like this. Generative models often work well with unlabeled data, as they are still able to learn how to generate content by understanding inherent structures and patterns.

Proprietary data

Some data is unique to a particular organization. Examples include customer order history, employee performance metrics, and business processes. Many enterprises collect this data, anonymize it to prevent sensitive PII or PHI from leaking downstream, and then perform traditional data analysis. This data holds a wealth of information that could be mined even more deeply if used to train a generative model. The resulting outputs would be tailored to the specific needs and characteristics of that business.

The role of data in RAG

As mentioned above, RAG combines the power of an LLM with real-time data retrieval. With RAG, you no longer rely solely on pre-trained data. Instead, you can execute a just-in-time pull of relevant information from external databases. This ensures that the generated content is current and accurate.

How to augment generative AI models with proprietary data

When working with generative models, prompt engineering is a technique that involves crafting specific input queries or instructions to guide the model, better tailoring the outputs or responses. With RAG, we can augment prompts with proprietary data, equipping the AI model to generate relevant and accurate responses with that enterprise data taken into account. This approach is also preferable to the time-consuming and resource-intensive approach of re-training or fine-tuning an LLM with this data.
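Augmenting a prompt with retrieved data can be as simple as string assembly. The sketch below assumes the relevant passages have already been retrieved (for example, by a vector search) and shows only the prompt-building step. The function name, the prompt wording, and the sample passages are illustrative assumptions, not a prescribed template.

```python
def build_augmented_prompt(question, retrieved_passages, max_passages=3):
    """Combine retrieved passages and the user's question into one prompt."""
    # Keep only the top passages so the prompt stays within a size budget.
    context = "\n".join(f"- {p}" for p in retrieved_passages[:max_passages])
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical passages retrieved from an organization's own data.
passages = [
    "Returns are accepted within 30 days of delivery.",
    "Refunds are issued to the original payment method.",
]

prompt = build_augmented_prompt("What is the return window?", passages)
print(prompt)
```

The resulting string is what gets sent to the LLM, so the model answers from the supplied enterprise data rather than from its training data alone.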

Challenges and considerations

Of course, working with generative AI is not without its challenges. If your organization is looking to harness GenAI’s potential, you should bear in mind the following key issues.

Need for data expertise and massive compute power

Generative models demand substantial resources. First, you need the expertise of trained data scientists and engineers. With the exception of data organizations, most enterprises don’t have teams with the specialized skill set needed to train or fine-tune LLMs.

When it comes to computing resources, training a model on comprehensive data may require weeks or months—and this is even if you’re using powerful GPUs or TPUs. And although fine-tuning an LLM may not require as much computing power as training one from scratch, it still requires significant resources.

The resource-intensive training and fine-tuning of an LLM is what makes RAG an attractive alternative technique for incorporating current (and proprietary) data with the existing data available to a pre-trained LLM.

Ethical considerations

The rise of generative AI has also spawned intense discussion over the ethical considerations that come with its development and use. As generative AI applications become more mainstream and accessible to the public, conversations have centered around how to:

  • Ensure equitable and bias-free models
  • Protect against attacks like model poisoning or model tampering
  • Prevent the spread of disinformation
  • Guard against the misuse of generative AI (think deepfakes or generating misleading information)
  • Preserve attribution
  • Promote transparency with end users, so that they know when they’re interacting with a generative AI chatbot rather than a human

Comparison with other AI tools and systems

The hype and novelty of generative AI tools have eclipsed the broader AI landscape of tools and systems. Many mistakenly assume that generative AI is the AI tool to solve all their problems. However, while generative AI excels in creating new content, other AI tools might be better suited for certain business tasks. The benefits of generative AI should—just as with any tool in your stack—be weighed against the benefits of other tools.

RAG-specific challenges

The RAG approach to leveraging a large language model is powerful, but it comes with its own set of challenges as well.

  • Choosing vector database and search technologies: Ultimately, the efficiency of the RAG approach hinges on its ability to retrieve relevant data quickly. This makes the selection of a vector database and search technology a critical decision that will affect RAG performance.
  • Data consistency: Because RAG pulls data in real time, ensuring that the vector database is up-to-date and consistent is essential.
  • Integration complexity: Integrating RAG with an LLM adds a layer of complexity to your systems. Effectively implementing generative AI with RAG may require specialized expertise.
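One common way to approach the data consistency challenge is to store a hash of each document's text alongside its embedding, then compare hashes to find entries that need re-embedding or removal. The sketch below uses plain dicts in place of a real source system and vector store. The field name `content_hash` and both function names are assumptions for illustration.

```python
import hashlib

def content_hash(text):
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def find_stale(source_docs, vector_index):
    """Ids that are new or whose text changed since they were embedded."""
    stale = []
    for doc_id, text in source_docs.items():
        entry = vector_index.get(doc_id)
        if entry is None or entry["content_hash"] != content_hash(text):
            stale.append(doc_id)
    return stale

def find_orphaned(source_docs, vector_index):
    """Ids still in the vector index whose source document was deleted."""
    return [doc_id for doc_id in vector_index if doc_id not in source_docs]

# Hypothetical current state of the source data.
source_docs = {
    "policy-1": "Returns are accepted within 30 days.",
    "policy-2": "Shipping is free over $50.",
    "policy-4": "Gift cards never expire.",
}

# Hypothetical vector index state (embeddings omitted for brevity).
vector_index = {
    "policy-1": {"content_hash": content_hash("Returns are accepted within 30 days.")},
    "policy-2": {"content_hash": content_hash("Shipping is free over $25.")},
    "policy-3": {"content_hash": content_hash("Old policy text.")},
}

print("re-embed:", find_stale(source_docs, vector_index))
print("delete:", find_orphaned(source_docs, vector_index))
```

Storing embeddings in the same database as the source documents reduces the need for this kind of reconciliation, since both can be updated together.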

These challenges notwithstanding, RAG affords organizations a straightforward and powerful means of tapping into their operational and application data to glean rich insights and inform critical business decisions.

MongoDB Atlas for GenAI-powered apps

We’ve touched on the transformative potential of generative AI, and we’ve seen the powerful enhancement of real-time data that comes with RAG. Bringing these technologies together requires a flexible data platform that offers a suite of features tailored for GenAI-powered applications. For organizations venturing into the world of generative AI and RAG, MongoDB Atlas will be the game-changer.

The core features of MongoDB Atlas include:

  • Native vector search capabilities: Native vector storage and search are built into MongoDB Atlas, ensuring quick and efficient data retrieval for RAG without the need for an additional database to handle vectors.
  • Unified API and flexible document model: The unified API from MongoDB Atlas allows developers to combine vector search with other query capabilities, like structured search or text search. This, coupled with MongoDB’s document data model, brings incredible flexibility to your implementation.
  • Scalability, reliability, and security: MongoDB Atlas provides horizontal scaling to easily grow as you (and your data) grow. With fault tolerance and simple horizontal and vertical scaling, MongoDB Atlas ensures uninterrupted service regardless of your workload demands. And, of course, MongoDB shows how it prioritizes security by enabling industry-leading data encryption that is queryable.
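As a hedged sketch of combining vector search with a structured filter, the function below builds an aggregation pipeline that uses the Atlas `$vectorSearch` stage. The index name `vector_index`, the field names `embedding`, `text`, and `category`, and the `numCandidates` multiplier are all assumptions for illustration. In a real deployment they must match your own vector search index definition, and any filtered field must be declared as a filter field in that index.

```python
def build_vector_search_pipeline(query_vector, category=None, limit=5):
    """Build an aggregation pipeline for an Atlas Vector Search query."""
    vector_stage = {
        "index": "vector_index",      # assumed index name
        "path": "embedding",          # assumed field holding the vectors
        "queryVector": query_vector,
        "numCandidates": limit * 20,  # assumed oversampling factor
        "limit": limit,
    }
    if category is not None:
        # Structured pre-filter applied together with the vector search.
        vector_stage["filter"] = {"category": {"$eq": category}}
    return [
        {"$vectorSearch": vector_stage},
        {
            "$project": {
                "_id": 0,
                "text": 1,
                "category": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]

# Toy query vector; a real one would come from an embedding model.
pipeline = build_vector_search_pipeline([0.12, 0.87, 0.45], category="policy", limit=3)
print(pipeline)
```

With a driver such as PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on a collection that has a matching vector search index.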

An image of MongoDB Atlas performing multiple tasks to help generative AI powered apps.

MongoDB Atlas is pivotal in simplifying the implementation of a RAG-enhanced LLM system. By handling your generative AI data services, MongoDB streamlines your process of building enterprise-ready, GenAI-powered apps. Whether the data you wish to incorporate is proprietary data or up-to-the-minute event data, MongoDB makes the RAG approach achievable. In a recent state of AI survey of 1,500 respondents, MongoDB Atlas Vector Search commanded the highest developer satisfaction amongst all vector solutions.

Conclusion

As a subset of artificial intelligence, generative AI — which uses models trained on vast amounts of existing content to create new, unique content — represents a transformative leap in modern technology. However, for generative AI to deliver on the promise to mimic human intelligence and creativity, it must be trained on large volumes of high-quality data. The effectiveness of a generative AI model depends on the quality, quantity, and diversity of its training data.

The data available to an LLM is bounded by the last training update for that LLM. Model re-training or fine-tuning cannot keep that data current, because as soon as those processes are complete, the data starts going out of date. The solution is RAG, which queries a vector database for up-to-date data as part of the prompt engineering task. RAG enhances LLMs by giving them access to current, relevant information (which can include an organization’s proprietary information) without the resource-intensiveness of training or fine-tuning.

To make this possible, enterprises are looking to MongoDB Atlas. Its native vector search capabilities, coupled with its unified API and flexible document model, make it an attractive option for businesses looking to enhance an LLM with the RAG approach to pull in proprietary data.
