Get Started with the Semantic Kernel Python Integration
On this page
Note
This tutorial uses the Semantic Kernel Python library. For a tutorial that uses the C# library, see Get Started with the Semantic Kernel C# Integration.
You can integrate Atlas Vector Search with Microsoft Semantic Kernel to build AI applications and implement retrieval-augmented generation (RAG). This tutorial demonstrates how to start using Atlas Vector Search with Semantic Kernel to perform semantic search on your data and build a RAG implementation. Specifically, you perform the following actions:
Set up the environment.
Store custom data on Atlas.
Create an Atlas Vector Search index on your data.
Run a semantic search query on your data.
Implement RAG by using Atlas Vector Search to answer questions on your data.
Background
Semantic Kernel is an open-source SDK that allows you to combine various AI services and plugins with your applications. You can use Semantic Kernel for a variety of AI use cases, including RAG.
By integrating Atlas Vector Search with Semantic Kernel, you can use Atlas as a vector database and use Atlas Vector Search to implement RAG by retrieving semantically similar documents from your data. To learn more about RAG, see Retrieval-Augmented Generation (RAG) with Atlas Vector Search.
Prerequisites
To complete this tutorial, you must have the following:
An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs). Ensure that your IP address is included in your Atlas project's access list.
An OpenAI API Key. You must have a paid OpenAI account with credits available for API requests.
An environment to run interactive Python notebooks such as Colab.
Set Up the Environment
You must first set up the environment for this tutorial.
Create an interactive Python notebook by saving a file
with the .ipynb
extension, and then run the
following code snippets in the notebook.
Install and import dependencies.
Run the following command in your notebook to install the semantic kernel in your environment.
pip install --quiet --upgrade semantic-kernel openai motor Run the following code to import the required packages:
import getpass, openai import semantic_kernel as sk from semantic_kernel.connectors.ai.open_ai import (OpenAIChatCompletion, OpenAITextEmbedding) from semantic_kernel.connectors.memory.mongodb_atlas import MongoDBAtlasMemoryStore from semantic_kernel.core_plugins.text_memory_plugin import TextMemoryPlugin from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory from semantic_kernel.prompt_template.input_variable import InputVariable from semantic_kernel.prompt_template.prompt_template_config import PromptTemplateConfig
Define environmental variables.
Run the following code and provide the following when prompted:
Your OpenAI API Key.
Your Atlas cluster's SRV connection string.
OPENAI_API_KEY = getpass.getpass("OpenAI API Key:") ATLAS_CONNECTION_STRING = getpass.getpass("MongoDB Atlas SRV Connection String:")
Note
Your connection string should use the following format:
mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
Store Custom Data in Atlas
In this section, you initialize the kernel, which is the main interface used to manage your application's services and plugins. Through the kernel, you configure your AI services, instantiate Atlas as a vector database (also called a memory store), and load custom data into your Atlas cluster.
To store custom data in Atlas, paste and run the following code snippets in your notebook:
Add the AI services to the kernel.
Run the following code to configure the OpenAI embedding model and chat model used in this tutorial and add these services to the kernel. This code specifies the following:
OpenAI's
text-embedding-ada-002
as the embedding model used to convert text into vector embeddings.OpenAI's
gpt-3.5-turbo
as the chat model used to generate responses.
chat_service = OpenAIChatCompletion( service_id="chat", ai_model_id="gpt-3.5-turbo", api_key=OPENAI_API_KEY ) embedding_service = OpenAITextEmbedding( ai_model_id="text-embedding-ada-002", api_key=OPENAI_API_KEY ) kernel.add_service(chat_service) kernel.add_service(embedding_service)
Instantiate Atlas as a memory store.
Run the following code to instantiate Atlas as a memory store and add it to the kernel. This code establishes a connection to your Atlas cluster and specifies the following:
semantic_kernel_db
as the Atlas database used to store the documents.vector_index
as the index used to run semantic search queries.
It also imports a plugin
called TextMemoryPlugin
, which provides a group of native functions
to help you store and retrieve text in memory.
mongodb_atlas_memory_store = MongoDBAtlasMemoryStore( connection_string=ATLAS_CONNECTION_STRING, database_name="semantic_kernel_db", index_name="vector_index" ) memory = SemanticTextMemory( storage=mongodb_atlas_memory_store, embeddings_generator=embedding_service ) kernel.add_plugin(TextMemoryPlugin(memory), "TextMemoryPlugin")
Load sample data on your Atlas cluster.
This code defines and runs a function to populate the semantic_kernel_db.test
collection with some sample documents. These documents
contain personalized data that the LLM did not originally have access to.
async def populate_memory(kernel: sk.Kernel) -> None: await memory.save_information( collection="test", id="1", text="I am a developer" ) await memory.save_information( collection="test", id="2", text="I started using MongoDB two years ago" ) await memory.save_information( collection="test", id="3", text="I'm using MongoDB Vector Search with Semantic Kernel to implement RAG" ) await memory.save_information( collection="test", id="4", text="I like coffee" ) print("Populating memory...") await populate_memory(kernel) print(kernel)
Populating memory... plugins=KernelPluginCollection(plugins={'TextMemoryPlugin': KernelPlugin(name='TextMemoryPlugin', description=None, functions={'recall': KernelFunctionFromMethod(metadata=KernelFunctionMetadata(name='recall', plugin_name='TextMemoryPlugin', description='Recall a fact from the long term memory', parameters=[KernelParameterMetadata(name='ask', description='The information to retrieve', default_value=None, type_='str', is_required=True, type_object=<class 'str'>), KernelParameterMetadata(name='collection', description='The collection to search for information.', default_value='generic', type_='str', is_required=False, type_object=<class 'str'>), KernelParameterMetadata(name='relevance', description='The relevance score, from 0.0 to 1.0; 1.0 means perfect match', default_value=0.75, type_='float', is_required=False, type_object=<class 'float'>), KernelParameterMetadata(name='limit', description='The maximum number of relevant memories to recall.', default_value=1, type_='int', is_required=False, type_object=<class 'int'>)], is_prompt=False, is_asynchronous=True, return_parameter=KernelParameterMetadata(name='return', description='', default_value=None, type_='str', is_required=True, type_object=None)), method=<bound method TextMemoryPlugin.recall of TextMemoryPlugin(memory=SemanticTextMemory())>, stream_method=None), 'save': KernelFunctionFromMethod(metadata=KernelFunctionMetadata(name='save', plugin_name='TextMemoryPlugin', description='Save information to semantic memory', parameters=[KernelParameterMetadata(name='text', description='The information to save.', default_value=None, type_='str', is_required=True, type_object=<class 'str'>), KernelParameterMetadata(name='key', description='The unique key to associate with the information.', default_value=None, type_='str', is_required=True, type_object=<class 'str'>), KernelParameterMetadata(name='collection', description='The collection to save the information.', default_value='generic', type_='str', is_required=False, type_object=<class 'str'>)], is_prompt=False, is_asynchronous=True, return_parameter=KernelParameterMetadata(name='return', description='', default_value=None, type_='', is_required=True, type_object=None)), method=<bound method TextMemoryPlugin.save of TextMemoryPlugin(memory=SemanticTextMemory())>, stream_method=None)})}) services={'chat': OpenAIChatCompletion(ai_model_id='gpt-3.5-turbo', service_id='chat', client=<openai.AsyncOpenAI object at 0x7999971c8fa0>, ai_model_type=<OpenAIModelTypes.CHAT: 'chat'>, prompt_tokens=0, completion_tokens=0, total_tokens=0), 'text-embedding-ada-002': OpenAITextEmbedding(ai_model_id='text-embedding-ada-002', service_id='text-embedding-ada-002', client=<openai.AsyncOpenAI object at 0x7999971c8fd0>, ai_model_type=<OpenAIModelTypes.EMBEDDING: 'embedding'>, prompt_tokens=32, completion_tokens=0, total_tokens=32)} ai_service_selector=<semantic_kernel.services.ai_service_selector.AIServiceSelector object at 0x7999971cad70> retry_mechanism=PassThroughWithoutRetry() function_invoking_handlers={} function_invoked_handlers={}
Tip
After running the sample code, you can
view your vector embeddings in the Atlas UI
by navigating to the semantic_kernel_db.test
collection in your cluster.
Create the Atlas Vector Search Index
To enable vector search queries on your vector store,
create an Atlas Vector Search index on the semantic_kernel_db.test
collection.
Required Access
To create an Atlas Vector Search index, you must have Project Data Access Admin
or higher access to the Atlas project.
Procedure
In Atlas, go to the Clusters page for your project.
If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it's not already displayed, select your desired project from the Projects menu in the navigation bar.
If the Clusters page is not already displayed, click Database in the sidebar.
The Clusters page displays.
Go to the Atlas Search page for your cluster.
You can go the Atlas Search page from the sidebar, the Data Explorer, or your cluster details page.
In the sidebar, click Atlas Search under the Services heading.
From the Select data source dropdown, select your cluster and click Go to Atlas Search.
The Atlas Search page displays.
Click the Browse Collections button for your cluster.
Expand the database and select the collection.
Click the Search Indexes tab for the collection.
The Atlas Search page displays.
Click the cluster's name.
Click the Atlas Search tab.
The Atlas Search page displays.
Define the Atlas Vector Search index.
Click Create Search Index.
Under Atlas Vector Search, select JSON Editor and then click Next.
In the Database and Collection section, find the
semantic_kernel_db
database, and select thetest
collection.In the Index Name field, enter
vector_index
.Replace the default definition with the following index definition and then click Next.
This index definition specifies indexing the following fields in an index of the vectorSearch type:
embedding
field as the vector type. Theembedding
field contains the embeddings created using OpenAI'stext-embedding-ada-002
embedding model. The index definition specifies1536
vector dimensions and measures similarity usingcosine
.
1 { 2 "fields": [ 3 { 4 "type": "vector", 5 "path": "embedding", 6 "numDimensions": 1536, 7 "similarity": "cosine" 8 } 9 ] 10 }
Run Vector Search Queries
Once Atlas builds your index, you can run vector search queries on your data.
In your notebook, run the following code to perform a basic semantic
search for the string What is my job title?
. It prints the most
relevant document and a relevance score between
0
and 1
.
result = await memory.search("test", "What is my job title?") print(f"Retrieved document: {result[0].text}, {result[0].relevance}")
Retrieved document: I am a developer, 0.8991971015930176
Answer Questions on Your Data
This section shows an example RAG implementation with Atlas Vector Search and Semantic Kernel. Now that you've used Atlas Vector Search to retrieve semantically similar documents, run the following code example to prompt the LLM to answer questions based on those documents.
The following code defines a prompt
to instruct the LLM to use the retrieved document as context for your query.
In this example, you prompt the LLM with the sample query
When did I start using MongoDB?
. Because you augmented
the knowledge base of the LLM with custom data,
the chat model is able to generate a more accurate,
context-aware response.
service_id = "chat" settings = kernel.get_service(service_id).instantiate_prompt_execution_settings( service_id=service_id ) prompt_template = """ Answer the following question based on the given context. Question: {{$input}} Context: {{$context}} """ chat_prompt_template_config = PromptTemplateConfig( execution_settings=settings, input_variables=[ InputVariable(name="input"), InputVariable(name="context") ], template=prompt_template ) prompt = kernel.add_function( function_name="RAG", plugin_name="TextMemoryPlugin", prompt_template_config=chat_prompt_template_config, ) question = "When did I start using MongoDB?" results = await memory.search("test", question) retrieved_document = results[0].text answer = await prompt.invoke( kernel=kernel, input=question, context=retrieved_document ) print(answer)
You started using MongoDB two years ago.
Next Steps
MongoDB also provides the following developer resources: