Get Started with the Semantic Kernel Python Integration
Note
This tutorial uses the Semantic Kernel Python library. For a tutorial that uses the C# library, see Get Started with the Semantic Kernel C# Integration.
You can integrate Atlas Vector Search with Microsoft Semantic Kernel to build AI applications and implement retrieval-augmented generation (RAG). This tutorial demonstrates how to start using Atlas Vector Search with Semantic Kernel to perform semantic search on your data and build a RAG implementation. Specifically, you perform the following actions:
Set up the environment.
Store custom data on Atlas.
Create an Atlas Vector Search index on your data.
Run a semantic search query on your data.
Implement RAG by using Atlas Vector Search to answer questions on your data.
Background
Semantic Kernel is an open-source SDK that allows you to combine various AI services and plugins with your applications. You can use Semantic Kernel for a variety of AI use cases, including RAG.
By integrating Atlas Vector Search with Semantic Kernel, you can use Atlas as a vector database and use Atlas Vector Search to implement RAG by retrieving semantically similar documents from your data. To learn more about RAG, see Retrieval-Augmented Generation (RAG) with Atlas Vector Search.
Prerequisites
To complete this tutorial, you must have the following:
An Atlas account with a cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs). Ensure that your IP address is included in your Atlas project's access list. To learn more, see Create a Cluster.
An OpenAI API Key. You must have a paid OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
An environment to run interactive Python notebooks such as Colab.
Note
If you're using Colab, ensure that your notebook session's IP address is included in your Atlas project's access list.
Set Up the Environment
Set up the environment for this tutorial.
Create an interactive Python notebook by saving a file
with the .ipynb
extension. This notebook allows you to
run Python code snippets individually, and you'll use
it to run the code in this tutorial.
To set up your notebook environment:
Install and import dependencies.
Run the following command in your notebook to install the semantic kernel in your environment.
pip install --quiet --upgrade semantic-kernel openai motor
Run the following code to import the required packages:
import getpass, openai
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import (OpenAIChatCompletion, OpenAITextEmbedding)
from semantic_kernel.connectors.memory.mongodb_atlas import MongoDBAtlasMemoryStore
from semantic_kernel.core_plugins.text_memory_plugin import TextMemoryPlugin
from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory
from semantic_kernel.prompt_template.input_variable import InputVariable
from semantic_kernel.prompt_template.prompt_template_config import PromptTemplateConfig
from pymongo import MongoClient
from pymongo.operations import SearchIndexModel
Define environment variables.
Run the following code and provide the following when prompted:
Your OpenAI API Key.
Your Atlas cluster's SRV connection string.
OPENAI_API_KEY = getpass.getpass("OpenAI API Key:")
ATLAS_CONNECTION_STRING = getpass.getpass("MongoDB Atlas SRV Connection String:")
Note
Your connection string should use the following format:
mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
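If you want to confirm that your connection string works before continuing, you can optionally ping the cluster with pymongo, which you imported in the previous step. This check is not part of the original tutorial steps; it's a minimal sketch that only verifies connectivity:

# Optional: verify that the connection string reaches your Atlas cluster
client = MongoClient(ATLAS_CONNECTION_STRING)
client.admin.command("ping")  # raises an exception if the cluster is unreachable
client.close()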
Store Custom Data in Atlas
In this section, you initialize the kernel, which is the main interface used to manage your application's services and plugins. Through the kernel, you configure your AI services, instantiate Atlas as a vector database (also called a memory store), and load custom data into your Atlas cluster.
To store custom data in Atlas, paste and run the following code snippets in your notebook:
Add the AI services to the kernel.
Run the following code to configure the OpenAI embedding model and chat model used in this tutorial and add these services to the kernel. This code specifies the following:
OpenAI's text-embedding-ada-002 as the embedding model used to convert text into vector embeddings.
OpenAI's gpt-3.5-turbo as the chat model used to generate responses.
# Initialize the kernel
kernel = sk.Kernel()

chat_service = OpenAIChatCompletion(
    service_id="chat",
    ai_model_id="gpt-3.5-turbo",
    api_key=OPENAI_API_KEY
)
embedding_service = OpenAITextEmbedding(
    ai_model_id="text-embedding-ada-002",
    api_key=OPENAI_API_KEY
)
kernel.add_service(chat_service)
kernel.add_service(embedding_service)
Instantiate Atlas as a memory store.
Run the following code to instantiate Atlas as a memory store and add it to the kernel. This code establishes a connection to your Atlas cluster and specifies the following:
semantic_kernel_db as the Atlas database used to store the documents.
vector_index as the index used to run semantic search queries.
It also imports a plugin called TextMemoryPlugin, which provides a group of native functions to help you store and retrieve text in memory.
mongodb_atlas_memory_store = MongoDBAtlasMemoryStore(
    connection_string=ATLAS_CONNECTION_STRING,
    database_name="semantic_kernel_db",
    index_name="vector_index"
)
memory = SemanticTextMemory(
    storage=mongodb_atlas_memory_store,
    embeddings_generator=embedding_service
)
kernel.add_plugin(TextMemoryPlugin(memory), "TextMemoryPlugin")
Load sample data on your Atlas cluster.
This code defines and runs a function to populate the semantic_kernel_db.test
collection with some sample documents. These documents
contain personalized data that the LLM did not originally have access to.
async def populate_memory(kernel: sk.Kernel) -> None:
    await memory.save_information(
        collection="test", id="1", text="I am a developer"
    )
    await memory.save_information(
        collection="test", id="2", text="I started using MongoDB two years ago"
    )
    await memory.save_information(
        collection="test", id="3", text="I'm using MongoDB Vector Search with Semantic Kernel to implement RAG"
    )
    await memory.save_information(
        collection="test", id="4", text="I like coffee"
    )

print("Populating memory...")
await populate_memory(kernel)

print(kernel)
Populating memory... plugins=KernelPluginCollection(plugins={'TextMemoryPlugin': KernelPlugin(name='TextMemoryPlugin', description=None, functions={'recall': KernelFunctionFromMethod(metadata=KernelFunctionMetadata(name='recall', plugin_name='TextMemoryPlugin', description='Recall a fact from the long term memory', parameters=[KernelParameterMetadata(name='ask', description='The information to retrieve', default_value=None, type_='str', is_required=True, type_object=<class 'str'>), KernelParameterMetadata(name='collection', description='The collection to search for information.', default_value='generic', type_='str', is_required=False, type_object=<class 'str'>), KernelParameterMetadata(name='relevance', description='The relevance score, from 0.0 to 1.0; 1.0 means perfect match', default_value=0.75, type_='float', is_required=False, type_object=<class 'float'>), KernelParameterMetadata(name='limit', description='The maximum number of relevant memories to recall.', default_value=1, type_='int', is_required=False, type_object=<class 'int'>)], is_prompt=False, is_asynchronous=True, return_parameter=KernelParameterMetadata(name='return', description='', default_value=None, type_='str', is_required=True, type_object=None)), method=<bound method TextMemoryPlugin.recall of TextMemoryPlugin(memory=SemanticTextMemory())>, stream_method=None), 'save': KernelFunctionFromMethod(metadata=KernelFunctionMetadata(name='save', plugin_name='TextMemoryPlugin', description='Save information to semantic memory', parameters=[KernelParameterMetadata(name='text', description='The information to save.', default_value=None, type_='str', is_required=True, type_object=<class 'str'>), KernelParameterMetadata(name='key', description='The unique key to associate with the information.', default_value=None, type_='str', is_required=True, type_object=<class 'str'>), KernelParameterMetadata(name='collection', description='The collection to save the information.', default_value='generic', type_='str', is_required=False, type_object=<class 'str'>)], is_prompt=False, is_asynchronous=True, return_parameter=KernelParameterMetadata(name='return', description='', default_value=None, type_='', is_required=True, type_object=None)), method=<bound method TextMemoryPlugin.save of TextMemoryPlugin(memory=SemanticTextMemory())>, stream_method=None)})}) services={'chat': OpenAIChatCompletion(ai_model_id='gpt-3.5-turbo', service_id='chat', client=<openai.AsyncOpenAI object at 0x7999971c8fa0>, ai_model_type=<OpenAIModelTypes.CHAT: 'chat'>, prompt_tokens=0, completion_tokens=0, total_tokens=0), 'text-embedding-ada-002': OpenAITextEmbedding(ai_model_id='text-embedding-ada-002', service_id='text-embedding-ada-002', client=<openai.AsyncOpenAI object at 0x7999971c8fd0>, ai_model_type=<OpenAIModelTypes.EMBEDDING: 'embedding'>, prompt_tokens=32, completion_tokens=0, total_tokens=32)} ai_service_selector=<semantic_kernel.services.ai_service_selector.AIServiceSelector object at 0x7999971cad70> retry_mechanism=PassThroughWithoutRetry() function_invoking_handlers={} function_invoked_handlers={}
Tip
After running the sample code, you can
view your vector embeddings in the Atlas UI
by navigating to the semantic_kernel_db.test
collection in your cluster.
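If you prefer to verify the stored data from your notebook instead of the Atlas UI, the following optional sketch uses pymongo to fetch one stored document. It assumes the memory store writes the vector to a field named embedding, which matches the path used in the index definition in the next section; the exact document shape can vary by library version.

# Optional: inspect one stored document and its embedding
client = MongoClient(ATLAS_CONNECTION_STRING)
collection = client["semantic_kernel_db"]["test"]

doc = collection.find_one()
print(list(doc.keys()))        # field names written by the memory store
print(len(doc["embedding"]))   # expected to be 1536 for text-embedding-ada-002 (assumes the field is named "embedding")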
Create the Atlas Vector Search Index
Note
To create an Atlas Vector Search index, you must have Project Data Access Admin
or higher access to the Atlas project.
To enable vector search queries on your vector store,
create an Atlas Vector Search index on the semantic_kernel_db.test
collection.
In Atlas, go to the Clusters page for your project.
If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it's not already displayed, select your desired project from the Projects menu in the navigation bar.
If it's not already displayed, click Clusters in the sidebar.
The Clusters page displays.
Go to the Atlas Search page for your cluster.
You can go to the Atlas Search page from the sidebar, the Data Explorer, or your cluster details page.
In the sidebar, click Atlas Search under the Services heading.
From the Select data source dropdown, select your cluster and click Go to Atlas Search.
The Atlas Search page displays.
Click the Browse Collections button for your cluster.
Expand the database and select the collection.
Click the Search Indexes tab for the collection.
The Atlas Search page displays.
Click the cluster's name.
Click the Atlas Search tab.
The Atlas Search page displays.
Define the Atlas Vector Search index.
Click Create Search Index.
Under Atlas Vector Search, select JSON Editor and then click Next.
In the Database and Collection section, find the semantic_kernel_db database, and select the test collection.
In the Index Name field, enter vector_index.
Replace the default definition with the following index definition and then click Next.
This index definition specifies indexing the following field in an index of the vectorSearch type:
The embedding field as the vector type. The embedding field contains the embeddings created using OpenAI's text-embedding-ada-002 embedding model. The index definition specifies 1536 vector dimensions and measures similarity using cosine.
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}
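Alternatively, because the first step imports SearchIndexModel from pymongo, you can create the same index programmatically instead of through the Atlas UI. The following is a minimal sketch, assuming pymongo 4.7 or later and a cluster tier that supports creating search indexes from the driver. The index can take a minute or so to build before queries return results.

client = MongoClient(ATLAS_CONNECTION_STRING)
collection = client["semantic_kernel_db"]["test"]

# Same definition as the JSON above
search_index_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": 1536,
                "similarity": "cosine"
            }
        ]
    },
    name="vector_index",
    type="vectorSearch",
)
collection.create_search_index(model=search_index_model)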
Run Vector Search Queries
Once Atlas builds your index, you can run vector search queries on your data.
In your notebook, run the following code to perform a basic semantic search for the string What is my job title?. It prints the most relevant document and a relevance score between 0 and 1.
result = await memory.search("test", "What is my job title?")
print(f"Retrieved document: {result[0].text}, {result[0].relevance}")
Retrieved document: I am a developer, 0.8991971015930176
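By default, memory.search returns only the single best match. If you want several candidate documents at once, recent semantic-kernel releases also accept limit and min_relevance_score parameters; the following sketch assumes that signature and is not part of the original tutorial:

# Assumption: limit and min_relevance_score are keyword parameters of memory.search
results = await memory.search(
    "test",
    "What do I like to drink?",
    limit=2,
    min_relevance_score=0.5,
)
for r in results:
    print(f"{r.text} (score: {r.relevance})")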
Answer Questions on Your Data
This section shows an example RAG implementation with Atlas Vector Search and Semantic Kernel. Now that you've used Atlas Vector Search to retrieve semantically similar documents, run the following code example to prompt the LLM to answer questions based on those documents.
The following code defines a prompt to instruct the LLM to use the retrieved document as context for your query. In this example, you prompt the LLM with the sample query When did I start using MongoDB?. Because you augmented the knowledge base of the LLM with custom data, the chat model is able to generate a more accurate, context-aware response.
service_id = "chat" settings = kernel.get_service(service_id).instantiate_prompt_execution_settings( service_id=service_id ) prompt_template = """ Answer the following question based on the given context. Question: {{$input}} Context: {{$context}} """ chat_prompt_template_config = PromptTemplateConfig( execution_settings=settings, input_variables=[ InputVariable(name="input"), InputVariable(name="context") ], template=prompt_template ) prompt = kernel.add_function( function_name="RAG", plugin_name="TextMemoryPlugin", prompt_template_config=chat_prompt_template_config, ) question = "When did I start using MongoDB?" results = await memory.search("test", question) retrieved_document = results[0].text answer = await prompt.invoke( kernel=kernel, input=question, context=retrieved_document ) print(answer)
You started using MongoDB two years ago.
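Because the RAG function is registered on the kernel, you can reuse it for other questions by retrieving a new context document first. The following optional sketch applies the same pattern to a different question about the sample data; the exact wording of the model's response will vary.

question = "What do I do for work?"
results = await memory.search("test", question)

answer = await prompt.invoke(
    kernel=kernel,
    input=question,
    context=results[0].text
)
print(answer)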
Next Steps
MongoDB also provides the following developer resources: