Getting Started with Microsoft's Semantic Kernel in C# and MongoDB Atlas

Luce Carter • 10 min read • Published Aug 05, 2024 • Updated Oct 10, 2024
Semantic Kernel has become hugely popular within the Microsoft ecosystem. In fact, at Microsoft Build, Semantic Kernel and AI with MongoDB was the most discussed topic at our booth.
Semantic Kernel is Microsoft’s AI SDK, available in Java, Python, and C#. It allows you to build powerful AI applications by chaining together out-of-the-box, community-created, and custom plugins. These plugins work together to create plans that allow you to achieve complex tasks. This could be anything from tidying up Scott Hanselman’s desktop to summarizing a block of text and emailing you the summary. The possibilities are endless!
Semantic Kernel is a tool for building retrieval-augmented generation (RAG) apps. The R and A parts come from retrieving information to use as context in the input to the large language model (LLM). This is where MongoDB comes in. MongoDB is an option for storing data, including embeddings representing that data, and even gives you the ability to search the data using Atlas Vector Search.
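The retrieve-then-augment idea described above can be sketched in a few lines of plain C#. This is only an illustration of the concept: RagPromptSketch and BuildPrompt are hypothetical names, not part of Semantic Kernel, and the bot built in this tutorial performs the retrieval step only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of Semantic Kernel.
public static class RagPromptSketch
{
    // Combines retrieved snippets (the retrieval step) with the user's question
    // (the augmentation step) into one prompt string that could be sent to an LLM.
    public static string BuildPrompt(string question, IEnumerable<string> retrievedSnippets)
    {
        // Render each retrieved snippet as its own bullet line.
        var context = string.Join("\n", retrievedSnippets.Select(s => "- " + s));

        return "Answer using only the context below.\n"
             + "Context:\n" + context + "\n"
             + "Question: " + question;
    }
}
```

In a full RAG application, the snippets would come from a vector search such as the one we build later, and the resulting prompt would be passed to a chat completion model.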
Semantic Kernel has support for MongoDB Atlas thanks to a connector. So not only can you store your data in MongoDB, including the embeddings, but it also automatically uses Vector Search under the hood to retrieve the results. You get the best of Semantic Kernel and the best of MongoDB, the most popular document database for C# developers!
In this tutorial, you will learn how to get started with Semantic Kernel and MongoDB. Taking advantage of the connector and the SemanticTextMemory plugin, you will create a bot that recommends a movie to watch, using OpenAI to create the embeddings and searching the movie data in our sample dataset.
Prefer to learn via video content? Then watch the video version available on YouTube!

Prerequisites

To follow along with this tutorial, you will need a few things in place:
  • A MongoDB M0 cluster
  • The sample data loaded into that cluster
  • A free OpenAI account and project API key
  • .NET 8 or higher
  • An IDE or text editor to follow along
If you would prefer to simply read the code, you can find it on GitHub. It has two branches, depending on whether you have access to Azure OpenAI or want to use OpenAI. We will be using OpenAI for this tutorial as it is free and open to all at time of writing.

Creating the project

Now that you have the prerequisites in place, it is time to create the project and add the NuGet packages you will need to create the bot.
  1. Create a new console project, either using your IDE or via the .NET CLI.
  2. Add the following NuGet packages to your new project:
    • Microsoft.SemanticKernel
    • Microsoft.SemanticKernel.Connectors.MongoDB (N.B. This is in prerelease)
    • Microsoft.SemanticKernel.Connectors.OpenAI

Setting up our configuration

There are a few variables we are going to need throughout this tutorial so we will start by setting them up in Program.cs.
Because we want to create at least one other method in this tutorial, we will also switch to the traditional structure of our program class. Replace the contents with the following:
using Microsoft.SemanticKernel.Memory;

#pragma warning disable SKEXP0001, SKEXP0010, SKEXP0020, SKEXP0050
public static class Program {
    static string TextEmbeddingModelName = "text-embedding-ada-002";
    static string OpenAIAPIKey = "<YOUR OPENAI PROJECT API KEY>";

    static string MongoDBAtlasConnectionString = "<YOUR ATLAS CONNECTION STRING>";
    static string SearchIndexName = "default";
    static string DatabaseName = "semantic-kernel";
    static string CollectionName = "movies";
    static MemoryBuilder memoryBuilder;

    public static async Task Main(string[] args) {

    }
}
The #pragma warning disable line is there because many of the Semantic Kernel features we use are marked as experimental. Without it, the compiler reports the SKEXP diagnostics as errors and the project will not build.
Go ahead and replace the placeholders for OpenAI and Atlas with your own values.
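If you would rather not keep secrets in source code, one option is to read them from environment variables instead. The following is a minimal sketch under that assumption: the Settings class and the variable names OPENAI_API_KEY and MONGODB_ATLAS_URI are hypothetical and are not used anywhere else in this tutorial.

```csharp
using System;

// Hypothetical helper for illustration; not part of the tutorial's repository.
public static class Settings
{
    // Returns the value of an environment variable,
    // or throws a descriptive error if it is missing or blank.
    public static string GetRequired(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException($"Environment variable '{name}' is not set.");
        }
        return value;
    }
}

// Possible usage inside the Program class (variable names are assumptions):
//   static string OpenAIAPIKey = Settings.GetRequired("OPENAI_API_KEY");
//   static string MongoDBAtlasConnectionString = Settings.GetRequired("MONGODB_ATLAS_URI");
```

Failing fast with a named variable in the error message tends to be easier to debug than an authentication failure later on.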

Setting up the memory plugin and memory store

You may have noticed in the last section that you added a MemoryBuilder variable. This builder is what gives you access to the memory plugin, an out-of-the-box plugin for working with stored data.
So now we are going to configure this plugin, use this builder, and also connect it to MongoDB Atlas as our memory store.
Paste the following code inside your Main method:
memoryBuilder = new MemoryBuilder();

memoryBuilder.WithOpenAITextEmbeddingGeneration(
    TextEmbeddingModelName,
    OpenAIAPIKey
);
MemoryBuilder comes with some helper methods. In this case, we are using WithOpenAITextEmbeddingGeneration, which helps you configure the memory plugin.
Because we are working with text in this project, we need to be able to generate text embeddings for our data to be used in the search. This is where OpenAI comes in. By passing this method the name of the model we want to use and the OpenAI API key, the plugin has all it needs to automatically take care of the rest for us under the hood — excellent!
Ensure the following using statements are present in the file:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.MongoDB;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel.Memory;
using MongoDB.Driver;
using Kernel = Microsoft.SemanticKernel.Kernel;
Using a database that supports vectors and vector searches, such as MongoDB Atlas, is a key part of adding the retrieval and augmentation parts to your RAG applications.
Semantic Kernel’s MongoDB Connector adds support for not only using MongoDB as your data store for your embeddings, but it also uses MongoDB’s vector search capabilities to carry out the search.
Paste the following code after the previous, inside your Main method:
var mongoDBMemoryStore = new MongoDBMemoryStore(MongoDBAtlasConnectionString, DatabaseName, SearchIndexName);
memoryBuilder.WithMemoryStore(mongoDBMemoryStore);
var memory = memoryBuilder.Build();
Just like that, with a few lines of code, we have the memory plugin set up and it is configured to use MongoDB.

Adding documents to our memory store

MongoDB’s sample data comes with different databases and collections for a variety of use cases. One of the recent changes was to the sample_mflix database. This database has been around in the sample data for a long time but we recently added a new collection inside the database called embedded_movies. You may have noticed that already if you have browsed your new cluster. This collection contains vector embeddings on the plot field from a large number of documents from the movies collection and makes it much easier for developers to experience MongoDB’s Atlas Vector Search in a variety of programming languages.
In an ideal world, we would use this collection with Semantic Kernel. Unfortunately, there is a limitation with Semantic Kernel on the name of the field containing the embeddings value as well as the shape of the documents it can use. So for this reason, for the sake of this tutorial, we are going to import some documents from our sample_mflix database and save them in a new collection, using Semantic Kernel. This will generate the embeddings automatically using OpenAI, and save them in the format that Semantic Kernel can use later.
First, we need to create a model that represents the movie document. So create a new Movie.cs class in your project and paste in the following:
public class Movie
{
    [BsonId]
    [BsonRepresentation(BsonType.ObjectId)]
    public string Id { get; set; }

    [BsonElement("plot")]
    public string Plot { get; set; }

    [BsonElement("genres")]
    public List<string> Genres { get; set; }

    [BsonElement("runtime")]
    public int Runtime { get; set; }

    [BsonElement("cast")]
    public List<string> Cast { get; set; }

    [BsonElement("num_mflix_comments")]
    public int NumMflixComments { get; set; }

    [BsonElement("poster")]
    public string Poster { get; set; }

    [BsonElement("title")]
    public string Title { get; set; }

    [BsonElement("fullplot")]
    public string Fullplot { get; set; }

    [BsonElement("languages")]
    public List<string> Languages { get; set; }

    [BsonElement("released")]
    public DateTime Released { get; set; }

    [BsonElement("directors")]
    public List<string> Directors { get; set; }

    [BsonElement("writers")]
    public List<string> Writers { get; set; }

    [BsonElement("awards")]
    public Awards Awards { get; set; }

    [BsonElement("rated")]
    public string? Rated { get; set; }

    [BsonElement("lastupdated")]
    public string Lastupdated { get; set; }

    [BsonElement("year")]
    public object Year { get; set; }

    [BsonElement("imdb")]
    public Imdb Imdb { get; set; }

    [BsonElement("countries")]
    public List<string> Countries { get; set; }

    [BsonElement("type")]
    public string Type { get; set; }

    [BsonElement("tomatoes")]
    public Tomatoes Tomatoes { get; set; }

    [BsonElement("metacritic")]
    public int? Metacritic { get; set; }

    [BsonElement("awesome")]
    public bool? Awesome { get; set; }
}

public class Awards
{
    [BsonElement("wins")]
    public int Wins { get; set; }

    [BsonElement("nominations")]
    public int Nominations { get; set; }

    [BsonElement("text")]
    public string Text { get; set; }
}

public class Imdb
{
    [BsonElement("id")]
    public object ImdbId { get; set; }

    [BsonElement("votes")]
    public object Votes { get; set; }

    [BsonElement("rating")]
    public object Rating { get; set; }
}

public class Tomatoes
{
    [BsonElement("viewer")]
    public Viewer Viewer { get; set; }

    [BsonElement("lastUpdated")]
    public DateTime LastUpdated { get; set; }

    [BsonElement("dvd")]
    public DateTime? DVD { get; set; }

    [BsonElement("website")]
    public string? Website { get; set; }

    [BsonElement("production")]
    public string? Production { get; set; }

    [BsonElement("critic")]
    public Critic? Critic { get; set; }

    [BsonElement("rotten")]
    public int? Rotten { get; set; }

    [BsonElement("fresh")]
    public int? Fresh { get; set; }

    [BsonElement("boxOffice")]
    public string? BoxOffice { get; set; }

    [BsonElement("consensus")]
    public string? Consensus { get; set; }
}

public class Viewer
{
    [BsonElement("rating")]
    public double Rating { get; set; }

    [BsonElement("numReviews")]
    public int NumReviews { get; set; }

    [BsonElement("meter")]
    public int Meter { get; set; }
}

public class Critic
{
    [BsonElement("rating")]
    public double Rating { get; set; }

    [BsonElement("numReviews")]
    public int NumReviews { get; set; }

    [BsonElement("meter")]
    public int Meter { get; set; }
}
If your IDE or text editor doesn’t auto add the required using statements, add the following at the top of the class:
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
Now that we have the model available that reflects our document, it is time to make use of it.
Paste the following method inside the Program class in Program.cs, alongside Main:
    private static async Task FetchAndSaveMovieDocuments(ISemanticTextMemory memory, int limitSize)
    {
        // Connect to the existing sample database and collection.
        MongoClient mongoClient = new MongoClient(MongoDBAtlasConnectionString);
        var movieDB = mongoClient.GetDatabase("sample_mflix");
        var movieCollection = movieDB.GetCollection<Movie>("movies");
        List<Movie> movieDocuments;

        Console.WriteLine("Fetching documents from MongoDB...");

        // Fetch the requested number of movie documents.
        movieDocuments = movieCollection.Find(m => true).Limit(limitSize).ToList();

        // Data hygiene: replace null plots so there is always text to embed.
        movieDocuments.ForEach(movie =>
        {
            if (movie.Plot == null)
            {
                movie.Plot = "UNKNOWN";
            }
        });

        // Save each movie through the memory store, which generates the embedding for us.
        foreach (var movie in movieDocuments)
        {
            try
            {
                await memory.SaveReferenceAsync(
                    collection: CollectionName,
                    description: movie.Plot,
                    text: movie.Plot,
                    externalId: movie.Title,
                    externalSourceName: "Sample_Mflix_Movies",
                    additionalMetadata: movie.Year.ToString());
            }
            catch (Exception ex)
            {
                // Log the problem and carry on with the next movie.
                Console.WriteLine(ex.Message);
            }
        }
    }
Let’s take a look at what is happening:
  • We take advantage of the MongoDB C# driver, which is available to us from the connector, to create a new client and point it to our existing database and collection.
  • Then, we create a new list of movies, fetching the requested number of documents and adding them to the list.
  • For each movie, we do some data hygiene for any null plots as this can cause errors later, and simply marking it as nullable won’t work, sadly.
  • After we have a clean list of movies, we iterate through each one and save it to our new collection via the memory store.
    • The document that Semantic Kernel creates with the plugin has some fields that we want to populate so we assign those the most sensible values from the fields available in our movie document.
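The data hygiene step described above could also be written as small helper methods. This is a sketch only: MovieFieldSketch, PlotOrPlaceholder, and YearAsText are hypothetical names that do not appear in the tutorial's repository. It additionally guards the Year field, since the method above calls movie.Year.ToString(), which throws if a document has no year (the try/catch then logs the error and that movie is not saved).

```csharp
using System;

// Hypothetical helpers for illustration; they are not in the tutorial's repository.
public static class MovieFieldSketch
{
    // Replaces a missing or blank plot with a placeholder
    // so there is always some text to generate an embedding from.
    public static string PlotOrPlaceholder(string? plot) =>
        string.IsNullOrWhiteSpace(plot) ? "UNKNOWN" : plot;

    // Year is typed as object in the Movie model, so convert it defensively:
    // a null year becomes a placeholder instead of throwing.
    public static string YearAsText(object? year) =>
        year?.ToString() ?? "UNKNOWN";
}
```

With helpers like these, the call to SaveReferenceAsync could pass MovieFieldSketch.PlotOrPlaceholder(movie.Plot) and MovieFieldSketch.YearAsText(movie.Year) rather than mutating the list first.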
Now, we need to actually call this method. We can do this by simply calling await FetchAndSaveMovieDocuments(memory, 1500); from our Main method, after the existing code. This will populate our collection linked to the memory store with 1500 documents. You can choose a different number, if you wish.
Run the application to populate our new database and collection with data using Semantic Kernel. Once it displays “Fetching documents from MongoDB…”, wait a few minutes for it to populate in the background and then close the application. Generating the text embeddings for such a large number of documents using Semantic Kernel can take a little while. The time is spent on embedding generation; the MongoDB C# driver is not the bottleneck.
dotnet run
This only needs to run once so we have some data available to us. So if you want to run this app again in future, it is OK to comment out the call to the method FetchAndSaveMovieDocuments, or remove it completely.
This will create a new database in your cluster called semantic-kernel with a collection called movies (the values of the DatabaseName and CollectionName variables in Program.cs), containing the data as populated using Semantic Kernel.
Example document showing the metadata object and embedding field generated by Semantic Kernel

Creating the vector search index

You may have noticed earlier that when we added our MongoDB memory store, we passed it the search index name. This search index is used to identify which field or fields we want to use in our search. But this doesn’t exist yet on our MongoDB database.
Now that you have run the application once, the data will be available in the collection to use in the search index.
We already have some great documentation on how to create a vector search index so you can refer to that on how to access the wizard in the Atlas UI to create the new index.
The following JSON can be used to define the index:
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "dotProduct",
      "type": "vector"
    }
  ]
}
This uses the embedding field that was generated by Semantic Kernel. OpenAI’s “text-embedding-ada-002” model that we are using for the text embedding generates 1536 dimensions. You will see this in the documents generated as the embedding array contains 1536 elements.
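The index definition above sets similarity to dotProduct. As a rough illustration of what that measures, here is a small self-contained sketch (SimilaritySketch is a hypothetical name, and the two-element vectors are made-up stand-ins for real 1536-element embeddings). It shows that once vectors are normalized to unit length, their dot product equals their cosine similarity. OpenAI documents its embeddings as normalized to length 1, which is presumably why dotProduct is a reasonable choice here, but treat that as an assumption to verify for any other model.

```csharp
using System;

// Hypothetical helper for illustration; Atlas Vector Search does this work for you.
public static class SimilaritySketch
{
    // Sum of the element-wise products of two equal-length vectors.
    public static double DotProduct(double[] a, double[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Dot product divided by the product of the two vector lengths.
    public static double CosineSimilarity(double[] a, double[] b) =>
        DotProduct(a, b) / (Math.Sqrt(DotProduct(a, a)) * Math.Sqrt(DotProduct(b, b)));

    // Scales a vector so that its length is 1.
    public static double[] Normalize(double[] v)
    {
        double length = Math.Sqrt(DotProduct(v, v));
        var result = new double[v.Length];
        for (int i = 0; i < v.Length; i++)
        {
            result[i] = v[i] / length;
        }
        return result;
    }
}
```

For example, for the vectors (3, 4) and (4, 3), the cosine similarity and the dot product of the normalized vectors both come out at 0.96.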
You will need to use the index name “default” to match the hard-coded variable in your code. If you name the search index something else, be sure to update the variable.

Asking questions of our data

Now that we have the data available to us and the search index created, it is time to add the ability to actually ask questions of our data.
Paste the following code inside your Main method, after the existing code:
Console.WriteLine("Welcome to the Movie Recommendation System!");
Console.WriteLine("Type 'x' and press Enter to exit.");
Console.WriteLine("============================================");
Console.WriteLine();

while (true)
{
    Console.WriteLine("Tell me what sort of film you want to watch..");
    Console.WriteLine();

    Console.Write("> ");

    var userInput = Console.ReadLine();

    // ReadLine returns null when input is closed, so treat that as a request to exit too.
    if (userInput is null || userInput.Equals("x", StringComparison.OrdinalIgnoreCase))
    {
        Console.WriteLine("Exiting application..");
        break;
    }

    Console.WriteLine();

    // Search the memory store for the three most relevant movies above the threshold.
    var memories = memory.SearchAsync(CollectionName, userInput, limit: 3, minRelevanceScore: 0.6);

    Console.WriteLine(String.Format("{0,-20} {1,-50} {2,-10} {3,-15}", "Title", "Plot", "Year", "Relevance (0 - 1)"));
    Console.WriteLine(new String('-', 95)); // Adjust the length based on your column widths

    await foreach (var mem in memories)
    {
        Console.WriteLine(String.Format("{0,-20} {1,-50} {2,-10} {3,-15}",
            mem.Metadata.Id,
            mem.Metadata.Description.Length > 47 ? mem.Metadata.Description.Substring(0, 47) + "..." : mem.Metadata.Description, // Truncate long descriptions
            mem.Metadata.AdditionalMetadata,
            mem.Relevance.ToString("0.00"))); // Format relevance score to two decimal places
    }
}
A lot of this code is about user input and formatting the output. But let’s look at the lines of code that matter:
memory.SearchAsync is how we carry out the search. We pass it the name of where we want to search, a.k.a. the collection name, what we want to search, how many results to get back, and what score from 0 to 1 we consider a threshold for “relevant enough.”
await foreach (var mem in memories) is slightly different to the foreach you might be used to. The memories variable that was assigned the result of the search is of type IAsyncEnumerable<MemoryQueryResult>, so we have to perform an await foreach to iterate through it.
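If await foreach is new to you, the following standalone sketch may help. It does not use Semantic Kernel at all: AsyncStreamSketch, GetTitlesAsync, and CollectAsync are hypothetical names, and the two movie titles are made-up sample values. It shows a method that produces an asynchronous stream and a caller that consumes it one item at a time.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical example for illustration; unrelated to the Semantic Kernel API.
public static class AsyncStreamSketch
{
    // An async iterator: each item is yielded as soon as it is ready,
    // rather than the caller waiting for the whole list.
    public static async IAsyncEnumerable<string> GetTitlesAsync()
    {
        foreach (var title in new[] { "Jaws", "Deep Blue Sea" })
        {
            await Task.Delay(10); // Stand-in for waiting on a database or network call.
            yield return title;
        }
    }

    // Consumes the stream with await foreach and gathers the items into a list.
    public static async Task<List<string>> CollectAsync()
    {
        var titles = new List<string>();
        await foreach (var title in GetTitlesAsync())
        {
            titles.Add(title);
        }
        return titles;
    }
}
```

The search loop in Main follows the same pattern: each result is printed as it arrives instead of after the whole result set has been gathered.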

Trying it out

We have everything in place now to run the application and actually ask it a question. Why not try asking it for a movie about sharks or another topic you love?
The movie search bot in action

Summary

Just like that, you have created a simple movie chat recommendation bot using Semantic Kernel from Microsoft, MongoDB Atlas, and the awesome connector for MongoDB in Semantic Kernel.
If you want to learn more, I wrote a tutorial on how to use Atlas Vector Search natively in a .NET application!
You can view the full code by visiting the repo on GitHub.
There is also a main branch of this repo which uses AzureOpenAI for those of you who have access.
Why not try it out today and see what movie you might want to watch tonight?