Integrating MongoDB with TensorFlow and C#

Folasayo Samuel Olayemi • 8 min read • Published Sep 05, 2024 • Updated Sep 05, 2024
Whether you are new to C# or a seasoned developer, this tutorial is for you if you are curious about how TensorFlow and ML.NET work with MongoDB as the database.
The process involves fetching data from MongoDB, preprocessing it, and building a machine learning model using ML.NET and TensorFlow. This guide is ideal for developers interested in using machine learning within a .NET environment.

What do you know about MongoDB?

MongoDB is a document-oriented NoSQL database that stores data in JSON-like documents and is designed for working with large sets of distributed data. It offers horizontal scalability and a flexible schema, along with the querying and indexing features that most applications need.

What do you know about TensorFlow?

TensorFlow is an open-source, end-to-end platform for machine learning, developed by the Google Brain team. It offers a comprehensive system for managing all components of a machine learning setup. This tutorial, however, concentrates on using a specific TensorFlow API to develop and train machine learning models. TensorFlow operates by constructing computational graphs — networks of nodes representing mathematical operations, with edges between nodes representing the multidimensional data arrays (tensors) that flow through these operations.

Use cases of TensorFlow

Here are some useful application examples of TensorFlow.
  1. Image recognition: TensorFlow is utilized in image recognition applications to detect objects, faces, and scenes in images and videos. This functionality is essential for a variety of applications, including security systems, where it enhances surveillance by recognizing human activities and faces, and healthcare, where it assists in diagnosing diseases by analyzing medical imagery. Learn more about TensorFlow for image recognition.
  2. Natural language processing (NLP): TensorFlow's capacity to manage large datasets and intricate algorithms makes it well suited to NLP tasks. It supports applications like language translation, sentiment analysis, and chatbots, enabling machines to understand, interpret, and generate human language in a contextually meaningful way. Explore TensorFlow applications in NLP.
  3. Recommendation systems: Numerous e-commerce and streaming companies utilize TensorFlow to build recommendation systems that analyze users' past behavior to suggest products or media they might find interesting. This personalization improves the user experience and can significantly boost conversion rates for businesses. Learn about building recommendation systems with TensorFlow.
  4. Autonomous vehicles: TensorFlow is utilized in the automotive industry to develop and improve systems for autonomous vehicles. By processing data from various sensors and cameras, TensorFlow-based models support making decisions about vehicle steering and collision avoidance. Explore how TensorFlow is applied in autonomous driving.
  5. Healthcare: TensorFlow is utilized in various tasks, like disease diagnosis and drug discovery. It examines patterns from large datasets of medical records to predict disease progression and results, facilitating early diagnosis and personalized treatment plans. Discover TensorFlow applications in healthcare.
These examples illustrate TensorFlow's versatility across different domains and its role in building intelligent, data-driven applications. The link in each item offers a deeper dive into how TensorFlow is used in real-world applications.

Prerequisites

Before we dive into the details, make sure you have the following installed:

NuGet packages to install

For this project, you need to install the following NuGet packages:
  • MongoDB.Driver: This package includes everything you need to interact with MongoDB, including BSON and CRUD operations. Install with dotnet add package MongoDB.Driver.
  • Microsoft.ML: This package is essential for building and training machine learning models in .NET. Install with dotnet add package Microsoft.ML.
  • Microsoft.ML.TensorFlow: This package allows integration with TensorFlow models within ML.NET. Install with dotnet add package Microsoft.ML.TensorFlow.
Make sure MongoDB is running on your local machine. You can download and install MongoDB from the MongoDB website.
Finally, set up your development environment by initializing a new C# (Console App) project. Follow Visual Studio Code’s guide if you are coding your C# console app project for the first time.

Step-by-step breakdown of the code

Define schemas and models

Before connecting to MongoDB, define the structure of your data by creating classes that represent your models. This approach ensures that when you interact with MongoDB, you're working with strongly-typed objects instead of generic BsonDocument objects. This improves code clarity, maintainability, and type safety.
using MongoDB.Bson;
using System.Collections.Generic;
using Microsoft.ML.Data;

public class SampleData
{
    public ObjectId Id { get; set; } // This corresponds to the MongoDB _id field
    public List<double> X { get; set; } = new List<double>();
    public List<double> Y { get; set; } = new List<double>();
}

public class DataPoint
{
    public float X { get; set; }
    public float Y { get; set; }
}

public class Prediction
{
    [ColumnName("Score")]
    public float PredictedY { get; set; }
}

Connect to MongoDB

With the model classes defined in the previous step, you can now establish a connection to MongoDB using these models. Instead of using a generic BsonDocument, specify the type of documents in the collection (SampleData), making your code more intuitive and type-safe.
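Place the following code in your Program.cs file. If you use top-level statements and keep the model classes in the same file, declare the classes below these statements, because C# requires type declarations to come after top-level statements.
// MongoDB connection string
var client = new MongoClient("mongodb://localhost:27017");
var database = client.GetDatabase("linear-data");
var collection = database.GetCollection<SampleData>("sampleData");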
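Ensure you include the necessary using directives at the top of Program.cs. The later steps also rely on System, System.Linq, and Microsoft.ML, so they are listed here as well:
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using MongoDB.Bson;
using MongoDB.Driver;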
This setup allows you to interact with your MongoDB collection using strongly-typed models, improving code readability and maintainability.

Create and insert sample data

  • Sample data: First, define the sample data to be inserted into MongoDB.
  • Insert data: Then, insert the defined document into the MongoDB collection.
// Define the data to insert
var sampleData = new SampleData
{
    X = new List<double> { 3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167, 7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1 },
    Y = new List<double> { 1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221, 2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3 }
};

// Insert the data into MongoDB
collection.InsertOne(sampleData);

// Print the confirmation message shown in the expected output
Console.WriteLine("Data seeded successfully!");

Fetch data

The following steps describe how to retrieve and prepare data from your MongoDB database to use as the training dataset for your model.
// Fetch the data from MongoDB as SampleData
var fetchedSampleData = collection.Find(new BsonDocument()).FirstOrDefault();

// Assuming you have only one document, use the fetchedSampleData directly
var xArray = fetchedSampleData.X.Select(value => (float)value).ToArray();
var yArray = fetchedSampleData.Y.Select(value => (float)value).ToArray();
Retrieve data: Begin by retrieving the data from MongoDB. This code fetches the first document from the collection and assumes that the collection contains only one document, which holds the entire dataset. Keep in mind that each run of the insert step adds another copy of the document, and that FirstOrDefault returns null when the collection is empty.
Convert BSON arrays: The driver deserializes the BSON arrays into the X and Y lists of SampleData. The Select method then casts each element from double to float. This conversion matters because ML.NET expects numeric feature and label columns to hold float (System.Single) values.
Together, these steps turn the structured document stored in MongoDB into typed arrays that are ready to be loaded into ML.NET for training.

Preparing data for ML.NET

// Create a new ML context
var mlContext = new MLContext();

// Create the ML.NET data structures
var data = xArray.Zip(yArray, (x, y) => new DataPoint { X = x, Y = y }).ToList();
var dataView = mlContext.Data.LoadFromEnumerable(data);
  • ML context: Create a new MLContext, the entry point for all ML.NET operations.
  • Data structures: Pair the X and Y values into DataPoint objects and load them into an IDataView with LoadFromEnumerable.
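One detail worth knowing about the pairing step: Enumerable.Zip stops at the end of the shorter sequence, so X and Y lists of different lengths would be truncated without any warning. The following is a minimal sketch of a guard against that, using only the .NET base class library. The PairColumns helper is hypothetical and not part of the tutorial's code; it returns tuples rather than DataPoint objects so that it runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the tutorial): pairs X and Y values
// and refuses to proceed when the two lists differ in length.
static (float X, float Y)[] PairColumns(IReadOnlyList<double> xs, IReadOnlyList<double> ys)
{
    if (xs.Count != ys.Count)
        throw new ArgumentException($"X has {xs.Count} values but Y has {ys.Count}.");

    // Same double-to-float cast the tutorial applies before building DataPoint objects.
    return xs.Zip(ys, (x, y) => ((float)x, (float)y)).ToArray();
}

var pairs = PairColumns(
    new List<double> { 3.3, 4.4, 5.5 },
    new List<double> { 1.7, 2.76, 2.09 });
Console.WriteLine(pairs.Length); // 3

// A plain Zip would silently drop the unmatched value here; the helper throws instead.
var mismatchDetected = false;
try
{
    PairColumns(new List<double> { 1.0, 2.0 }, new List<double> { 1.0 });
}
catch (ArgumentException)
{
    mismatchDetected = true;
}
Console.WriteLine(mismatchDetected); // True
```

In the tutorial's flow, a check like this would sit between fetching the document and calling Zip.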

Building and training the model

// Define the trainer
var pipeline = mlContext.Transforms.Concatenate("Features", new[] { "X" })
    .Append(mlContext.Transforms.NormalizeMinMax("Features"))
    .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Y", featureColumnName: "Features"));

// Train the model
var model = pipeline.Fit(dataView);
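  • Pipeline definition: Define an ML.NET pipeline that copies X into a Features column, scales it with min-max normalization, and appends an SDCA regression trainer that uses Y as the label.
  • Model training: Call Fit to train the pipeline on the data.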
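As a sanity check on whatever the trainer learns, it can help to compute a simple baseline by hand. The sketch below fits a line to the same 17 data points using the closed-form ordinary least squares solution. This is a different technique from the SDCA trainer used in the pipeline, shown here only for comparison, and it uses nothing beyond the .NET base class library.

```csharp
using System;
using System.Linq;

// The same X and Y values the tutorial inserts into MongoDB.
double[] xs = { 3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59, 2.167, 7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1 };
double[] ys = { 1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221, 2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3 };

// Closed-form simple linear regression:
// slope = sum((x - meanX) * (y - meanY)) / sum((x - meanX)^2)
double meanX = xs.Average();
double meanY = ys.Average();
double covXY = xs.Zip(ys, (x, y) => (x - meanX) * (y - meanY)).Sum();
double varX = xs.Sum(x => (x - meanX) * (x - meanX));

double slope = covXY / varX;
double intercept = meanY - slope * meanX;

Console.WriteLine($"slope: {slope:F4}, intercept: {intercept:F4}");
```

On this data the closed-form fit gives a slope of roughly 0.25 and an intercept of roughly 0.8, so predictions from a well-behaved linear model should stay within the range of the Y values.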

Evaluating the model

// Use the model to make predictions
var predictions = model.Transform(dataView);
var metrics = mlContext.Regression.Evaluate(predictions, labelColumnName: "Y");

Console.WriteLine($"R^2: {metrics.RSquared}");
Console.WriteLine($"RMSE: {metrics.RootMeanSquaredError}");
  • Transform data: Transform the data using the trained model.
  • Evaluate model: Evaluate the model’s performance using R² and RMSE metrics.
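For reference, the two metrics are conventionally defined as follows, where y_i are the actual Y values, \hat{y}_i the predicted values, \bar{y} the mean of the actual values, and n the number of data points. This is the standard textbook formulation, not a description of ML.NET's internal implementation.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
```

R^2 becomes negative whenever the squared prediction error in the numerator exceeds the total variation in the denominator, that is, when the model does worse than always predicting the mean of Y.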

Making predictions

// Display the predictions
var predictionFunction = mlContext.Model.CreatePredictionEngine<DataPoint, Prediction>(model);
foreach (var point in data)
{
    var prediction = predictionFunction.Predict(point);
    Console.WriteLine($"X: {point.X}, Y: {point.Y}, Predicted: {prediction.PredictedY}");
}
  • Prediction engine: Create a prediction engine.
  • Make predictions: Use the engine to make predictions and display the results.

Running the code

  1. Ensure MongoDB is running: Start MongoDB on your local machine.
  2. Run the code: Execute the program using dotnet run.
dotnet run

Expected output

The expected output of the C# code, which seeds data into MongoDB and then trains a linear regression model with ML.NET, has three parts: a confirmation message that the data has been seeded successfully, the evaluation metrics of the regression model, and the predicted values for each data point.
Here’s a detailed breakdown of what you should expect:
Data seeding confirmation: The first output message confirms that the data has been seeded into MongoDB successfully.
Data seeded successfully!
Model evaluation metrics: The output includes evaluation metrics for the linear regression model. These metrics help you understand the performance of the model.
R-squared (R^2): This value measures the proportion of variance in the dependent variable that is predictable from the independent variable(s). In this case, the negative R^2 value of -131160.77737920854 means the model fits worse than a constant prediction of the mean of Y, so it is not fitting the data well.
R^2: -131160.77737920854
Root mean squared error (RMSE): This metric measures the typical magnitude of the prediction errors. A lower RMSE indicates a better fit of the model. Here, the RMSE value is 256.3340977791911, which is very large given that Y only ranges from about 1.2 to 3.5.
RMSE: 256.3340977791911
Predictions: For each data point, the output shows the actual X and Y values, along with the predicted Y value from the linear regression model.
X: 3.3, Y: 1.7, Predicted: 84.26584
X: 4.4, Y: 2.76, Predicted: -8.049469
X: 5.5, Y: 2.09, Predicted: -100.36478
X: 6.71, Y: 3.19, Predicted: -201.91168
X: 6.93, Y: 1.694, Predicted: -220.3747
X: 4.168, Y: 1.573, Predicted: 11.420654
X: 9.779, Y: 3.366, Predicted: -459.47144
X: 6.182, Y: 2.596, Predicted: -157.60028
X: 7.59, Y: 2.53, Predicted: -275.76392
X: 2.167, Y: 1.221, Predicted: 179.35062
X: 7.042, Y: 2.827, Predicted: -229.77405
X: 10.791, Y: 3.465, Predicted: -544.4015
X: 5.313, Y: 1.65, Predicted: -84.6712
X: 7.997, Y: 2.904, Predicted: -309.9206
X: 5.654, Y: 2.42, Predicted: -113.28891
X: 9.27, Y: 2.94, Predicted: -416.75458
X: 3.1, Y: 1.3, Predicted: 101.050446
These values show the actual X and Y values from the dataset along with the corresponding predicted Y values. The predictions illustrate how the linear regression model approximates the relationship between X and Y. As the strongly negative R^2 value indicates, the predicted values in this run are far from the actual Y values.
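The reported metrics can be reproduced from the printed predictions alone. The sketch below copies the actual and predicted Y values from the console output listed above and recomputes both metrics using the standard definitions of R^2 and RMSE. It is an illustrative check written with the .NET base class library only, not a look at how ML.NET computes them internally.

```csharp
using System;
using System.Linq;

// Actual Y values and predicted values copied from the console output.
double[] actual = { 1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53, 1.221, 2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3 };
double[] predicted = { 84.26584, -8.049469, -100.36478, -201.91168, -220.3747, 11.420654, -459.47144, -157.60028, -275.76392, 179.35062, -229.77405, -544.4015, -84.6712, -309.9206, -113.28891, -416.75458, 101.050446 };

double meanActual = actual.Average();

// Sum of squared prediction errors and total sum of squares around the mean.
double sse = actual.Zip(predicted, (y, p) => (y - p) * (y - p)).Sum();
double sst = actual.Sum(y => (y - meanActual) * (y - meanActual));

double rSquared = 1.0 - sse / sst;
double rmse = Math.Sqrt(sse / actual.Length);

Console.WriteLine($"R^2: {rSquared:F1}");  // close to the reported -131160.8
Console.WriteLine($"RMSE: {rmse:F2}");     // close to the reported 256.33
```

The squared errors run into the hundreds of thousands while the total variation in Y is below 10, which is why R^2 ends up so far below zero.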

Summary

  • Data seeding confirmation: Confirms that the data was successfully inserted into MongoDB.
  • Model evaluation metrics: Provides insight into the model's performance, indicating a poor fit with a strongly negative R^2 and an RMSE far larger than the range of Y values.
  • Predictions: Shows the actual and predicted values, highlighting the model's approximation of the data.

Full expected output example

Data seeded successfully!
R^2: -131160.77737920854
RMSE: 256.3340977791911
X: 3.3, Y: 1.7, Predicted: 84.26584
X: 4.4, Y: 2.76, Predicted: -8.049469
X: 5.5, Y: 2.09, Predicted: -100.36478
X: 6.71, Y: 3.19, Predicted: -201.91168
X: 6.93, Y: 1.694, Predicted: -220.3747
X: 4.168, Y: 1.573, Predicted: 11.420654
X: 9.779, Y: 3.366, Predicted: -459.47144
X: 6.182, Y: 2.596, Predicted: -157.60028
X: 7.59, Y: 2.53, Predicted: -275.76392
X: 2.167, Y: 1.221, Predicted: 179.35062
X: 7.042, Y: 2.827, Predicted: -229.77405
X: 10.791, Y: 3.465, Predicted: -544.4015
X: 5.313, Y: 1.65, Predicted: -84.6712
X: 7.997, Y: 2.904, Predicted: -309.9206
X: 5.654, Y: 2.42, Predicted: -113.28891
X: 9.27, Y: 2.94, Predicted: -416.75458
X: 3.1, Y: 1.3, Predicted: 101.050446
Figure: Console output from dotnet run showing the expected output.

Conclusion

By following this guide, you’ve connected MongoDB to a machine learning workflow in C# using ML.NET, with the Microsoft.ML.TensorFlow package installed for working with TensorFlow models. This integration lets you combine MongoDB's data storage capabilities with the ML.NET and TensorFlow ecosystem, all within a .NET environment.
This tutorial demonstrates how these technologies can be combined to build data processing and machine learning solutions.

Appendix

For further learning, check out the TensorFlow and MongoDB documentation, and explore more complex machine learning models and database operations. If you have questions or want to share your work, join us in the MongoDB Developer Community.
Thanks for reading... Happy coding!