Explore Developer Center's New Chatbot! MongoDB AI Chatbot can be accessed at the top of your navigation to answer all your MongoDB questions.

Learn why MongoDB was selected as a leader in the 2024 Gartner® Magic Quadrant™
MongoDB Developer
Atlas
plus
Sign in to follow topics
MongoDB Developer Center
chevron-right
Developer Topics
chevron-right
Products
chevron-right
Atlas
chevron-right

How to Seamlessly Use MongoDB Atlas and IBM watsonx.ai LLMs in Your GenAI Applications

Ashwin Gangadhar9 min read • Published Sep 18, 2024 • Updated Sep 18, 2024
AIVector SearchPythonAtlas
Facebook Icontwitter iconlinkedin icon
Rate this tutorial
star-empty
star-empty
star-empty
star-empty
star-empty
One of the challenges of e-commerce applications is to provide relevant and personalized product recommendations to customers. Traditional keyword-based search methods often fail to capture the semantic meaning and intent of the user search queries, and return results that do not meet the user’s needs. In turn, they fail to convert into a successful sale. To address this problem, RAG (retrieval-augmented generation) is used as a framework powered by MongoDB Atlas Vector Search, LangChain, and IBM watsonx.ai.
RAG is a natural language generation (NLG) technique that leverages a retriever module to fetch relevant documents from a large corpus and a generator module to produce text conditioned on the retrieved documents. Here, the RAG framework is used to power product recommendations as an extension to existing semantic search techniques.
  • RAG use cases can be easily built using the vector search capabilities of MongoDB Atlas to store and query large-scale product embeddings that represent the features and attributes of each product. Because of MongoDB’s flexible schema, these are stored right alongside the product embeddings, eliminating the complexity and latency of having to retrieve the data from separate tables or databases.
  • RAG then retrieves the most similar products to the user query based on the cosine similarity of their embeddings, and generates natural language reasons that highlight why these products are relevant and appealing to the user.
  • RAG can also enhance the user experience (UX) by handling complex and diverse search queries, such as "a cozy sweater for winter" or "a gift for my daughter who is interested in science", and provides accurate and engaging product recommendations that increase customer satisfaction and loyalty.
MongoDB Vector Search with IBM watsonx

Why MongoDB Atlas Vector Search?

To enable RAG to retrieve relevant documents, we need a vector search engine that can efficiently store and query large-scale document embeddings. MongoDB Atlas allows us to index documents based on their embeddings and perform similarity search using cosine distance, euclidean distance, or dot product. MongoDB Atlas also provides flexible schema, scalability, security, and performance features that over 45,000 organizations — from startups to enterprises and governments — rely on.
By using RAG with MongoDB Atlas Vector Search, we can enhance the user experience of product recommendations in several ways. First, we can provide more personalized and diverse recommendations that match the user's query or context. Second, we can generate more informative and engaging responses that can explain why each product is recommended and how it compares to others. Third, we can improve the relevance and accuracy of the recommendations by updating the document embeddings as MongoDB is primarily an operational data layer (ODL) that provides transaction features.

What is watsonx.ai?

Watsonx.ai is IBM’s next-generation enterprise studio for AI builders, bringing together new generative AI capabilities with traditional machine learning (ML) that span the entire AI lifecycle. With watsonx.ai, you can train, validate, tune, and deploy foundation and traditional ML models.
watsonx.ai brings forth a curated library of foundation models, including IBM-developed models, open-source models, and models sourced from third-party providers. Not all models are created equal, and the curated library provides enterprises with the optionality to select the model best suited to a particular use case, industry, domain, or even price performance. Further, IBM-developed models, such as the Granite model series, offer another level of enterprise-readiness, transparency, and indemnification for production use cases. We’ll be using Granite models in our demonstration. For the interested reader, IBM has published information about its data and training methodology for its Granite foundation models.

How to build a custom RAG-powered product discovery pipeline

For this tutorial, we will be using an e-commerce products dataset containing over 10,000 product details. We will be using the sentence-transformers/all-mpnet-base-v2 model from Hugging Face to generate the vector embeddings to store and retrieve product information. You will need a Python notebook or an IDE, a MongoDB Atlas account, and a wastonx.ai account for hands-on experience.
For convenience, the notebook to follow along and execute in your environment is available on GitHub.

Python dependencies

  • langchain: Orchestration framework
  • ibm-watson-machine-learning: For IBM LLMs
  • wget: To download knowledge base data
  • sentence-transformers: For embedding model
  • pymongo: For the MongoDB Atlas vector store

watsonx.ai dependencies

We’ll be using the watsonx.ai foundation models and Python SDK to implement our RAG pipeline in LangChain.
  1. Sign up for a free watsonx.ai trial on IBM cloud. Register and get set up.
  2. Create a watsonx.ai Project. During onboarding, a sandbox project can be quickly created for you. You can either use the sandbox project or create one; the link will work once you have registered and set up watsonx.ai. If more help is needed, you can read the documentation.
  3. Create an API key to access watsonx.ai foundation models. Follow the steps to create your API key.
  4. Install and use watsonx.ai. Also known as the IBM Watson Machine Learning SDK, watsonx.ai SDK information is available on GitHub. Like any other Python module, you can install it with a pip install. Our example notebook takes care of this for you.
We will be running all the code snippets below in a Jupyter notebook. You can choose to run these on VS Code or any other IDE of your choice.
Initialize the LLM
Initialize the watsonx URL to connect by running the below code blocks in your Jupyter notebook:
1# watsonx URL
2
3try:
4 wxa_url = os.environ["WXA_URL"]
5
6except KeyError:
7 wxa_url = getpass.getpass("Please enter your watsonx.ai URL domain (hit enter): ")
Enter the URL for accessing the watsonx URL domain. For example: https://us-south.ml.cloud.ibm.com.
To be able to access the LLM models and other AI services on watsonx, you need to initialize the API key. You init the API key by running the following code block in you Jupyter notebook:
1# watsonx API Key
2
3try:
4 wxa_api_key = os.environ["WXA_API_KEY"]
5except KeyError:
6 wxa_api_key = getpass.getpass("Please enter your watsonx.ai API key (hit enter): ")
You will be prompted when you run the above code to add the IAM API key you fetched earlier.
Each experiment can tagged or executed under specific projects. To fetch the relevant project, we can initialize the project ID by running the below code block in the Jupyter notebook:
1# watsonx Project ID
2
3try:
4 wxa_project_id = os.environ["WXA_PROJECT_ID"]
5except KeyError:
6 wxa_project_id = getpass.getpass("Please enter your watsonx.ai Project ID (hit enter): ")
You can find the project ID alongside your IAM API key in the settings panel in the watsonx.ai portal.
Language model
In the code example below, we will initialize Granite LLM from IBM and then demonstrate how to use the initialized LLM with the LangChain framework before we build our RAG.
We will use the query: "I want to introduce my daughter to science and spark her enthusiasm. What kind of gifts should I get her?"
This will help us demonstrate how the LLM and vector search work in an RAG framework at each step.
Firstly, let us initialize the LLM hosted on the watsonx cloud. To access the relevant Granite model from watsonx, you need to run the following code block to initialize and test the model with our sample query in the Jupyter notebook:
1from ibm_watson_machine_learning.foundation_models.utils.enums import ModelTypes
2from ibm_watson_machine_learning.foundation_models import Model
3from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
4from ibm_watson_machine_learning.foundation_models.utils.enums import DecodingMethods
5
6parameters = {
7 GenParams.DECODING_METHOD: DecodingMethods.GREEDY,
8 GenParams.MIN_NEW_TOKENS: 1,
9 GenParams.MAX_NEW_TOKENS: 100
10}
11
12model = Model(
13 model_id=ModelTypes.GRANITE_13B_INSTRUCT,
14 params=parameters,
15 credentials={
16 "url": wxa_url,
17 "apikey": wxa_api_key
18 },
19 project_id=wxa_project_id
20)
21
22from ibm_watson_machine_learning.foundation_models.extensions.langchain import WatsonxLLM
23
24granite_llm_ibm = WatsonxLLM(model=model)
25
26# Sample query chosen in the example to evaluate the RAG use case
27query = "I want to introduce my daughter to science and spark her enthusiasm. What kind of gifts should I get her?"
28
29# Sample LLM query without RAG framework
30result = granite_llm_ibm(query)
Output:
Jupyter Notebook Output
Prior to starting this section, you should have already set up a cluster in MongoDB Atlas. If you have not created one for yourself, then you can follow the steps in the MongoDB Atlas tutorial to create an account in Atlas (the developer data platform) and a cluster with which we can store and retrieve data. It is also advised that the users spin an Atlas dedicated cluster with size M10 or higher for this tutorial.
Now, let us see how we can set up MongoDB Atlas to provide relevant information to augment our RAG framework.
Init Mongo client
We can connect to the MongoDB Atlas cluster using the connection string as detailed in the tutorial link above. To initialize the connection string, run the below code block in your Jupyter notebook:
1from pymongo import MongoClient
2
3try:
4 MONGO_CONN = os.environ["MONGO_CONN"]
5except KeyError:
6 MONGO_CONN = getpass.getpass("Please enter your MongoDB connection String (hit enter): ")
When prompted, you can enter your MongoDB Atlas connection string.
Download and load data to MongoDB Atlas
In the steps below, we demonstrate how to download the products dataset from the provided URL link and add the documents to the respective collection in MongoDB Atlas. We will also be embedding the raw product texts as vectors before adding them in MongoDB. You can do this by running the following lines of code your Jupyter notebook:
1import wget
2
3filename = './amazon-products.jsonl'
4url = "https://github.com/ashwin-gangadhar-mdb/mbd-watson-rag/raw/main/amazon-products.jsonl"
5
6if not os.path.isfile(filename):
7 wget.download(url, out=filename)
8
9# Load the documents using Langchain Document Loader
10from langchain.document_loaders import JSONLoader
11
12loader = JSONLoader(file_path=filename, jq_schema=".text",text_content=False,json_lines=True)
13docs = loader.load()
14
15# Initialize Embedding for transforming raw documents to vectors**
16from langchain.embeddings import HuggingFaceEmbeddings
17from tqdm import tqdm as notebook_tqdm
18
19embeddings = HuggingFaceEmbeddings()
20
21# Initialize MongoDB client along with Langchain connector module
22from langchain.vectorstores import MongoDBAtlasVectorSearch
23
24client = MongoClient(MONGO_CONN)
25vcol = client["amazon"]["products"]
26vectorstore = MongoDBAtlasVectorSearch(vcol, embeddings)
27
28# Load documents to collection in MongoDB**
29vectorstore.add_documents(docs)
You will be able to see the documents have been created in amazon database under the collection products.
MongoDB Atlas Products Collection
Now all the product information is added to the respective collection, we can go ahead and create a vector index by following the steps given in the Atlas Search index tutorial. You can create the search index using both the Atlas UI as well as programmatically. Let us look at the steps if we are doing this using the Atlas UI.
Create Atlas Search Index
You can select the Json Editor view to insert the vector search index config as shown in the image. Insert the following mapping to create the vector indexes.
1{
2 "mappings": {
3 "dynamic": true,
4 "fields": {
5 "embedding": [
6 {
7 "dimensions": 768,
8 "similarity": "cosine",
9 "type": "knnVector"
10 }
11 ]
12 }
13 }
14}
Atlas Search Index JSON
Once you have updated the vector search indexes in the Atlas UI, this will start an automatic process to create the vector search indexes for the collection for the first time. Also, this will automatically index any new document added to the collection. Seamlessly, you will be able to update the collection in real time as you add new products and augment the RAG framework so that users are always served with the latest and greatest products you are offering.
For a description of the other fields in this configuration, you can check out our Vector Search documentation.
Sample query to vector search
We can test the vector similarity search by running the sample query with the LangChain MongoDB Atlas Vector Search connector. Run the following code in your Jupyter notebook:
1texts_sim = vectorstore.similarity_search(query, k=3)
2
3print("Number of relevant texts: " + str(len(texts_sim)))
4print("First 100 characters of relevant texts.")
5
6for i in range(len(texts_sim)):
7 print("Text " + str(i) + ": " + str(texts_sim[i].page_content[0:100]))
Sample Vector Search Query Output
In the above example code, we are able to use our sample text query to retrieve three relevant products. Further in the tutorial, let’s see how we can combine the capabilities of LLMs and vector search to build a RAG framework. For further information on various operations you can perform with the MongoDBAtlasVectorSearch module in LangChain, you can visit the Atlas Vector Search documentation.

RAG chain

In the code snippets below, we demonstrate how to initialize and query the RAG chain. We also introduce methods to improve the output from RAG so you can customize your output to cater to specific needs, such as the reason behind the product recommendation, language translation, summarization, etc.
So, you can set up the RAG chain and execute to get the response for our sample query by running the following lines of code in your Jupyter notebook:
1from langchain.chains import RetrievalQA
2from langchain.prompts import PromptTemplate
3
4prompt = PromptTemplate(template="""
5
6Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
7
8{context}
9
10##Question:{question} \n\
11
12##Top 3 recommnedations from Products and reason:\n""",input_variables=["context","question"])
13
14chain_type_kwargs = {"prompt": prompt}
15retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={'k': 6, 'lambda_mult': 0.25})
16
17qa = RetrievalQA.from_chain_type(llm=granite_llm_ibm, chain_type="stuff",
18 retriever=retriever,
19 chain_type_kwargs=chain_type_kwargs)
20
21res = qa.run(query)
22
23print(f"{'-'*50}")
24print("Query: " + query)
25print(f"Response:\n{res}\n{'-'*50}\n")
The output will look like this:
Sample RAG Chain Output
You can see from the example output where the RAG is able to recommend products based on the query as well as provide a reasoning or explanation as to how this product suggestion is relevant to the query, thereby enhancing the user experience.

Conclusion

In this tutorial, we demonstrated how to use watsonx LLMs along with Atlas Vector Search to build a RAG framework. We also demonstrated how to efficiently use the RAG framework to customize your application needs, such as the reasoning for product suggestions. By following the steps in the article, we were also able to bring the power of machine learning models to a private knowledge base that is stored in the Atlas Developer Data Platform.
In summary, RAG is a powerful NLG technique that can generate product recommendations as an extension to semantic search using vector search capabilities provided by MongoDB Atlas. RAG can also improve the UX of product recommendations by providing more personalized, diverse, informative, and engaging descriptions.

Next steps

Explore more details on how you can build generative AI applications using various assisted technologies and MongoDB Atlas Vector Search.
To learn more about Atlas Vector Search, visit the product page or the documentation for creating a vector search index or running vector search queries.
To learn more about watsonx, visit the IBM watsonx page.
Top Comments in Forums
Forum Commenter Avatar
Jack_WoehrJack Woehr2 quarters ago

Now you’re talking my language!

See More on Forums

Facebook Icontwitter iconlinkedin icon
Rate this tutorial
star-empty
star-empty
star-empty
star-empty
star-empty
Related
Tutorial

How to Build a RAG System With LlamaIndex, OpenAI, and MongoDB Vector Database


Feb 16, 2024 | 10 min read
Tutorial

How to Implement Working Memory in AI Agents and Agentic Systems for Real-time AI Applications


Nov 18, 2024 | 13 min read
Article

Atlas Online Archive: Efficiently Manage the Data Lifecycle


Jan 13, 2025 | 8 min read
Tutorial

Creating a Memory-enhanced AI Agent With Anthropic's Claude in Amazon Bedrock With MongoDB Atlas


Oct 29, 2024 | 8 min read
Table of Contents