
Accelerate Your AI journey: Simplify Gen AI RAG With MongoDB Atlas & Google’s Vertex AI Reasoning Engine

Venkatesh Shanbhag, Maruti C • 6 min read • Published Aug 16, 2024 • Updated Aug 16, 2024
Imagine a world of data-driven applications, demanding flexibility and power. This is where MongoDB thrives, with features perfectly aligned with these modern needs. But data alone isn't enough. Applications need intelligence too. Enter generative AI (gen AI), a powerful tool for content creation. But what if gen AI could do even more?
This is where AI agents come in. Acting as the mastermind behind gen AI, they orchestrate tasks, learn continuously, and make decisions. With agents, gen AI transforms into a versatile tool, automating tasks, personalizing interactions, and constantly improving. But how do we unleash this full potential?
Here's where the Vertex AI Reasoning Engine steps in. Reasoning Engine (LangChain on Vertex AI) is a managed service for building and deploying an agent reasoning framework, designed specifically for intelligent gen AI applications. As a Vertex AI service, it has all the benefits of Vertex AI integration: security, privacy, observability, and scalability. You can deploy and scale your application from development to production with a straightforward API, minimizing time-to-market. It also offers flexibility in how much reasoning you delegate to the large language model (LLM) and how much you control with custom code.
Figure 1: How it works: MongoDB as vector store for Google Reasoning Engine
Let's see how MongoDB Atlas and Vertex AI Reasoning Engine can help you build and deploy a new generation of intelligent applications using LangChain on Vertex AI by combining data, automation, and machine learning. Here's a breakdown of the benefits:
  1. Powerful and flexible data management with MongoDB: MongoDB's features like data store and vector store are suited for modern data-driven applications that require flexibility and scalability.
  2. Enhanced applications with generative AI: Generative AI can create content, potentially saving time and resources.
  3. Intelligent workflows with AI agents: AI agents can manage and automate tasks behind the scenes, improving efficiency. They can learn from data and experience, constantly improving the application's performance. Agents can analyze data and make decisions, potentially leading to more intelligent application behavior.
This solution is beneficial for various industries and applications, such as customer service chatbots that can learn and personalize interactions, or e-commerce platforms that can automate product recommendations based on customer data. Let's dive into the setup.
In this post, we will cover how to build a retrieval-augmented generation (RAG) application using MongoDB and Vertex AI and deploy it on Reasoning Engine. First, we will ingest data into MongoDB Atlas and create embeddings for the RAG solution. We will also cover how to use agents to call different tools that, in turn, query different MongoDB collections based on the context of the user's natural language query.

Ingest data and vectors into MongoDB using LangChain

MongoDB Atlas simplifies the process by storing your complex data (such as protein sequences or user profiles) alongside the corresponding vector embeddings. This allows you to use vector search to efficiently find similar data points, uncovering hidden patterns and relationships. Furthermore, MongoDB Atlas facilitates data exploration by enabling you to group similar data together based on their vector representations.
LangChain is an open-source toolkit that helps developers build with LLMs. Like Lego for AI, it offers pre-built components to connect the models with your data and tasks. This simplifies building creative AI applications that answer questions, generate text formats, and more.
To begin with the setup, the first step is to create a MongoDB Atlas cluster on Google Cloud. Configure IP access list entries and a database user for accessing the cluster using the connection string. We will use Google Colab to ingest, build, and deploy the RAG.
Next, import the Python notebook into Colab Enterprise, then run the requirements and ingest blocks. We will import Wikipedia data about Star Wars and Star Trek.
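Before the text can be embedded, it has to be split into chunks. The notebook's exact chunking strategy is not shown in this post, so the following is only a minimal sketch of one common approach, fixed-size character windows with overlap. The function name `split_into_chunks` and the default sizes are assumptions for illustration.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping, fixed-size character windows.

    Hypothetical helper: the sizes are illustrative defaults, not values
    taken from the original notebook.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip windows that contain only whitespace
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of storing a little duplicated text.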
LangChain streamlines text embedding generation through integrations with pre-built models such as Google's text-embedding and textembedding-gecko. These models convert your text data into vector representations, capturing semantic meaning in a high-dimensional space. This facilitates efficient information retrieval and comparison within LangChain's reasoning workflows. We are using Google's text-embedding-004 model to convert the input data into 768-dimensional embeddings.
def get_text_embeddings(chunks):
    from vertexai.language_models import TextEmbeddingModel

    # Load Google's text-embedding-004 model
    model = TextEmbeddingModel.from_pretrained("text-embedding-004")
    # Embed every chunk and return the raw vectors
    embeddings = model.get_embeddings(chunks)
    return [embedding.values for embedding in embeddings]
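Embedding services typically cap the number of inputs accepted per request, so a long list of chunks may need to be sent in batches. The sketch below shows one way to do that. The helper name `embed_in_batches` is hypothetical, and the default batch size of 250 is an assumption; check the current Vertex AI quota for your model and region.

```python
from typing import Callable, Sequence


def embed_in_batches(
    chunks: Sequence[str],
    embed_fn: Callable[[list[str]], list[list[float]]],
    batch_size: int = 250,  # assumed per-request limit; verify for your model
) -> list[list[float]]:
    """Call embed_fn on consecutive slices of chunks and concatenate the results."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    vectors: list[list[float]] = []
    for start in range(0, len(chunks), batch_size):
        batch = list(chunks[start:start + batch_size])
        # embed_fn returns one vector per input, in the same order
        vectors.extend(embed_fn(batch))
    return vectors
```

With the function defined above, this could be called as `embed_in_batches(chunks, get_text_embeddings)`.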
The generated embeddings are stored in MongoDB Atlas alongside the actual data. Before executing the write_to_mongoDB function, update the URI to connect to your MongoDB cluster. Pass the function the db_name and coll_name of the database and collection where you want to store the embeddings.
def write_to_mongoDB(embeddings, chunks, db_name, coll_name):
    import certifi
    from pymongo import MongoClient

    # Replace "URI" with your Atlas connection string
    client = MongoClient("URI", tlsCAFile=certifi.where())
    db = client[db_name]
    collection = db[coll_name]

    # Store each chunk alongside its embedding
    for chunk, embedding in zip(chunks, embeddings):
        collection.insert_one({
            "chunk": chunk,
            "embedding": embedding,
        })
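Vector search on the stored embeddings requires an Atlas Vector Search index on the collection. This post does not show the index definition, so the following is a sketch that builds one matching the field name ("embedding") and dimension count (768) used here. The helper name `build_vector_index_definition` is hypothetical, and cosine similarity is an assumption, not a setting confirmed by the original notebook.

```python
def build_vector_index_definition(
    path: str = "embedding",
    num_dimensions: int = 768,
    similarity: str = "cosine",  # assumption; Atlas also supports other functions
) -> dict:
    """Return an Atlas Vector Search index definition as a plain dictionary."""
    allowed = {"cosine", "euclidean", "dotProduct"}
    if similarity not in allowed:
        raise ValueError(f"similarity must be one of {sorted(allowed)}")

    return {
        "fields": [
            {
                "type": "vector",                # marks the field as a vector field
                "path": path,                    # document field holding the embedding
                "numDimensions": num_dimensions, # must match the embedding model output
                "similarity": similarity,
            }
        ]
    }
```

A definition like this can be pasted into the JSON editor when creating a vector search index in the Atlas UI. The index name should match the `index_name` the retrieval tool later passes to the vector store ("vector_index" in this post).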

Reasoning Engine

Model

The first step in building your Reasoning Engine agent is specifying the generative AI model. Here, we're using the Gemini 1.5 Pro LLM ("gemini-1.5-pro-001"), which will form the foundation of the RAG component.
model = "gemini-1.5-pro-001"

Tool creation: RAG using MongoDB Atlas with LangChain

LangChain acts as the bridge between your generative model and MongoDB Atlas, allowing it to query vectors. It takes a "query" as input, transforms it into embeddings using Google's embedding models, and retrieves the most semantically similar data from MongoDB Atlas. Below is the script for a tool that generates vectors for the query string, performs vector search on MongoDB Atlas, and returns the relevant documents to the LLM. Update the function name, database name, and collection name to read from different collections. We can initialize multiple tools and pass them to the agent in the next step.
def star_wars_query_tool(query: str):
    """
    Retrieves vectors from a MongoDB database and uses them to answer a question related to Star Wars.

    Args:
        query: The question to be answered about Star Wars.

    Returns:
        A dictionary containing the response to the question.
    """
    from langchain.chains import ConversationalRetrievalChain
    from langchain_mongodb import MongoDBAtlasVectorSearch
    from langchain_google_vertexai import VertexAIEmbeddings, ChatVertexAI
    from langchain.memory import ConversationBufferWindowMemory
    from langchain.prompts import PromptTemplate
    from pymongo import MongoClient

    prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Do not return any answers from your own knowledge.

{context}
Question: {question}
"""
    # Create the prompt for the LLM
    PROMPT = PromptTemplate(
        template=prompt_template, input_variables=["context", "question"]
    )

    # Add your connection string in srv format below in place of URI
    client = MongoClient("URI")
    db = client["embeddings"]

    embeddings = VertexAIEmbeddings(model_name="text-embedding-004")

    # Initialize the vector store
    vs = MongoDBAtlasVectorSearch(
        collection=db["sample_starwars_embeddings"],
        embedding=embeddings,
        index_name="vector_index",
        embedding_key="embedding",
        text_key="chunk",
    )

    # Initialize the LLM
    llm = ChatVertexAI(
        model_name="gemini-1.5-pro",
        convert_system_message_to_human=True,
        max_output_tokens=1000,
    )

    # Initialize the retriever for the vector store created above
    retriever = vs.as_retriever(
        search_type="mmr", search_kwargs={"k": 10, "lambda_mult": 0.25}
    )
    # Keep the last five exchanges as conversation history
    memory = ConversationBufferWindowMemory(
        memory_key="chat_history", k=5, return_messages=True
    )

    # Initialize the conversation chain
    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        memory=memory,
        combine_docs_chain_kwargs={"prompt": PROMPT},
    )

    # Query the conversation chain and return its response
    response = conversation_chain.invoke({"question": query})

    return response
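The retriever above uses search_type="mmr" (maximal marginal relevance) with lambda_mult=0.25, which favors diverse results over near-duplicates. To make that trade-off concrete, here is a small, self-contained sketch of MMR selection in plain Python. It is an illustrative re-implementation under simplified assumptions, not LangChain's own code, and the function names `cosine` and `mmr_select` are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: list[float],
    doc_vecs: list[list[float]],
    k: int = 10,
    lambda_mult: float = 0.25,
) -> list[int]:
    """Return indices of up to k documents chosen by maximal marginal relevance.

    lambda_mult=1.0 ranks purely by relevance to the query;
    lambda_mult=0.0 ranks purely by dissimilarity to already selected documents.
    """
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = candidates[0], -math.inf
        for idx in candidates:
            relevance = cosine(query_vec, doc_vecs[idx])
            # Highest similarity to anything already picked (0.0 when nothing is picked)
            redundancy = max(
                (cosine(doc_vecs[idx], doc_vecs[s]) for s in selected), default=0.0
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected
```

With a low lambda_mult such as 0.25, a chunk that nearly duplicates an already selected chunk is penalized heavily, so the LLM receives a broader set of context passages.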

Define an agent

Vertex AI's Reasoning Engine agent goes beyond decision-making tools: it turns LangChain agents into versatile AI assistants that can handle data, connect to systems, and make complex decisions, all while understanding and responding to text. This lets you tailor them to specific tasks, such as choosing the right tool for the job. Pairing a powerful language model like Gemini with a reasoning agent lets the agent understand and generate natural language, which strengthens both its communication and its information processing.
By incorporating a reasoning layer, your agent uses the provided tools to guide the end user toward their objective. You can define multiple tools at the same time, and the LLM will determine which tool to use based on its relevance to the question being asked and the description provided in the tool itself. We are using the default LangchainAgent class, which can be further customized based on your requirements.
Figure 2: Workflow for the above use case we discussed from end to end
With the below code, we will initialize the agent for tools to perform vector search on MongoDB collections. The star_wars_query_tool will read from the sample_starwars_embeddings collection. Similarly, create a tool to read from the sample_startrek_embeddings collection. The Reasoning Engine will redirect the query to read from the Star Wars or Star Trek collection based on the reasoning and prompt set by the user while creating the tools.
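from vertexai.preview import reasoning_engines

# Initialize the agent with the model and both MongoDB query tools
agent = reasoning_engines.LangchainAgent(
    model=model,
    tools=[star_wars_query_tool, star_trek_query_tool],
    agent_executor_kwargs={"return_intermediate_steps": True},
)
# Test the agent locally before deploying it
agent.query(input="tell me about star wars?")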
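Because the LLM chooses a tool based on the tool's name and description, every tool needs its own descriptive name and docstring. One way to avoid copying the whole function for each collection is a small factory, sketched below. The names `make_query_tool` and `run_query` are hypothetical, `run_query` stands in for the retrieval logic shown earlier, and whether a dynamically named function is picked up the same way as a hand-written one by your version of the SDK is an assumption you should verify.

```python
from typing import Callable


def make_query_tool(name: str, topic: str, run_query: Callable[[str], dict]):
    """Build a tool function whose name and docstring describe one topic.

    run_query is a hypothetical callable that performs the actual
    vector search and LLM call for the matching collection.
    """
    def tool(query: str) -> dict:
        return run_query(query)

    # The agent reads the function name and docstring as the tool description
    tool.__name__ = name
    tool.__qualname__ = name
    tool.__doc__ = (
        f"Retrieves vectors from a MongoDB database and uses them to answer "
        f"a question related to {topic}.\n\n"
        f"Args:\n    query: The question to be answered about {topic}.\n\n"
        f"Returns:\n    A dictionary containing the response to the question.\n"
    )
    return tool
```

A second tool could then be created with something like `make_query_tool("star_trek_query_tool", "Star Trek", run_star_trek_query)`, where `run_star_trek_query` is your own function that reads from the sample_startrek_embeddings collection.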

Deploy on Reasoning Engine

With the model, tools, and reasoning logic defined and tested locally, it's time to deploy your agent as a remote service on Vertex AI. We have used the following call and package requirements:
remote_agent = reasoning_engines.ReasoningEngine.create(
    agent,
    requirements=[
        "google-cloud-aiplatform[langchain,reasoningengine]",
        "cloudpickle==3.0.0",
        "pydantic==2.7.4",
        "langchain-mongodb",
        "pymongo",
        "langchain-google-vertexai",
    ],
)
The output will include the deployment details for the Reasoning Engine that can be used to implement the user application.
INFO:vertexai.reasoning_engines._reasoning_engines:reasoning_engine = vertexai.preview.reasoning_engines.ReasoningEngine('projects/project-id/locations/us-central1/reasoningEngines/reasoning-engine-id')

Use the resource name from this output to connect to the deployed agent and query it:
from vertexai.preview import reasoning_engines

REASONING_ENGINE_RESOURCE_NAME = "projects/project-id/locations/us-central1/reasoningEngines/reasoning-engine-id"
remote_agent = reasoning_engines.ReasoningEngine(REASONING_ENGINE_RESOURCE_NAME)
response = remote_agent.query(input="Tell me about episode 1 from star wars")
You can also debug and optimize your agents by enabling tracing in the Reasoning Engine. View the notebook that explains how you can use Cloud Trace for exploring the tracing data to get insights.
Every aspect of your agent is customizable, from core instructions and starting prompts to managing conversation history for a seamless, context-aware experience across multiple queries. Follow the instructions in the Python notebook of the GitHub repository to create your own agent. The solution in this post can be easily extended to an agent with multiple LangChain tools of any kind (like function calling and extensions) and to an application with multiple agents. We will talk about multi-agent applications with MongoDB and Google Cloud in detail in our follow-up articles.
Want $500 in credits for the Google Marketplace? Simply check out our program, subscribe to Atlas, and claim your credits today, and try out Atlas on the GCP marketplace for your new workload.