Get Started with the LangChain JS/TS Integration

Note

This tutorial uses LangChain's JavaScript library. For a tutorial that uses the Python library, see Get Started with the LangChain Integration.

You can integrate Atlas Vector Search with LangChain to build LLM applications and implement retrieval-augmented generation (RAG). This tutorial demonstrates how to start using Atlas Vector Search with LangChain to perform semantic search on your data and build a RAG implementation. Specifically, you perform the following actions:

Set up the environment.
Store custom data on Atlas.
Create an Atlas Vector Search index on your data.
Run the following vector search queries:
- Semantic search.
- Semantic search with metadata pre-filtering.
- Maximal Marginal Relevance (MMR) search.
Implement RAG by using Atlas Vector Search to answer questions on your data.

Background

LangChain is an open-source framework that simplifies the creation of LLM applications through the use of "chains." Chains are LangChain-specific components that can be combined for a variety of AI use cases, including RAG.

By integrating Atlas Vector Search with LangChain, you can use Atlas as a vector database and use Atlas Vector Search to implement RAG by retrieving semantically similar documents from your data. To learn more about RAG, see Key Concepts.

Prerequisites

To complete this tutorial, you must have the following:

An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs).
An OpenAI API Key. You must have a paid OpenAI account with credits available for API requests.
A terminal and code editor to run your Node.js project.
npm and Node.js installed.

Set Up the Environment

Before you begin, set up the environment for this tutorial by completing the following steps.

Initialize your Node.js project.

Run the following commands in your terminal to create a new directory named langchain-mongodb and initialize your project:

mkdir langchain-mongodb
cd langchain-mongodb
npm init -y

Install and import dependencies.

Run the following command. The fs module is built into Node.js, so it does not need to be installed from npm:

npm install langchain @langchain/mongodb @langchain/openai pdf-parse

Update your `package.json` file.

In your project's package.json file, specify the type field as shown in the following example, and then save the file.

{
   "name": "langchain-mongodb",
   "type": "module",
   ...

Create a file named `get-started.js` and paste the following code.

In your project, create a file named get-started.js, and then copy and paste the following code into the file. You will add code to this file throughout the tutorial.

This initial code snippet imports the packages required for this tutorial, defines environment variables, and creates the MongoDB client that connects to your Atlas cluster.

import { formatDocumentsAsString } from "langchain/util/document";
import { MongoClient } from "mongodb";
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { OpenAIEmbeddings, ChatOpenAI } from "@langchain/openai";
import { PDFLoader } from "langchain/document_loaders/fs/pdf";
import { PromptTemplate } from "@langchain/core/prompts";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { RunnableSequence, RunnablePassthrough } from "@langchain/core/runnables";
import { StringOutputParser } from "@langchain/core/output_parsers";
import * as fs from 'fs';
process.env.OPENAI_API_KEY = "<api-key>";
process.env.ATLAS_CONNECTION_STRING = "<connection-string>";
const client = new MongoClient(process.env.ATLAS_CONNECTION_STRING);

Replace the placeholder values.

To finish setting up the environment, replace the <api-key> and <connection-string> placeholder values in get-started.js with your OpenAI API Key and the SRV connection string for your Atlas cluster. Your connection string should use the following format:

mongodb+srv://<username>:<password>@<clusterName>.<hostname>.mongodb.net
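If your database username or password contains reserved characters such as @, :, or /, those characters must be percent-encoded before they go into the connection string. The following is a minimal sketch of one way to assemble the string. The helper name buildSrvConnectionString and the sample credentials and host are hypothetical and are not part of the tutorial code.

```javascript
// Hypothetical helper (not part of the tutorial): assembles an SRV connection
// string and percent-encodes the credentials so that reserved characters in
// the username or password do not break URI parsing.
function buildSrvConnectionString(username, password, host) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `mongodb+srv://${user}:${pass}@${host}`;
}

// Example with made-up values; replace them with your own.
const uri = buildSrvConnectionString(
  "appUser",
  "p@ss:w/rd",
  "cluster0.abcde.mongodb.net"
);
console.log(uri);
```

You could assign the result to process.env.ATLAS_CONNECTION_STRING instead of pasting a hand-edited string.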

Use Atlas as a Vector Store

In this section, you define an asynchronous function to load custom data into Atlas and instantiate Atlas as a vector database, also called a vector store. Add the following code into your get-started.js file.

Note

For this tutorial, you use a publicly accessible PDF document titled MongoDB Atlas Best Practices as the data source for your vector store. This document describes various recommendations and core concepts for managing your Atlas deployments.

This code performs the following actions:

Configures your Atlas collection by specifying the following parameters:
- langchain_db.test as the Atlas collection to store the documents.
- vector_index as the index to use for querying the vector store.
- text as the name of the field containing the raw text content.
- embedding as the name of the field containing the vector embeddings.
Prepares your custom data by doing the following:
- Retrieves raw data from the specified URL and saves it as a PDF file.
- Loads the PDF and uses a text splitter to split the data into smaller documents.
- Specifies chunk parameters, which determine the number of characters in each document and the number of characters that should overlap between two consecutive documents.
Creates a vector store from the sample documents by calling the MongoDBAtlasVectorSearch.fromDocuments method. This method specifies the following parameters:
- The sample documents to store in the vector database.
- OpenAI's embedding model as the model used to convert text into vector embeddings for the embedding field.
- Your Atlas configuration.
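To make the chunk parameters more concrete, the following sketch splits a string into fixed-size character windows that overlap. It is a simplified illustration only: RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and spaces, which this sketch does not do. The function name chunkWithOverlap is hypothetical.

```javascript
// Simplified illustration of chunkSize and chunkOverlap (not the LangChain
// implementation): each window starts (chunkSize - chunkOverlap) characters
// after the previous one, so consecutive chunks share chunkOverlap characters.
function chunkWithOverlap(text, chunkSize, chunkOverlap) {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const step = chunkSize - chunkOverlap;
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

console.log(chunkWithOverlap("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

With the tutorial's values of 200 and 20, each chunk in this simplified model would repeat the last 20 characters of the chunk before it.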

async function run() {
  try {
    // Configure your Atlas collection
    const database = client.db("langchain_db");
    const collection = database.collection("test");
    const dbConfig = {  
      collection: collection,
      indexName: "vector_index", // The name of the Atlas search index to use.
      textKey: "text", // Field name for the raw text content. Defaults to "text".
      embeddingKey: "embedding", // Field name for the vector embeddings. Defaults to "embedding".
    };
    
    // Ensure that the collection is empty
    await collection.deleteMany({});
    // Save online PDF as a file
    const rawData = await fetch("https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RE4HkJP");
    const pdfBuffer = await rawData.arrayBuffer();
    const pdfData = Buffer.from(pdfBuffer);
    fs.writeFileSync("atlas_best_practices.pdf", pdfData);
    // Load and split the sample data
    const loader = new PDFLoader(`atlas_best_practices.pdf`);
    const data = await loader.load();
    const textSplitter = new RecursiveCharacterTextSplitter({
      chunkSize: 200,
      chunkOverlap: 20,
    });
    const docs = await textSplitter.splitDocuments(data);
    // Instantiate Atlas as a vector store
    const vectorStore = await MongoDBAtlasVectorSearch.fromDocuments(docs, new OpenAIEmbeddings(), dbConfig);
  } finally {
    // Ensure that the client will close when you finish/error
    await client.close();
  }
}
run().catch(console.dir);

Save the file, then run the following command to load your data into Atlas.

node get-started.js

Tip

After running get-started.js, you can view your vector embeddings in the Atlas UI by navigating to the langchain_db.test collection in your cluster.

Create the Atlas Vector Search Index

To enable vector search queries on your vector store, create an Atlas Vector Search index on the langchain_db.test collection.

Required Access

To create an Atlas Vector Search index, you must have Project Data Access Admin or higher access to the Atlas project.

Procedure

In Atlas, go to the Clusters page for your project.

If it is not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it is not already displayed, select your desired project from the Projects menu in the navigation bar.
If the Clusters page is not already displayed, click Database in the sidebar.

Go to the Atlas Search page for your cluster.

Click your cluster's name.
Click the Atlas Search tab.

Define the Atlas Vector Search index.

Click Create Search Index.
Under Atlas Vector Search, select JSON Editor and then click Next.
In the Database and Collection section, find the langchain_db database, and select the test collection.
In the Index Name field, enter vector_index.

Replace the default definition with the following index definition and then click Next.

This index definition specifies indexing the following fields in an index of the vectorSearch type:

embedding field as the vector type. The embedding field contains the embeddings created using OpenAI's text-embedding-ada-002 embedding model. The index definition specifies 1536 vector dimensions and measures similarity using cosine.
loc.pageNumber field as the filter type for pre-filtering data by the page number in the PDF.

{
   "fields":[
      {
         "type": "vector",
         "path": "embedding",
         "numDimensions": 1536,
         "similarity": "cosine"
      },
      {
         "type": "filter",
         "path": "loc.pageNumber"
      }
   ]
}
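As background on the similarity setting in this definition, cosine similarity compares the direction of two vectors and ignores their length. The following standalone sketch computes it for two plain arrays. The function name cosineSimilarity is hypothetical, and the two-element vectors stand in for the 1536-dimension embeddings used in this tutorial.

```javascript
// Illustrative only: cosine similarity between two equal-length vectors.
// Vectors in the index and query vectors must have the same number of
// dimensions, which is what numDimensions declares in the index definition.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero-length vector would make this a division by zero (NaN).
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```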

Review the index definition and then click Create Search Index.

A modal window displays to let you know that your index is building.

Click Close to close the You're All Set! modal window.

In your `get-started.js` file, add the following code.

Return to the get-started.js file and add the following code to the asynchronous function that you defined. This code helps to ensure that your search index has synced to your data before it's used.

// Wait for Atlas to sync index
console.log("Waiting for initial sync...");
await new Promise(resolve => setTimeout(() => {
  resolve();
}, 10000));
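A fixed delay can be too short for a slow index build or needlessly long for a fast one. As an alternative, the following sketch polls a condition until it succeeds or a timeout elapses. The helper name waitUntil and its options are hypothetical. The commented usage assumes that your version of the MongoDB Node.js driver provides collection.listSearchIndexes() and that the returned index description reports a queryable field, so verify both before relying on it.

```javascript
// Hypothetical polling helper: calls `check` repeatedly until it returns a
// truthy value, and throws if that does not happen before the timeout.
async function waitUntil(check, { intervalMs = 1000, timeoutMs = 60000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result) return result;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Assumed usage inside run(), in place of the fixed 10-second delay:
//
// await waitUntil(async () => {
//   const [index] = await collection.listSearchIndexes("vector_index").toArray();
//   return index?.queryable === true;
// });
```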

Run Vector Search Queries

This section demonstrates various queries that you can run on your vectorized data. Now that you've created the index, add the following code to your asynchronous function to run vector search queries against your data.
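For orientation, Atlas Vector Search queries run as a $vectorSearch aggregation stage, and a pre-filter is expressed in that stage's filter field. The following standalone sketch shows roughly what such a stage could look like for this tutorial's index. It is not the tutorial's query code: the helper name buildVectorSearchStage, the default limit, and the numCandidates heuristic are assumptions, and the LangChain integration is expected to build a comparable stage for you.

```javascript
// Hypothetical helper: builds a $vectorSearch stage for the vector_index
// defined earlier, optionally pre-filtering on the indexed loc.pageNumber field.
function buildVectorSearchStage({ queryVector, limit = 4, pageNumber }) {
  const stage = {
    $vectorSearch: {
      index: "vector_index",
      path: "embedding",
      queryVector,
      numCandidates: limit * 10, // assumed heuristic: consider 10x the results
      limit,
    },
  };
  if (pageNumber !== undefined) {
    // The field used here must be indexed with the "filter" type.
    stage.$vectorSearch.filter = { "loc.pageNumber": { $eq: pageNumber } };
  }
  return stage;
}

// A short made-up vector stands in for a real 1536-dimension embedding.
console.log(JSON.stringify(buildVectorSearchStage({ queryVector: [0.1, 0.2], pageNumber: 17 }), null, 2));
```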

Note

If you experience inaccurate results when querying your data, your index might be taking longer than expected to sync. Increase the number in the setTimeout function to allow more time for the initial sync.
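Maximal Marginal Relevance (MMR) balances relevance to the query against redundancy among the results that were already chosen. The following is a minimal, self-contained sketch of the greedy selection idea on small made-up vectors. It is an illustration under stated assumptions, not the LangChain implementation, and the names maximalMarginalRelevance and lambda are chosen here for clarity.

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy MMR sketch: at each step, pick the candidate that maximizes
//   lambda * sim(query, doc) - (1 - lambda) * max sim(doc, alreadySelected).
// lambda = 1 ranks purely by relevance; lower values favor diversity.
// Returns the indexes of the selected documents in selection order.
function maximalMarginalRelevance(queryVec, docVecs, k, lambda = 0.5) {
  const selected = [];
  const remaining = docVecs.map((_, i) => i);
  while (selected.length < k && remaining.length > 0) {
    let bestIdx = -1;
    let bestScore = -Infinity;
    for (const i of remaining) {
      const relevance = cosine(queryVec, docVecs[i]);
      const redundancy = selected.length === 0
        ? 0
        : Math.max(...selected.map((j) => cosine(docVecs[i], docVecs[j])));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIdx = i;
      }
    }
    selected.push(bestIdx);
    remaining.splice(remaining.indexOf(bestIdx), 1);
  }
  return selected;
}

// Documents 0 and 1 are near-duplicates; document 2 is less relevant but different.
const docs = [[0.9, 0.1], [0.9, 0.12], [0.7, -0.7]];
console.log(maximalMarginalRelevance([1, 0], docs, 2, 0.5)); // [ 0, 2 ]
console.log(maximalMarginalRelevance([1, 0], docs, 2, 1));   // [ 0, 1 ]
```

In this example, the balanced setting skips the near-duplicate in favor of the more distinct document, while a lambda of 1 behaves like a plain relevance ranking.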

Answer Questions on Your Data

This section demonstrates two different RAG implementations using Atlas Vector Search and LangChain. Now that you've used Atlas Vector Search to retrieve semantically similar documents, use the following code examples to prompt the LLM to answer questions against the documents returned by Atlas Vector Search.
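As a rough illustration of the step between retrieval and generation in RAG, the following standalone sketch joins the text of retrieved documents into a context string and places it in a prompt together with the question. The helper names formatDocs and buildPrompt, the prompt wording, and the sample documents are hypothetical. In the tutorial's imports, formatDocumentsAsString and PromptTemplate are the pieces intended for this job, and the resulting prompt is what gets sent to the LLM.

```javascript
// Joins the text of each retrieved document into a single context string.
function formatDocs(docs) {
  return docs.map((doc) => doc.pageContent).join("\n\n");
}

// Builds a plain-text prompt that asks the model to answer from the context only.
function buildPrompt(question, docs) {
  return [
    "Answer the question based only on the following context:",
    formatDocs(docs),
    `Question: ${question}`,
  ].join("\n\n");
}

// Made-up documents standing in for vector search results.
const retrieved = [
  { pageContent: "First retrieved chunk.", metadata: { loc: { pageNumber: 1 } } },
  { pageContent: "Second retrieved chunk.", metadata: { loc: { pageNumber: 2 } } },
];
console.log(buildPrompt("What does the document recommend?", retrieved));
```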

Next Steps

MongoDB also provides the following developer resources:
