Get Started with the LangChainGo Integration
You can integrate Atlas Vector Search with LangChainGo to build large language model (LLM) applications and implement retrieval-augmented generation (RAG). This tutorial demonstrates how to start using Atlas Vector Search with LangChainGo to perform semantic search on your data and build a RAG implementation. Specifically, you perform the following actions:
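Set up the environment.
Store custom data on Atlas.
Create an Atlas Vector Search index on your data.
Run the following vector search queries:
Semantic search.
Semantic search with metadata pre-filtering.
Implement RAG by using Atlas Vector Search to answer questions on your data.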
Background
LangChainGo is the Go programming language implementation of LangChain. It is a community-driven, third-party port of the LangChain framework.
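LangChain is an open-source framework that simplifies the creation of LLM applications through the use of "chains". Chains are LangChain-specific components that can be combined for a variety of AI use cases, including RAG.
By integrating Atlas Vector Search with LangChain, you can use Atlas as a vector database and use Atlas Vector Search to implement RAG by retrieving semantically similar documents from your data. To learn more about RAG, see Retrieval-Augmented Generation (RAG) with Atlas Vector Search.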
LangChainGo facilitates the orchestration of LLMs for AI applications, bringing the capabilities of LangChain into the Go ecosystem. It also allows developers to connect to their preferred databases using vector stores, including MongoDB.
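Prerequisites
To complete this tutorial, you must have the following:
An Atlas account with a cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs). Ensure that your IP address is included in your Atlas project's access list. To learn more, see Create a Cluster.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
A terminal and code editor to run your Go project.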
Go installed on your machine.
Set Up the Environment
You must first set up the environment for this tutorial. Complete the following steps to set up your environment.
Initialize your environment variables.
In your langchaingo-mongodb project directory, create a .env file and add the following lines:
OPENAI_API_KEY="<api-key>"
ATLAS_CONNECTION_STRING="<connection-string>"
Replace the placeholder values with your OpenAI API Key and the SRV connection string for your Atlas cluster. Your connection string should use the following format:
mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/<dbname>
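A malformed connection string is a common cause of startup failures, so it can help to sanity-check the value before connecting. The following is a minimal sketch that uses only the Go standard library. The checkAtlasURI helper is hypothetical (it is not part of the MongoDB driver or LangChainGo), and the example URI contains made-up credentials and a made-up host.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// checkAtlasURI is a hypothetical helper that runs a few basic checks on an
// SRV connection string. It does not contact the cluster.
func checkAtlasURI(uri string) error {
	// Catches values copied from the tutorial without replacing placeholders.
	if strings.ContainsAny(uri, "<>") {
		return errors.New("connection string still contains <placeholder> values")
	}
	u, err := url.Parse(uri)
	if err != nil {
		return fmt.Errorf("connection string is not a valid URI: %w", err)
	}
	if u.Scheme != "mongodb+srv" {
		return fmt.Errorf("expected scheme mongodb+srv, got %q", u.Scheme)
	}
	if u.User == nil || u.User.Username() == "" {
		return errors.New("connection string has no username")
	}
	if u.Hostname() == "" {
		return errors.New("connection string has no host")
	}
	return nil
}

func main() {
	// Example value only; substitute your own credentials and cluster host.
	uri := "mongodb+srv://appUser:s3cret@cluster0.example.mongodb.net/langchaingo_db"
	if err := checkAtlasURI(uri); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("connection string looks well formed")
}
```

If your password contains reserved characters such as @ or /, percent-encode them before placing the password in the connection string.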
Use Atlas as a Vector Store
In this section, you load custom data into Atlas and instantiate Atlas as a vector database, also called a vector store.
Import the following dependencies.
Add the following imports to the top of your main.go file.
package main

import (
	"context"
	"log"
	"os"

	"github.com/joho/godotenv"
	"github.com/tmc/langchaingo/embeddings"
	"github.com/tmc/langchaingo/llms/openai"
	"github.com/tmc/langchaingo/schema"
	"github.com/tmc/langchaingo/vectorstores/mongovector"
	"go.mongodb.org/mongo-driver/v2/mongo"
	"go.mongodb.org/mongo-driver/v2/mongo/options"
)
Define the vector store details.
The following code performs these actions:
Configures Atlas as a vector store by specifying the following:
langchaingo_db.test as the collection in Atlas to store the documents.
vector_index as the index to use for querying the vector store.
text as the name of the field containing the raw text content.
embeddings as the name of the field containing the vector embeddings.
Prepares your custom data by doing the following:
Defines text for each document.
Uses LangChainGo's mongovector package to generate embeddings for the texts. This package stores document embeddings in MongoDB and enables searches on stored embeddings.
Constructs documents that include text, embeddings, and metadata.
Ingests the constructed documents into Atlas and instantiates the vector store.
Paste the following code into your main.go file:
// Defines the document structure
type Document struct {
	PageContent string            `bson:"text"`
	Embedding   []float32         `bson:"embeddings"` // matches the path passed to mongovector.WithPath below
	Metadata    map[string]string `bson:"metadata"`
}

func main() {
	const (
		openAIEmbeddingModel = "text-embedding-3-small"
		openAIEmbeddingDim   = 1536
		similarityAlgorithm  = "dotProduct"
		indexName            = "vector_index"
		databaseName         = "langchaingo_db"
		collectionName       = "test"
	)

	if err := godotenv.Load(); err != nil {
		log.Fatal("No .env file found")
	}

	// Loads the MongoDB URI from environment
	uri := os.Getenv("ATLAS_CONNECTION_STRING")
	if uri == "" {
		log.Fatal("Set your 'ATLAS_CONNECTION_STRING' environment variable in the .env file")
	}

	// Loads the API key from environment
	apiKey := os.Getenv("OPENAI_API_KEY")
	if apiKey == "" {
		log.Fatal("Set your OPENAI_API_KEY environment variable in the .env file")
	}

	// Connects to MongoDB Atlas
	client, err := mongo.Connect(options.Client().ApplyURI(uri))
	if err != nil {
		log.Fatalf("Failed to connect to server: %v", err)
	}
	defer func() {
		if err := client.Disconnect(context.Background()); err != nil {
			log.Fatalf("Error disconnecting the client: %v", err)
		}
	}()
	log.Println("Connected to MongoDB Atlas.")

	// Selects the database and collection
	coll := client.Database(databaseName).Collection(collectionName)

	// Creates an OpenAI LLM embedder client
	llm, err := openai.New(openai.WithEmbeddingModel(openAIEmbeddingModel))
	if err != nil {
		log.Fatalf("Failed to create an embedder client: %v", err)
	}

	// Creates an embedder from the embedder client
	embedder, err := embeddings.NewEmbedder(llm)
	if err != nil {
		log.Fatalf("Failed to create an embedder: %v", err)
	}

	// Creates a new MongoDB Atlas vector store
	store := mongovector.New(coll, embedder, mongovector.WithIndex(indexName), mongovector.WithPath("embeddings"))

	// Checks if the collection is empty, and if empty, adds documents to the MongoDB Atlas database vector store
	if isCollectionEmpty(coll) {
		documents := []schema.Document{
			{
				PageContent: "Proper tuber planting involves site selection, proper timing, and exceptional care. Choose spots with well-drained soil and adequate sun exposure. Tubers are generally planted in spring, but depending on the plant, timing varies. Always plant with the eyes facing upward at a depth two to three times the tuber's height. Ensure 4 inch spacing between small tubers, expand to 12 inches for large ones. Adequate moisture is needed, yet do not overwater. Mulching can help preserve moisture and prevent weed growth.",
				Metadata: map[string]any{
					"author": "A",
					"type":   "post",
				},
			},
			{
				PageContent: "Successful oil painting necessitates patience, proper equipment, and technique. Begin with a carefully prepared, primed canvas. Sketch your composition lightly before applying paint. Use high-quality brushes and oils to create vibrant, long-lasting artworks. Remember to paint 'fat over lean,' meaning each subsequent layer should contain more oil to prevent cracking. Allow each layer to dry before applying another. Clean your brushes often and avoid solvents that might damage them. Finally, always work in a well-ventilated space.",
				Metadata: map[string]any{
					"author": "B",
					"type":   "post",
				},
			},
			{
				PageContent: "For a natural lawn, selection of the right grass type suitable for your climate is crucial. Balanced watering, generally 1 to 1.5 inches per week, is important; overwatering invites disease. Opt for organic fertilizers over synthetic versions to provide necessary nutrients and improve soil structure. Regular lawn aeration helps root growth and prevents soil compaction. Practice natural pest control and consider overseeding to maintain a dense sward, which naturally combats weeds and pest.",
				Metadata: map[string]any{
					"author": "C",
					"type":   "post",
				},
			},
		}

		_, err := store.AddDocuments(context.Background(), documents)
		if err != nil {
			log.Fatalf("Error adding documents: %v", err)
		}
		log.Printf("Successfully added %d documents to the collection.\n", len(documents))
	} else {
		log.Println("Documents already exist in the collection, skipping document addition.")
	}
}

// Reports whether the collection contains no documents
func isCollectionEmpty(coll *mongo.Collection) bool {
	count, err := coll.EstimatedDocumentCount(context.Background())
	if err != nil {
		log.Fatalf("Failed to count documents in the collection: %v", err)
	}
	return count == 0
}
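The code above fixes the embedding size at 1536 through the openAIEmbeddingDim constant, and the vector search index you create later is defined with that same number of dimensions. As a rough illustration, the sketch below checks that every embedding has the expected length before ingestion. The checkDims helper is hypothetical and is not part of LangChainGo or the MongoDB driver; the vectors are zero-filled stand-ins rather than real embeddings.

```go
package main

import "fmt"

// checkDims is a hypothetical pre-ingest check: it reports the first
// embedding whose length differs from the expected number of dimensions.
func checkDims(vectors [][]float32, want int) error {
	for i, v := range vectors {
		if len(v) != want {
			return fmt.Errorf("embedding %d has %d dimensions, want %d", i, len(v), want)
		}
	}
	return nil
}

func main() {
	const openAIEmbeddingDim = 1536 // same value as the constant in the tutorial code

	// Zero-filled stand-ins for real embeddings.
	good := [][]float32{make([]float32, openAIEmbeddingDim)}
	bad := [][]float32{make([]float32, openAIEmbeddingDim), make([]float32, 768)}

	fmt.Println(checkDims(good, openAIEmbeddingDim)) // <nil>
	fmt.Println(checkDims(bad, openAIEmbeddingDim))  // embedding 1 has 768 dimensions, want 1536
}
```

A check like this is mainly useful if you later switch embedding models, because the index definition must then be updated to the new dimension count as well.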
Run your Go project.
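Save the file, then run the following command to load your data into Atlas.
go run main.go
Connected to MongoDB Atlas.
Successfully added 3 documents to the collection.
Tip
After running main.go, you can view your vector embeddings in the Atlas UI by navigating to the langchaingo_db.test collection in your cluster.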
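Create the Atlas Vector Search Index
Note
To create an Atlas Vector Search index, you must have Project Data Access Admin or higher access to the Atlas project.
To enable vector search queries on your vector store, create an Atlas Vector Search index on the langchaingo_db.test collection.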
Add the following imports to the top of your main.go file:
import (
	// Other imports...
	"fmt"
	"time"

	"go.mongodb.org/mongo-driver/v2/bson"
)
Define the following functions in your main.go file outside of your main() function. These functions create and manage a vector search index for your MongoDB collection:
The SearchIndexExists function checks if a search index with the specified name exists and is queryable.
The CreateVectorSearchIndex function creates a vector search index on the specified collection. This function blocks until the index is created and queryable.
// Checks if the search index exists
func SearchIndexExists(ctx context.Context, coll *mongo.Collection, idx string) (bool, error) {
	log.Println("Checking if search index exists.")

	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idx).SetType("vectorSearch")

	cursor, err := view.List(ctx, siOpts)
	if err != nil {
		return false, fmt.Errorf("failed to list search indexes: %w", err)
	}

	for cursor.Next(ctx) {
		index := struct {
			Name      string `bson:"name"`
			Queryable bool   `bson:"queryable"`
		}{}
		if err := cursor.Decode(&index); err != nil {
			return false, fmt.Errorf("failed to decode search index: %w", err)
		}
		if index.Name == idx && index.Queryable {
			return true, nil
		}
	}
	if err := cursor.Err(); err != nil {
		return false, fmt.Errorf("cursor error: %w", err)
	}
	return false, nil
}

// Creates a vector search index. This function blocks until the index has been
// created.
func CreateVectorSearchIndex(
	ctx context.Context,
	coll *mongo.Collection,
	idxName string,
	openAIEmbeddingDim int,
	similarityAlgorithm string,
) (string, error) {
	type vectorField struct {
		Type          string `bson:"type,omitempty"`
		Path          string `bson:"path,omitempty"`
		NumDimensions int    `bson:"numDimensions,omitempty"`
		Similarity    string `bson:"similarity,omitempty"`
	}

	fields := []vectorField{
		{
			Type:          "vector",
			Path:          "embeddings",
			NumDimensions: openAIEmbeddingDim,
			Similarity:    similarityAlgorithm,
		},
		{
			Type: "filter",
			Path: "metadata.author",
		},
		{
			Type: "filter",
			Path: "metadata.type",
		},
	}

	def := struct {
		Fields []vectorField `bson:"fields"`
	}{
		Fields: fields,
	}

	log.Println("Creating vector search index...")

	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idxName).SetType("vectorSearch")

	searchName, err := view.CreateOne(ctx, mongo.SearchIndexModel{Definition: def, Options: siOpts})
	if err != nil {
		return "", fmt.Errorf("failed to create the search index: %w", err)
	}

	// Awaits the creation of the index
	var doc bson.Raw
	for doc == nil {
		cursor, err := view.List(ctx, options.SearchIndexes().SetName(searchName))
		if err != nil {
			return "", fmt.Errorf("failed to list search indexes: %w", err)
		}
		if !cursor.Next(ctx) {
			break
		}

		name := cursor.Current.Lookup("name").StringValue()
		queryable := cursor.Current.Lookup("queryable").Boolean()
		if name == searchName && queryable {
			doc = cursor.Current
		} else {
			time.Sleep(5 * time.Second)
		}
	}

	return searchName, nil
}
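To see what the index definition built by CreateVectorSearchIndex looks like as a document, the following sketch mirrors the same struct with JSON tags and prints it. It is only an illustration using the standard library: the tutorial code sends the definition as BSON through the driver, and the tutorialIndexJSON helper name is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// vectorField mirrors the struct in CreateVectorSearchIndex, with JSON tags
// in place of BSON tags so the definition can be printed.
type vectorField struct {
	Type          string `json:"type,omitempty"`
	Path          string `json:"path,omitempty"`
	NumDimensions int    `json:"numDimensions,omitempty"`
	Similarity    string `json:"similarity,omitempty"`
}

// tutorialIndexJSON (hypothetical helper) returns the definition used in this
// tutorial as compact JSON.
func tutorialIndexJSON() string {
	def := struct {
		Fields []vectorField `json:"fields"`
	}{
		Fields: []vectorField{
			{Type: "vector", Path: "embeddings", NumDimensions: 1536, Similarity: "dotProduct"},
			{Type: "filter", Path: "metadata.author"},
			{Type: "filter", Path: "metadata.type"},
		},
	}
	b, err := json.Marshal(def)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(tutorialIndexJSON())
}
```

The single vector field describes the embeddings, and the two filter fields are what later allow pre-filtering on metadata.author and metadata.type.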
Create the vector store collection and index by calling the preceding functions in your main() function. Add the following code to the end of your main() function:
// SearchIndexExists will return true if the provided index is defined for the
// collection. This operation blocks until the search completes.
if ok, _ := SearchIndexExists(context.Background(), coll, indexName); !ok {
	// Creates the vector store collection
	err = client.Database(databaseName).CreateCollection(context.Background(), collectionName)
	if err != nil {
		log.Fatalf("failed to create vector store collection: %v", err)
	}

	_, err = CreateVectorSearchIndex(context.Background(), coll, indexName, openAIEmbeddingDim, similarityAlgorithm)
	if err != nil {
		log.Fatalf("failed to create index: %v", err)
	}
	log.Println("Successfully created vector search index.")
} else {
	log.Println("Vector search index already exists.")
}
Save the file, then run the following command to create your Atlas Vector Search index.
go run main.go
Checking if search index exists.
Creating vector search index...
Successfully created vector search index.
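The index is not queryable the moment it is created, which is why CreateVectorSearchIndex polls in a loop. Because the tutorial passes context.Background(), that loop has no deadline of its own. The following is a minimal sketch of the same wait pattern with a timeout, using only the standard library. The waitUntil and pollDemo names are hypothetical, and the check function is a stand-in for listing the search index and reading its queryable field.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitUntil calls check repeatedly, pausing for interval between calls, until
// check reports true, check returns an error, or ctx is cancelled.
func waitUntil(ctx context.Context, interval time.Duration, check func(context.Context) (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ok, err := check(ctx)
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// pollDemo simulates an index that becomes queryable on the readyAfter-th
// check and returns how many checks were made.
func pollDemo(readyAfter int) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	calls := 0
	err := waitUntil(ctx, 10*time.Millisecond, func(context.Context) (bool, error) {
		calls++
		return calls >= readyAfter, nil
	})
	return calls, err
}

func main() {
	calls, err := pollDemo(3)
	if err != nil {
		fmt.Println("gave up:", err)
		return
	}
	fmt.Println("queryable after", calls, "checks") // queryable after 3 checks
}
```

In the tutorial code, a similar effect could be obtained by passing a context created with context.WithTimeout to CreateVectorSearchIndex instead of context.Background().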
Tip
After running main.go, you can view your vector search index in the Atlas UI by navigating to the langchaingo_db.test collection in your cluster.
Run Vector Search Queries
This section demonstrates various queries that you can run on your vectorized data. Now that you've created the index, you can run vector search queries.
The following examples show a basic semantic search and a semantic search with filtering.
Basic Semantic Search
Add the following code to your main function and save the file.
Semantic search retrieves information that is meaningfully related to a query. The following code uses the SimilaritySearch() method to perform a semantic search for the string "Prevent weeds" and limits the results to the first document.
// Performs basic semantic search
docs, err := store.SimilaritySearch(context.Background(), "Prevent weeds", 1)
if err != nil {
	fmt.Println("Error performing search:", err)
}
fmt.Println("Semantic Search Results:", docs)
Run the following command to execute the query.
go run main.go
Semantic Search Results: [{For a natural lawn, selection of the right grass type suitable for your climate is crucial. Balanced watering, generally 1 to 1.5 inches per week, is important; overwatering invites disease. Opt for organic fertilizers over synthetic versions to provide necessary nutrients and improve soil structure. Regular lawn aeration helps root growth and prevents soil compaction. Practice natural pest control and consider overseeding to maintain a dense sward, which naturally combats weeds and pest. map[author:C type:post] 0.69752026}]
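The last value in the result, 0.69752026, is the relevance score. According to the Atlas Vector Search documentation, for the cosine and dotProduct similarity functions the score is normalized to the range 0 to 1 as (1 + similarity) / 2. The sketch below illustrates that relationship with tiny made-up vectors. The function names are hypothetical, and real embeddings have 1536 dimensions rather than 2.

```go
package main

import (
	"fmt"
	"math"
)

// dot returns the dot product of a and b. It assumes equal lengths.
func dot(a, b []float64) float64 {
	var sum float64
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// normalize scales v to unit length, which dotProduct similarity expects.
func normalize(v []float64) []float64 {
	norm := math.Sqrt(dot(v, v))
	out := make([]float64, len(v))
	for i := range v {
		out[i] = v[i] / norm
	}
	return out
}

// normalizedScore maps a dot product of unit vectors from [-1, 1] to [0, 1].
func normalizedScore(a, b []float64) float64 {
	return (1 + dot(a, b)) / 2
}

func main() {
	query := normalize([]float64{1, 0})
	same := normalize([]float64{2, 0})      // same direction as the query
	unrelated := normalize([]float64{0, 3}) // orthogonal to the query
	opposite := normalize([]float64{-1, 0}) // opposite direction

	fmt.Printf("%.2f %.2f %.2f\n",
		normalizedScore(query, same),
		normalizedScore(query, unrelated),
		normalizedScore(query, opposite)) // 1.00 0.50 0.00
}
```

Read in reverse, a score of about 0.70 corresponds to a dot product of about 0.40 between the query embedding and the document embedding.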
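Semantic Search with Filtering
You can pre-filter your data by using an MQL match expression that compares the indexed field with boolean, number, or string values. You must index any metadata fields that you want to filter by as the filter type. To learn more, see How to Index Fields for Vector Search.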
Add the following code to your main function and save the file.
The following code uses the SimilaritySearch() method to perform a semantic search for the string "Tulip care". It specifies the following parameters:
The number of documents to return as 1.
A score threshold of 0.60.
It returns the document that matches the filter metadata.type: post and meets the score threshold.
// Performs semantic search with metadata filter.
// Requires this additional import: "github.com/tmc/langchaingo/vectorstores"
filter := map[string]interface{}{
	"metadata.type": "post",
}

docs, err := store.SimilaritySearch(context.Background(), "Tulip care", 1,
	vectorstores.WithScoreThreshold(0.60),
	vectorstores.WithFilters(filter))
if err != nil {
	fmt.Println("Error performing search:", err)
}
fmt.Println("Filter Search Results:", docs)
Run the following command to execute the query.
go run main.go
Filter Search Results: [{Proper tuber planting involves site selection, proper timing, and exceptional care. Choose spots with well-drained soil and adequate sun exposure. Tubers are generally planted in spring, but depending on the plant, timing varies. Always plant with the eyes facing upward at a depth two to three times the tuber's height. Ensure 4 inch spacing between small tubers, expand to 12 inches for large ones. Adequate moisture is needed, yet do not overwater. Mulching can help preserve moisture and prevent weed growth. map[author:A type:post] 0.64432365}]
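Conceptually, the query above combines three constraints: a metadata filter, a minimum score, and a limit on the number of results. The sketch below applies the same three constraints to an in-memory list so that their interaction is easier to see. It is not how Atlas or LangChainGo implement the search: in Atlas the filter is applied as a pre-filter on indexed fields. All names, labels, and scores here are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical scored search hit.
type candidate struct {
	Label    string
	Metadata map[string]string
	Score    float64
}

// matches reports whether meta contains every key/value pair in filter.
func matches(meta, filter map[string]string) bool {
	for k, want := range filter {
		if meta[k] != want {
			return false
		}
	}
	return true
}

// search keeps candidates that match the filter and meet the threshold,
// then returns at most k of them, highest score first.
func search(cands []candidate, filter map[string]string, threshold float64, k int) []candidate {
	var kept []candidate
	for _, c := range cands {
		if matches(c.Metadata, filter) && c.Score >= threshold {
			kept = append(kept, c)
		}
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].Score > kept[j].Score })
	if len(kept) > k {
		kept = kept[:k]
	}
	return kept
}

func main() {
	// Made-up candidates and scores.
	cands := []candidate{
		{"tubers", map[string]string{"type": "post"}, 0.64},
		{"lawn", map[string]string{"type": "post"}, 0.58},
		{"draft", map[string]string{"type": "note"}, 0.71},
	}
	for _, c := range search(cands, map[string]string{"type": "post"}, 0.60, 1) {
		fmt.Printf("%s %.2f\n", c.Label, c.Score) // tubers 0.64
	}
}
```

Note that the highest-scoring candidate is excluded by the filter and the second post is excluded by the threshold, so a query can return fewer documents than the requested limit.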
Answer Questions on Your Data
This section demonstrates a RAG implementation using Atlas Vector Search and LangChainGo. Now that you've used Atlas Vector Search to retrieve semantically similar documents, use the following code example to prompt the LLM to answer questions against the documents returned by Atlas Vector Search.
Add the following code to the end of your main function and save the file.
This code does the following:
Instantiates Atlas Vector Search as a retriever to query for semantically similar documents.
Defines a LangChainGo prompt template to instruct the LLM to use the retrieved documents as context for your query. LangChainGo populates these documents into the {{.context}} input variable and your query into the {{.question}} variable.
Constructs a chain that uses OpenAI's chat model to generate context-aware responses based on the provided prompt template.
Sends a sample query about painting for beginners to the chain, using the prompt and the retriever to gather relevant context.
Returns and prints the LLM's response and the documents used as context.
// Implements RAG to answer questions on your data.
// Requires these additional imports: "strings",
// "github.com/tmc/langchaingo/chains", "github.com/tmc/langchaingo/prompts",
// and "github.com/tmc/langchaingo/vectorstores"
optionsVector := []vectorstores.Option{
	vectorstores.WithScoreThreshold(0.60),
}

retriever := vectorstores.ToRetriever(&store, 1, optionsVector...)

prompt := prompts.NewPromptTemplate(
	`Answer the question based on the following context:
{{.context}}
Question: {{.question}}`,
	[]string{"context", "question"},
)

llmChain := chains.NewLLMChain(llm, prompt)

ctx := context.Background()

const question = "How do I get started painting?"

// Retrieves the documents most relevant to the question
documents, err := retriever.GetRelevantDocuments(ctx, question)
if err != nil {
	log.Fatalf("Failed to retrieve documents: %v", err)
}

// Joins the retrieved documents into a single context string
var contextBuilder strings.Builder
for i, document := range documents {
	contextBuilder.WriteString(fmt.Sprintf("Document %d: %s\n", i+1, document.PageContent))
}
contextStr := contextBuilder.String()

inputs := map[string]interface{}{
	"context":  contextStr,
	"question": question,
}

out, err := chains.Call(ctx, llmChain, inputs)
if err != nil {
	log.Fatalf("Failed to run LLM chain: %v", err)
}

log.Println("Source documents:")
for i, doc := range documents {
	log.Printf("Document %d: %s\n", i+1, doc.PageContent)
}

responseText, ok := out["text"].(string)
if !ok {
	log.Println("Unexpected response type")
	return
}

log.Println("Question:", question)
log.Println("Generated Answer:", responseText)
After you save the file, run the following command. The generated response might vary.
go run main.go
Source documents:
Document 1: "Successful oil painting necessitates patience, proper equipment, and technique. Begin with a carefully prepared, primed canvas. Sketch your composition lightly before applying paint. Use high-quality brushes and oils to create vibrant, long-lasting artworks. Remember to paint 'fat over lean,' meaning each subsequent layer should contain more oil to prevent cracking. Allow each layer to dry before applying another. Clean your brushes often and avoid solvents that might damage them. Finally, always work in a well-ventilated space."
Question: How do I get started painting?
Generated Answer: To get started painting, you should begin with a carefully prepared, primed canvas. Sketch your composition lightly before applying paint. Use high-quality brushes and oils to create vibrant, long-lasting artworks. Remember to paint 'fat over lean,' meaning each subsequent layer should contain more oil to prevent cracking. Allow each layer to dry before applying another. Clean your brushes often and avoid solvents that might damage them. Finally, always work in a well-ventilated space.
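The {{.context}} and {{.question}} placeholders in the prompt use Go template syntax. To inspect the text that the LLM would receive without calling OpenAI, the following sketch renders the same template with the standard library's text/template package instead of LangChainGo. The renderPrompt helper is hypothetical, and the sample document is shortened for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// ragPrompt is the same template text used in the tutorial.
const ragPrompt = `Answer the question based on the following context:
{{.context}}
Question: {{.question}}`

// renderPrompt (hypothetical helper) numbers the retrieved documents, joins
// them into one context string, and fills in the template.
func renderPrompt(docs []string, question string) (string, error) {
	var contextBuilder strings.Builder
	for i, d := range docs {
		fmt.Fprintf(&contextBuilder, "Document %d: %s\n", i+1, d)
	}

	tmpl, err := template.New("rag").Parse(ragPrompt)
	if err != nil {
		return "", err
	}

	var out strings.Builder
	err = tmpl.Execute(&out, map[string]string{
		"context":  contextBuilder.String(),
		"question": question,
	})
	if err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Shortened stand-in for a retrieved document.
	docs := []string{"Begin with a carefully prepared, primed canvas."}

	text, err := renderPrompt(docs, "How do I get started painting?")
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(text)
}
```

Printing the rendered prompt this way can help when a generated answer looks wrong, because it shows whether the retrieved context actually contained the relevant information.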
After completing this tutorial, you have successfully integrated Atlas Vector Search with LangChainGo to build a RAG application. You have accomplished the following:
Initiated and configured the necessary environment to support your application
Stored custom data in Atlas and instantiated Atlas as a vector store
Built an Atlas Vector Search index on your data, enabling semantic search capabilities
Used vector embeddings to retrieve semantically relevant data
Enhanced search results by incorporating metadata filters
Implemented a RAG workflow using Atlas Vector Search to provide meaningful answers to questions based on your data
Next Steps
To learn more about getting started with Atlas Vector Search, see the Atlas Vector Search Quick Start, and then select Go from the drop-down menu.
To learn more about vector embeddings, see How to Create Vector Embeddings, and then select Go from the drop-down menu.
To learn how to integrate LangChainGo and Hugging Face, see Retrieval-Augmented Generation (RAG) with Atlas Vector Search.
To learn how to implement RAG without the need for API keys or credits, see Build a Local RAG Implementation with Atlas Vector Search.
MongoDB also provides the following developer resources:
See also:
To learn more about integrating LangChainGo, OpenAI, and MongoDB, see Using MongoDB Atlas as a Vector Store with OpenAI Embeddings.