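Get Started with the LangChainGo Integration

You can integrate Atlas Vector Search with LangChainGo to build large language model (LLM) applications and implement retrieval-augmented generation (RAG). This tutorial demonstrates how to start using Atlas Vector Search with LangChainGo to perform semantic search on your data and build a RAG implementation. Specifically, you perform the following actions:

  1. Set up the environment.

  2. Store custom data on Atlas.

  3. Create an Atlas Vector Search index on your data.

  4. Run the following vector search queries:

    • Semantic search.

    • Semantic search with metadata pre-filtering.

  5. Implement RAG by using Atlas Vector Search to answer questions on your data.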

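LangChainGo is the Go programming language implementation of LangChain. It is a community-driven, third-party port of the LangChain framework.

LangChain is an open-source framework that simplifies the creation of LLM applications through the use of "chains." Chains are LangChain-specific components that can be combined for a variety of AI use cases, including RAG.

By integrating Atlas Vector Search with LangChain, you can use Atlas as a vector database and use Atlas Vector Search to implement RAG by retrieving semantically similar documents from your data. To learn more about RAG, see Retrieval-Augmented Generation (RAG) with Atlas Vector Search.

LangChainGo facilitates the orchestration of LLMs for AI applications, bringing the capabilities of LangChain into the Go ecosystem. It also allows developers to connect to their preferred databases, including MongoDB, through vector stores.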

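To complete this tutorial, you must have the following:

  • An Atlas account with a cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs). Ensure that your IP address is included in your Atlas project's access list. To learn more, see Create a Cluster.

  • An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.

  • A terminal and code editor to run your Go project.

  • Go installed on your machine.

You must first set up the environment for this tutorial. Complete the following steps to set up your environment.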

1

Run the following commands in your terminal to create a new directory named langchaingo-mongodb and initialize your project:

mkdir langchaingo-mongodb
cd langchaingo-mongodb
go mod init langchaingo-mongodb
2

Run the following commands to install the project dependencies:

go get github.com/joho/godotenv
go get github.com/tmc/langchaingo/chains
go get github.com/tmc/langchaingo/llms
go get github.com/tmc/langchaingo/prompts
go get github.com/tmc/langchaingo/vectorstores/mongovector
go get go.mongodb.org/mongo-driver/v2/mongo
go mod tidy
3

In the langchaingo-mongodb project directory, create a .env file and add the following lines:

OPENAI_API_KEY="<api-key>"
ATLAS_CONNECTION_STRING="<connection-string>"

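Replace the placeholder values with your OpenAI API key and the SRV connection string for your Atlas cluster. Your connection string should use the following format: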

mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/<dbname>
4

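In the langchaingo-mongodb project directory, create a file named main.go. You will add code to this file throughout the tutorial.

In this section, you add code that loads custom data into Atlas and instantiates Atlas as a vector database, also called a vector store.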

1

Add the following imports to the top of your main.go file.

package main

import (
	"context"
	"log"
	"os"

	"github.com/joho/godotenv"
	"github.com/tmc/langchaingo/embeddings"
	"github.com/tmc/langchaingo/llms/openai"
	"github.com/tmc/langchaingo/schema"
	"github.com/tmc/langchaingo/vectorstores/mongovector"
	"go.mongodb.org/mongo-driver/v2/mongo"
	"go.mongodb.org/mongo-driver/v2/mongo/options"
)
2

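The following code performs these actions:

  • Configures Atlas as a vector store by specifying the following:

    • langchaingo_db.test as the collection in Atlas to store the documents.

    • vector_index as the index to use for querying the vector store.

    • text as the name of the field containing the raw text content.

    • embeddings as the name of the field containing the vector embeddings.

  • Prepares your custom data by doing the following:

    • Defines text for each document.

    • Uses LangChainGo's mongovector package to generate embeddings for the texts. This package stores document embeddings in MongoDB and enables searches on the stored embeddings.

    • Constructs documents that include text, embeddings, and metadata.

  • Ingests the constructed documents into Atlas and instantiates the vector store.

Paste the following code into your main.go file: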

// Defines the document structure. The embedding field name matches the path
// passed to mongovector.WithPath below.
type Document struct {
	PageContent string            `bson:"text"`
	Embedding   []float32         `bson:"embeddings"`
	Metadata    map[string]string `bson:"metadata"`
}

func main() {
	const (
		openAIEmbeddingModel = "text-embedding-3-small"
		openAIEmbeddingDim   = 1536
		similarityAlgorithm  = "dotProduct"
		indexName            = "vector_index"
		databaseName         = "langchaingo_db"
		collectionName       = "test"
	)
	if err := godotenv.Load(); err != nil {
		log.Fatal("No .env file found")
	}
	// Loads the MongoDB URI from environment
	uri := os.Getenv("ATLAS_CONNECTION_STRING")
	if uri == "" {
		log.Fatal("Set your 'ATLAS_CONNECTION_STRING' environment variable in the .env file")
	}
	// Loads the API key from environment
	apiKey := os.Getenv("OPENAI_API_KEY")
	if apiKey == "" {
		log.Fatal("Set your OPENAI_API_KEY environment variable in the .env file")
	}
	// Connects to MongoDB Atlas
	client, err := mongo.Connect(options.Client().ApplyURI(uri))
	if err != nil {
		log.Fatalf("Failed to connect to server: %v", err)
	}
	defer func() {
		if err := client.Disconnect(context.Background()); err != nil {
			log.Fatalf("Error disconnecting the client: %v", err)
		}
	}()
	log.Println("Connected to MongoDB Atlas.")
	// Selects the database and collection
	coll := client.Database(databaseName).Collection(collectionName)
	// Creates an OpenAI LLM embedder client
	llm, err := openai.New(openai.WithEmbeddingModel(openAIEmbeddingModel))
	if err != nil {
		log.Fatalf("Failed to create an embedder client: %v", err)
	}
	// Creates an embedder from the embedder client
	embedder, err := embeddings.NewEmbedder(llm)
	if err != nil {
		log.Fatalf("Failed to create an embedder: %v", err)
	}
	// Creates a new MongoDB Atlas vector store
	store := mongovector.New(coll, embedder, mongovector.WithIndex(indexName), mongovector.WithPath("embeddings"))
	// Checks if the collection is empty, and if empty, adds documents to the MongoDB Atlas database vector store
	if isCollectionEmpty(coll) {
		documents := []schema.Document{
			{
				PageContent: "Proper tuber planting involves site selection, proper timing, and exceptional care. Choose spots with well-drained soil and adequate sun exposure. Tubers are generally planted in spring, but depending on the plant, timing varies. Always plant with the eyes facing upward at a depth two to three times the tuber's height. Ensure 4 inch spacing between small tubers, expand to 12 inches for large ones. Adequate moisture is needed, yet do not overwater. Mulching can help preserve moisture and prevent weed growth.",
				Metadata: map[string]any{
					"author": "A",
					"type":   "post",
				},
			},
			{
				PageContent: "Successful oil painting necessitates patience, proper equipment, and technique. Begin with a carefully prepared, primed canvas. Sketch your composition lightly before applying paint. Use high-quality brushes and oils to create vibrant, long-lasting artworks. Remember to paint 'fat over lean,' meaning each subsequent layer should contain more oil to prevent cracking. Allow each layer to dry before applying another. Clean your brushes often and avoid solvents that might damage them. Finally, always work in a well-ventilated space.",
				Metadata: map[string]any{
					"author": "B",
					"type":   "post",
				},
			},
			{
				PageContent: "For a natural lawn, selection of the right grass type suitable for your climate is crucial. Balanced watering, generally 1 to 1.5 inches per week, is important; overwatering invites disease. Opt for organic fertilizers over synthetic versions to provide necessary nutrients and improve soil structure. Regular lawn aeration helps root growth and prevents soil compaction. Practice natural pest control and consider overseeding to maintain a dense sward, which naturally combats weeds and pest.",
				Metadata: map[string]any{
					"author": "C",
					"type":   "post",
				},
			},
		}
		_, err := store.AddDocuments(context.Background(), documents)
		if err != nil {
			log.Fatalf("Error adding documents: %v", err)
		}
		log.Printf("Successfully added %d documents to the collection.\n", len(documents))
	} else {
		log.Println("Documents already exist in the collection, skipping document addition.")
	}
}

// Reports whether the collection contains no documents
func isCollectionEmpty(coll *mongo.Collection) bool {
	count, err := coll.EstimatedDocumentCount(context.Background())
	if err != nil {
		log.Fatalf("Failed to count documents in the collection: %v", err)
	}
	return count == 0
}
3

Save the file, then run the following command to load your data into Atlas.

go run main.go
Connected to MongoDB Atlas.
Successfully added 3 documents to the collection.

Tip

After running main.go, you can view your vector embeddings in the Atlas UI by navigating to the langchaingo_db.test collection in your cluster.
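The tutorial code sets openAIEmbeddingDim to 1536, and the stored vectors need to have that same length for the index to be usable. As a rough sketch of a sanity check you could run on vectors you have read back, the following uses only the standard library. The helper name checkDims and the sample data are hypothetical.

```go
package main

import (
	"fmt"
)

// checkDims is a hypothetical helper. It reports the first vector whose
// length differs from the expected number of dimensions.
func checkDims(vectors [][]float32, want int) error {
	for i, v := range vectors {
		if len(v) != want {
			return fmt.Errorf("vector %d has %d dimensions, want %d", i, len(v), want)
		}
	}
	return nil
}

func main() {
	const dim = 1536 // same value as openAIEmbeddingDim in the tutorial

	// Stand-in data: one vector of the expected size and one that is too short.
	vectors := [][]float32{make([]float32, dim), make([]float32, 768)}
	if err := checkDims(vectors, dim); err != nil {
		fmt.Println("dimension check failed:", err)
	}
}
```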

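Note

To create an Atlas Vector Search index, you must have Project Data Access Admin or higher access to the Atlas project.

To enable vector search queries on your vector store, create an Atlas Vector Search index on the langchaingo_db.test collection.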
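For orientation, the index that this tutorial creates has one vector field and two filter fields. The sketch below prints a JSON rendering of that definition using only the standard library. The type names are hypothetical, and the JSON form is an illustration of the shape, since the tutorial itself sends the definition through the Go driver as BSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// indexField and indexDefinition are hypothetical types that mirror the
// fields used by the tutorial's index definition.
type indexField struct {
	Type          string `json:"type"`
	Path          string `json:"path"`
	NumDimensions int    `json:"numDimensions,omitempty"`
	Similarity    string `json:"similarity,omitempty"`
}

type indexDefinition struct {
	Fields []indexField `json:"fields"`
}

// tutorialIndexDefinition returns the values used in this tutorial.
func tutorialIndexDefinition() indexDefinition {
	return indexDefinition{
		Fields: []indexField{
			{Type: "vector", Path: "embeddings", NumDimensions: 1536, Similarity: "dotProduct"},
			{Type: "filter", Path: "metadata.author"},
			{Type: "filter", Path: "metadata.type"},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(tutorialIndexDefinition(), "", "  ")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```

The two filter entries are what later allow pre-filtering on metadata.author and metadata.type.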

Add the following imports to the top of your main.go file:

import (
	// Other imports...
	"fmt"
	"time"

	"go.mongodb.org/mongo-driver/v2/bson"
)

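Define the following functions outside the main() function in your main.go file. These functions create and manage a vector search index for your MongoDB collection:

  1. The SearchIndexExists function checks whether a search index with the specified name exists and is queryable.

  2. The CreateVectorSearchIndex function creates a vector search index on the specified collection. This function blocks until the index is created and queryable.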

// Checks if the search index exists
func SearchIndexExists(ctx context.Context, coll *mongo.Collection, idx string) (bool, error) {
	log.Println("Checking if search index exists.")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idx).SetType("vectorSearch")
	cursor, err := view.List(ctx, siOpts)
	if err != nil {
		return false, fmt.Errorf("failed to list search indexes: %w", err)
	}
	for cursor.Next(ctx) {
		index := struct {
			Name      string `bson:"name"`
			Queryable bool   `bson:"queryable"`
		}{}
		if err := cursor.Decode(&index); err != nil {
			return false, fmt.Errorf("failed to decode search index: %w", err)
		}
		if index.Name == idx && index.Queryable {
			return true, nil
		}
	}
	if err := cursor.Err(); err != nil {
		return false, fmt.Errorf("cursor error: %w", err)
	}
	return false, nil
}

// Creates a vector search index. This function blocks until the index has been
// created.
func CreateVectorSearchIndex(
	ctx context.Context,
	coll *mongo.Collection,
	idxName string,
	openAIEmbeddingDim int,
	similarityAlgorithm string,
) (string, error) {
	type vectorField struct {
		Type          string `bson:"type,omitempty"`
		Path          string `bson:"path,omitempty"`
		NumDimensions int    `bson:"numDimensions,omitempty"`
		Similarity    string `bson:"similarity,omitempty"`
	}
	fields := []vectorField{
		{
			Type:          "vector",
			Path:          "embeddings",
			NumDimensions: openAIEmbeddingDim,
			Similarity:    similarityAlgorithm,
		},
		{
			Type: "filter",
			Path: "metadata.author",
		},
		{
			Type: "filter",
			Path: "metadata.type",
		},
	}
	def := struct {
		Fields []vectorField `bson:"fields"`
	}{
		Fields: fields,
	}
	log.Println("Creating vector search index...")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idxName).SetType("vectorSearch")
	searchName, err := view.CreateOne(ctx, mongo.SearchIndexModel{Definition: def, Options: siOpts})
	if err != nil {
		return "", fmt.Errorf("failed to create the search index: %w", err)
	}
	// Awaits the creation of the index
	var doc bson.Raw
	for doc == nil {
		cursor, err := view.List(ctx, options.SearchIndexes().SetName(searchName))
		if err != nil {
			return "", fmt.Errorf("failed to list search indexes: %w", err)
		}
		if !cursor.Next(ctx) {
			break
		}
		name := cursor.Current.Lookup("name").StringValue()
		queryable := cursor.Current.Lookup("queryable").Boolean()
		if name == searchName && queryable {
			doc = cursor.Current
		} else {
			time.Sleep(5 * time.Second)
		}
	}
	return searchName, nil
}

Create the vector store collection and index by calling the preceding functions in your main() function. Add the following code to the end of your main() function:

// SearchIndexExists will return true if the provided index is defined for the
// collection. This operation blocks until the search completes.
if ok, _ := SearchIndexExists(context.Background(), coll, indexName); !ok {
	// Creates the vector store collection
	err = client.Database(databaseName).CreateCollection(context.Background(), collectionName)
	if err != nil {
		log.Fatalf("failed to create vector store collection: %v", err)
	}
	_, err = CreateVectorSearchIndex(context.Background(), coll, indexName, openAIEmbeddingDim, similarityAlgorithm)
	if err != nil {
		log.Fatalf("failed to create index: %v", err)
	}
	log.Println("Successfully created vector search index.")
} else {
	log.Println("Vector search index already exists.")
}

Save the file, then run the following command to create your Atlas Vector Search index.

go run main.go
Checking if search index exists.
Creating vector search index...
Successfully created vector search index.

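Tip

After running main.go, you can view your vector search index in the Atlas UI by navigating to the langchaingo_db.test collection in your cluster.

This section demonstrates various queries that you can run on your vectorized data. Now that you've created the index, you can run vector search queries.

The following examples show a basic semantic search and a semantic search with filtering. Each is shown as a separate, alternative snippet.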

1

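Semantic search retrieves information that is meaningfully related to your query. The following code uses the SimilaritySearch() method to perform a semantic search for the string "Prevent weeds" and limits the results to the first document.

// Performs basic semantic search
docs, err := store.SimilaritySearch(context.Background(), "Prevent weeds", 1)
if err != nil {
	fmt.Println("Error performing search:", err)
}
fmt.Println("Semantic Search Results:", docs)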
2
go run main.go
Semantic Search Results: [{For a natural lawn, selection of
the right grass type suitable for your climate is crucial.
Balanced watering, generally 1 to 1.5 inches per week, is
important; overwatering invites disease. Opt for organic
fertilizers over synthetic versions to provide necessary
nutrients and improve soil structure. Regular lawn aeration
helps root growth and prevents soil compaction. Practice
natural pest control and consider overseeding to maintain a
dense sward, which naturally combats weeds and pest.
map[author:C type:post] 0.69752026}]
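To build some intuition for how a similarity search ranks documents, the following toy sketch scores a few hand-written three-dimensional vectors against a query vector with a dot product and keeps the top results. It is an illustration only: the vectors, names, and the linear scan are assumptions, Atlas searches through its index instead of scanning every document, and the scores Atlas returns are normalized, so they are not expected to match these raw values.

```go
package main

import (
	"fmt"
	"sort"
)

// scored pairs a document ID with its similarity score.
type scored struct {
	ID    string
	Score float64
}

// dot returns the dot product of a and b. It assumes equal lengths.
func dot(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// topK scores every document against the query and returns the k highest.
func topK(query []float64, docs map[string][]float64, k int) []scored {
	results := make([]scored, 0, len(docs))
	for id, vec := range docs {
		results = append(results, scored{ID: id, Score: dot(query, vec)})
	}
	sort.Slice(results, func(i, j int) bool { return results[i].Score > results[j].Score })
	if k < len(results) {
		results = results[:k]
	}
	return results
}

func main() {
	// Made-up vectors; real embeddings in this tutorial have 1536 dimensions.
	docs := map[string][]float64{
		"gardening": {0.6, 0.8, 0},
		"painting":  {0, 0, 1},
		"lawn":      {1, 0, 0},
	}
	query := []float64{0.8, 0.6, 0}
	for _, r := range topK(query, docs, 1) {
		fmt.Printf("%s %.2f\n", r.ID, r.Score)
	}
}
```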

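You can pre-filter your data by using an MQL match expression that compares the indexed field with boolean, number, or string values. You must index any metadata fields that you want to filter by as the filter type. To learn more, see How to Index Fields for Vector Search.

1

Add the following dependencies to your main.go file:

import (
	// Other imports...
	"github.com/tmc/langchaingo/vectorstores"
)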
2

The following code uses the SimilaritySearch() method to perform a semantic search for the string "Tulip care". It specifies the following parameters:

  • The number of documents to return as 1.

  • A score threshold of 0.60.

It returns the document that matches the filter metadata.type: post and meets the score threshold.

// Performs semantic search with metadata filter
filter := map[string]interface{}{
	"metadata.type": "post",
}
docs, err := store.SimilaritySearch(context.Background(), "Tulip care", 1,
	vectorstores.WithScoreThreshold(0.60),
	vectorstores.WithFilters(filter))
if err != nil {
	fmt.Println("Error performing search:", err)
}
fmt.Println("Filter Search Results:", docs)
3
go run main.go
Filter Search Results: [{Proper tuber planting involves site
selection, proper timing, and exceptional care. Choose spots
with well-drained soil and adequate sun exposure. Tubers are
generally planted in spring, but depending on the plant,
timing varies. Always plant with the eyes facing upward at a
depth two to three times the tuber's height. Ensure 4 inch
spacing between small tubers, expand to 12 inches for large
ones. Adequate moisture is needed, yet do not overwater.
Mulching can help preserve moisture and prevent weed growth.
map[author:A type:post] 0.64432365}]
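The combined effect of the metadata filter, the score threshold, and the result limit can be pictured with a small in-memory sketch. Everything here is hypothetical: the candidate type, the search helper, and the scores are made up for illustration, and the sketch uses plain metadata keys such as type instead of the dotted path metadata.type that the tutorial passes to Atlas.

```go
package main

import "fmt"

// candidate is a hypothetical in-memory stand-in for a stored document.
type candidate struct {
	Text     string
	Metadata map[string]string
	Score    float64
}

// matches reports whether meta contains every key/value pair in filter.
func matches(meta, filter map[string]string) bool {
	for key, want := range filter {
		if meta[key] != want {
			return false
		}
	}
	return true
}

// search returns at most k candidates that match the filter and score at
// least minScore. It assumes cands is sorted by descending score.
func search(cands []candidate, filter map[string]string, minScore float64, k int) []candidate {
	var out []candidate
	for _, c := range cands {
		if len(out) == k {
			break
		}
		if c.Score < minScore || !matches(c.Metadata, filter) {
			continue
		}
		out = append(out, c)
	}
	return out
}

func main() {
	// Made-up scores, sorted from highest to lowest.
	cands := []candidate{
		{Text: "note on bulbs", Metadata: map[string]string{"type": "draft"}, Score: 0.71},
		{Text: "tuber planting", Metadata: map[string]string{"type": "post"}, Score: 0.64},
		{Text: "lawn care", Metadata: map[string]string{"type": "post"}, Score: 0.58},
	}
	for _, c := range search(cands, map[string]string{"type": "post"}, 0.60, 1) {
		fmt.Println(c.Text, c.Score)
	}
}
```

In this toy data the highest-scoring candidate is skipped because its metadata does not match, and the third is skipped because it falls below the threshold.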

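This section demonstrates a RAG implementation using Atlas Vector Search and LangChainGo. Now that you've used Atlas Vector Search to retrieve semantically similar documents, use the following code example to prompt the LLM to answer questions against the documents returned by Atlas Vector Search.

1

Add the following imports to the top of your main.go file.

import (
	// Other imports...
	"strings"

	"github.com/tmc/langchaingo/chains"
	"github.com/tmc/langchaingo/prompts"
	// Needed by vectorstores.ToRetriever below; skip if already imported.
	"github.com/tmc/langchaingo/vectorstores"
)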
2

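This code does the following:

  • Instantiates Atlas Vector Search as a retriever to query for semantically similar documents.

  • Defines a LangChainGo prompt template to instruct the LLM to use the retrieved documents as context for your query. The code passes these documents in the {{.context}} input variable and your query in the {{.question}} variable.

  • Constructs a chain that uses OpenAI's chat model to generate context-aware responses based on the provided prompt template.

  • Sends a sample query about painting for beginners to the chain, using the prompt and the retriever to gather relevant context.

  • Returns and prints the LLM's response and the documents used as context.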

// Implements RAG to answer questions on your data
optionsVector := []vectorstores.Option{
	vectorstores.WithScoreThreshold(0.60),
}
retriever := vectorstores.ToRetriever(&store, 1, optionsVector...)
prompt := prompts.NewPromptTemplate(
	`Answer the question based on the following context:
{{.context}}
Question: {{.question}}`,
	[]string{"context", "question"},
)
llmChain := chains.NewLLMChain(llm, prompt)
ctx := context.Background()
const question = "How do I get started painting?"
documents, err := retriever.GetRelevantDocuments(ctx, question)
if err != nil {
	log.Fatalf("Failed to retrieve documents: %v", err)
}
var contextBuilder strings.Builder
for i, document := range documents {
	contextBuilder.WriteString(fmt.Sprintf("Document %d: %s\n", i+1, document.PageContent))
}
contextStr := contextBuilder.String()
inputs := map[string]interface{}{
	"context":  contextStr,
	"question": question,
}
out, err := chains.Call(ctx, llmChain, inputs)
if err != nil {
	log.Fatalf("Failed to run LLM chain: %v", err)
}
log.Println("Source documents:")
for i, doc := range documents {
	log.Printf("Document %d: %s\n", i+1, doc.PageContent)
}
responseText, ok := out["text"].(string)
if !ok {
	log.Println("Unexpected response type")
	return
}
log.Println("Question:", question)
log.Println("Generated Answer:", responseText)
3

Save the file, then run the following command. The generated response might vary.

go run main.go
Source documents:
Document 1: "Successful oil painting necessitates patience,
proper equipment, and technique. Begin with a carefully
prepared, primed canvas. Sketch your composition lightly before
applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over
lean,' meaning each subsequent layer should contain more oil to
prevent cracking. Allow each layer to dry before applying
another. Clean your brushes often and avoid solvents that might
damage them. Finally, always work in a well-ventilated space."
Question: How do I get started painting?
Generated Answer: To get started painting, you should begin with a
carefully prepared, primed canvas. Sketch your composition lightly
before applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over lean,'
meaning each subsequent layer should contain more oil to prevent
cracking. Allow each layer to dry before applying another. Clean
your brushes often and avoid solvents that might damage them.
Finally, always work in a well-ventilated space.

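After completing this tutorial, you have successfully integrated Atlas Vector Search with LangChainGo to build a RAG application. You have accomplished the following:

  • Started and configured the necessary environment to support your application

  • Stored custom data in Atlas and instantiated Atlas as a vector store

  • Built an Atlas Vector Search index on your data, enabling semantic search capabilities

  • Used vector embeddings to retrieve semantically relevant data

  • Enhanced search results by incorporating metadata filters

  • Implemented a RAG workflow using Atlas Vector Search to provide meaningful answers to questions based on your data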

MongoDB also provides the following developer resources:

See also:

To learn more about integrating LangChainGo, OpenAI, and MongoDB, see Using MongoDB Atlas as a Vector Store with OpenAI Embeddings.