
Retrieval-Augmented Generation (RAG) with Atlas Vector Search

On this page

  • Why Use RAG?
  • RAG with Atlas Vector Search
  • Ingestion
  • Retrieval
  • Generation
  • Get Started
  • Prerequisites
  • Procedure
  • Next Steps
  • Fine-Tuning

Retrieval-augmented generation (RAG) is an architecture used to augment large language models (LLMs) with additional data so that they can generate more accurate responses. You can implement RAG in your generative AI applications by combining an LLM with a retrieval system powered by Atlas Vector Search.

Why Use RAG?

When working with LLMs, you might encounter the following limitations:

  • Stale data: LLMs are trained on a static dataset up to a certain point in time. This means that they have a limited knowledge base and might use out-of-date data.

  • No access to local data: LLMs don't have access to local or personalized data. As a result, they can lack knowledge about specific domains.

  • Hallucinations: When training data is incomplete or outdated, LLMs can generate inaccurate information.

You can address these limitations by implementing RAG with the following steps:

  1. Ingestion: Store your custom data as vector embeddings in a vector database, such as MongoDB Atlas. This allows you to create a knowledge base of up-to-date and personalized data.

  2. Retrieval: Retrieve semantically similar documents from the database based on the user's question by using a search solution such as Atlas Vector Search. These documents supply the LLM with additional, relevant data.

  3. Generation: Prompt the LLM. The LLM uses the retrieved documents as context to generate a more accurate and relevant response, reducing hallucinations.

Because RAG enables tasks such as question answering and text generation, it's an effective architecture for building AI chatbots that provide personalized, domain-specific responses. To create production-ready chatbots, you must configure a server to route requests and build a user interface on top of your RAG implementation.

RAG with Atlas Vector Search

To implement RAG with Atlas Vector Search, you ingest your data into Atlas, retrieve documents with Atlas Vector Search, and generate responses using an LLM. This section describes the components of a basic, or naive, RAG implementation with Atlas Vector Search. For step-by-step instructions, see Get Started.

Figure: RAG flowchart with Atlas Vector Search

Ingestion

Data ingestion for RAG involves processing your custom data and storing it in a vector database to prepare it for retrieval. To create a basic ingestion pipeline with Atlas as the vector database:

  1. Load the data.

    Use tools like document loaders and data connectors to load data from different data formats and locations.

  2. Split the data into chunks.

    Process, or chunk, your data. Chunking involves splitting the data into smaller parts to improve performance.

  3. Convert the data to vector embeddings.

    Convert your data into vector embeddings by using an embedding model. To learn more, see How to Create Vector Embeddings.

  4. Store the data and embeddings in Atlas.

    Store these embeddings in Atlas. You can store embeddings as a field alongside other data in your collection.

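Conceptually, the chunking step above is a sliding window over the text. The following standalone Go sketch is illustrative only: chunkText is a hypothetical helper (the Get Started tutorial below uses langchaingo's textsplitter package instead), and it shows how a chunk size and a chunk overlap interact:

```go
package main

import "fmt"

// chunkText splits text into chunks of at most chunkSize characters,
// where consecutive chunks share chunkOverlap characters. This mirrors
// the chunk size / chunk overlap parameters described above.
func chunkText(text string, chunkSize, chunkOverlap int) []string {
	var chunks []string
	step := chunkSize - chunkOverlap
	for start := 0; start < len(text); start += step {
		end := start + chunkSize
		if end > len(text) {
			end = len(text)
		}
		chunks = append(chunks, text[start:end])
		if end == len(text) {
			break
		}
	}
	return chunks
}

func main() {
	// Small values so the overlap is visible; real pipelines use
	// larger sizes (the tutorial below uses 400 with an overlap of 20).
	for _, c := range chunkText("MongoDB Atlas Vector Search enables RAG.", 16, 4) {
		fmt.Printf("%q\n", c)
	}
}
```

Overlap preserves context that would otherwise be cut at a chunk boundary, at the cost of storing some text twice.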

Retrieval

Building retrieval involves searching for and returning the most relevant documents from your vector database to augment the LLM. To retrieve relevant documents with Atlas Vector Search, you convert the user's question into vector embeddings and run a vector search query against your data in Atlas to find documents with the most similar embeddings.

To perform basic retrieval with Atlas Vector Search:

  1. Define an Atlas Vector Search index on the collection that contains your vector embeddings.

  2. Choose one of the following methods to retrieve documents based on the user's question:

Generation

To generate responses, combine your retrieval system with an LLM. After you perform a vector search to retrieve relevant documents, you provide the user's question along with the relevant documents as context to the LLM, so that it can generate a more accurate answer.

Choose one of the following methods to connect to an LLM:

Get Started

Learn how to develop a RAG system with Atlas Vector Search.

Video duration: 1.16 minutes

The following example demonstrates how to implement RAG with a retrieval system powered by Atlas Vector Search and open-source models from Hugging Face.




Prerequisites

To complete this example, you must have the following:

  • An Atlas account with a cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs). Ensure that your IP address is included in your Atlas project's access list. To learn more, see Create a Cluster.


  • A Hugging Face access token with read access.

  • An environment to run interactive Python notebooks such as Colab.

    Note

    If you're using Colab, ensure that your notebook session's IP address is included in your Atlas project's access list.

Procedure

1
  1. Initialize your Go project.

    Run the following commands in your terminal to create a new directory named rag-mongodb and initialize your project:

    mkdir rag-mongodb
    cd rag-mongodb
    go mod init rag-mongodb
  2. Install and import dependencies.

    Run the following commands:

    go get github.com/joho/godotenv
    go get go.mongodb.org/mongo-driver/mongo
    go get github.com/tmc/langchaingo/llms
    go get github.com/tmc/langchaingo/documentloaders
    go get github.com/tmc/langchaingo/embeddings/huggingface
    go get github.com/tmc/langchaingo/llms/huggingface
    go get github.com/tmc/langchaingo/prompts
  3. Create a .env file.

    In your project, create a .env file to store your Atlas connection string and Hugging Face access token.

    .env
    HUGGINGFACEHUB_API_TOKEN = "<access-token>"
    ATLAS_CONNECTION_STRING = "<connection-string>"

Replace the <access-token> placeholder value with your Hugging Face access token.

Replace the <connection-string> placeholder value with the SRV connection string for your Atlas cluster.

Your connection string should use the following format:

mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
2

In this section, you create a function that:

  • Loads the mxbai-embed-large-v1 embedding model from Hugging Face's model hub.

  • Creates vector embeddings from your input data.

  1. Run the following command to create a directory that stores common functions, including the function you'll reuse to create embeddings.

    mkdir common && cd common
  2. In the common directory, create a file named get-embeddings.go, and paste the following code into it:

    get-embeddings.go
    package common

    import (
        "context"
        "log"

        "github.com/tmc/langchaingo/embeddings/huggingface"
    )

    func GetEmbeddings(documents []string) [][]float32 {
        hf, err := huggingface.NewHuggingface(
            huggingface.WithModel("mixedbread-ai/mxbai-embed-large-v1"),
            huggingface.WithTask("feature-extraction"))
        if err != nil {
            log.Fatalf("failed to connect to Hugging Face: %v", err)
        }
        embs, err := hf.EmbedDocuments(context.Background(), documents)
        if err != nil {
            log.Fatalf("failed to generate embeddings: %v", err)
        }
        return embs
    }
3

In this section, you ingest sample data into Atlas that LLMs don't have access to. The following code uses the Go library for LangChain and the Go driver to:

  • Create an HTML file that contains a MongoDB earnings report.

  • Split the data into chunks, specifying the chunk size (number of characters) and chunk overlap (number of overlapping characters between consecutive chunks).

  • Create vector embeddings from the chunked data by using the GetEmbeddings function that you defined.

  • Store these embeddings alongside the chunked data in the rag_db.test collection in your Atlas cluster.

  1. Navigate to the root of the rag-mongodb project directory.

  2. Create a file named ingest-data.go in your project, and paste the following code into it:

    ingest-data.go
    package main

    import (
        "context"
        "fmt"
        "io"
        "log"
        "net/http"
        "os"

        "rag-mongodb/common" // Module that contains the embedding function

        "github.com/joho/godotenv"
        "github.com/tmc/langchaingo/documentloaders"
        "github.com/tmc/langchaingo/textsplitter"
        "go.mongodb.org/mongo-driver/mongo"
        "go.mongodb.org/mongo-driver/mongo/options"
    )

    type DocumentToInsert struct {
        PageContent string    `bson:"pageContent"`
        Embedding   []float32 `bson:"embedding"`
    }

    func downloadReport(filename string) {
        _, err := os.Stat(filename)
        if err == nil {
            return
        }
        url := "https://investors.mongodb.com/node/12236"
        fmt.Println("Downloading ", url, " to ", filename)
        resp, err := http.Get(url)
        if err != nil {
            log.Fatalf("failed to connect to download the report: %v", err)
        }
        defer func() { _ = resp.Body.Close() }()
        f, err := os.Create(filename)
        if err != nil {
            return
        }
        defer func() { _ = f.Close() }()
        _, err = io.Copy(f, resp.Body)
        if err != nil {
            log.Fatalf("failed to copy the report: %v", err)
        }
    }

    func main() {
        ctx := context.Background()
        filename := "investor-report.html"

        downloadReport(filename)

        f, err := os.Open(filename)
        if err != nil {
            log.Fatalf("failed to open the report: %v", err)
        }
        defer func() { _ = f.Close() }()

        html := documentloaders.NewHTML(f)
        split := textsplitter.NewRecursiveCharacter()
        split.ChunkSize = 400
        split.ChunkOverlap = 20
        docs, err := html.LoadAndSplit(context.Background(), split)
        if err != nil {
            log.Fatalf("failed to chunk the HTML into documents: %v", err)
        }
        fmt.Printf("Successfully chunked the HTML into %v documents.\n", len(docs))

        if err := godotenv.Load(); err != nil {
            log.Fatal("no .env file found")
        }

        // Connect to your Atlas cluster
        uri := os.Getenv("ATLAS_CONNECTION_STRING")
        if uri == "" {
            log.Fatal("set your 'ATLAS_CONNECTION_STRING' environment variable.")
        }
        clientOptions := options.Client().ApplyURI(uri)
        client, err := mongo.Connect(ctx, clientOptions)
        if err != nil {
            log.Fatalf("failed to connect to the server: %v", err)
        }
        defer func() { _ = client.Disconnect(ctx) }()

        // Set the namespace
        coll := client.Database("rag_db").Collection("test")

        fmt.Println("Generating embeddings.")
        var pageContents []string
        for i := range docs {
            pageContents = append(pageContents, docs[i].PageContent)
        }
        embeddings := common.GetEmbeddings(pageContents)

        docsToInsert := make([]interface{}, len(embeddings))
        for i := range embeddings {
            docsToInsert[i] = DocumentToInsert{
                PageContent: pageContents[i],
                Embedding:   embeddings[i],
            }
        }

        result, err := coll.InsertMany(ctx, docsToInsert)
        if err != nil {
            log.Fatalf("failed to insert documents: %v", err)
        }
        fmt.Printf("Successfully inserted %v documents into Atlas\n", len(result.InsertedIDs))
    }
  3. Run the following command to execute the code:

    go run ingest-data.go
    Successfully chunked the HTML into 163 documents.
    Generating embeddings.
    Successfully inserted 163 documents into Atlas
4

In this section, you set up Atlas Vector Search to retrieve documents from your vector database. Complete the following steps:

  1. Create an Atlas Vector Search index on your vector embeddings.

    Create a new file named rag-vector-index.go and paste the following code. This code connects to your Atlas cluster and creates an index of type vectorSearch on the rag_db.test collection.

    rag-vector-index.go
    package main

    import (
        "context"
        "log"
        "os"
        "time"

        "github.com/joho/godotenv"
        "go.mongodb.org/mongo-driver/bson"
        "go.mongodb.org/mongo-driver/mongo"
        "go.mongodb.org/mongo-driver/mongo/options"
    )

    func main() {
        ctx := context.Background()

        if err := godotenv.Load(); err != nil {
            log.Fatal("no .env file found")
        }

        // Connect to your Atlas cluster
        uri := os.Getenv("ATLAS_CONNECTION_STRING")
        if uri == "" {
            log.Fatal("set your 'ATLAS_CONNECTION_STRING' environment variable.")
        }
        clientOptions := options.Client().ApplyURI(uri)
        client, err := mongo.Connect(ctx, clientOptions)
        if err != nil {
            log.Fatalf("failed to connect to the server: %v", err)
        }
        defer func() { _ = client.Disconnect(ctx) }()

        // Specify the database and collection
        coll := client.Database("rag_db").Collection("test")
        indexName := "vector_index"
        opts := options.SearchIndexes().SetName(indexName).SetType("vectorSearch")

        type vectorDefinitionField struct {
            Type          string `bson:"type"`
            Path          string `bson:"path"`
            NumDimensions int    `bson:"numDimensions"`
            Similarity    string `bson:"similarity"`
        }

        type filterField struct {
            Type string `bson:"type"`
            Path string `bson:"path"`
        }

        type vectorDefinition struct {
            Fields []vectorDefinitionField `bson:"fields"`
        }

        indexModel := mongo.SearchIndexModel{
            Definition: vectorDefinition{
                Fields: []vectorDefinitionField{{
                    Type:          "vector",
                    Path:          "embedding",
                    NumDimensions: 1024,
                    Similarity:    "cosine"}},
            },
            Options: opts,
        }

        log.Println("Creating the index.")
        searchIndexName, err := coll.SearchIndexes().CreateOne(ctx, indexModel)
        if err != nil {
            log.Fatalf("failed to create the search index: %v", err)
        }

        // Await the creation of the index.
        log.Println("Polling to confirm successful index creation.")
        log.Println("NOTE: This may take up to a minute.")
        searchIndexes := coll.SearchIndexes()
        var doc bson.Raw
        for doc == nil {
            cursor, err := searchIndexes.List(ctx, options.SearchIndexes().SetName(searchIndexName))
            if err != nil {
                log.Printf("failed to list search indexes: %v", err)
            }
            if !cursor.Next(ctx) {
                break
            }
            name := cursor.Current.Lookup("name").StringValue()
            queryable := cursor.Current.Lookup("queryable").Boolean()
            if name == searchIndexName && queryable {
                doc = cursor.Current
            } else {
                time.Sleep(5 * time.Second)
            }
        }
        log.Println("Name of Index Created: " + searchIndexName)
    }
  2. Run the following command to create the index:

    go run rag-vector-index.go
  3. Define a function to retrieve relevant data.

    In this step, you create a retrieval function named GetQueryResults that runs a query to retrieve relevant documents. It uses the GetEmbeddings function to create embeddings from the search query, then runs the query to return semantically similar documents.

    To learn more, see Run Vector Search Queries.

    In the common directory, create a new file named get-query-results.go, and paste the following code into it:

    get-query-results.go
    package common

    import (
        "context"
        "log"
        "os"

        "github.com/joho/godotenv"
        "go.mongodb.org/mongo-driver/bson"
        "go.mongodb.org/mongo-driver/mongo"
        "go.mongodb.org/mongo-driver/mongo/options"
    )

    type TextWithScore struct {
        PageContent string  `bson:"pageContent"`
        Score       float64 `bson:"score"`
    }

    func GetQueryResults(query string) []TextWithScore {
        ctx := context.Background()

        if err := godotenv.Load(); err != nil {
            log.Fatal("no .env file found")
        }

        // Connect to your Atlas cluster
        uri := os.Getenv("ATLAS_CONNECTION_STRING")
        if uri == "" {
            log.Fatal("set your 'ATLAS_CONNECTION_STRING' environment variable.")
        }
        clientOptions := options.Client().ApplyURI(uri)
        client, err := mongo.Connect(ctx, clientOptions)
        if err != nil {
            log.Fatalf("failed to connect to the server: %v", err)
        }
        defer func() { _ = client.Disconnect(ctx) }()

        // Specify the database and collection
        coll := client.Database("rag_db").Collection("test")

        queryEmbedding := GetEmbeddings([]string{query})

        vectorSearchStage := bson.D{
            {"$vectorSearch", bson.D{
                {"index", "vector_index"},
                {"path", "embedding"},
                {"queryVector", queryEmbedding[0]},
                {"exact", true},
                {"limit", 5},
            }}}

        projectStage := bson.D{
            {"$project", bson.D{
                {"_id", 0},
                {"pageContent", 1},
                {"score", bson.D{{"$meta", "vectorSearchScore"}}},
            }}}

        cursor, err := coll.Aggregate(ctx, mongo.Pipeline{vectorSearchStage, projectStage})
        if err != nil {
            log.Fatalf("failed to execute the aggregation pipeline: %v", err)
        }

        var results []TextWithScore
        if err = cursor.All(ctx, &results); err != nil {
            log.Fatalf("failed to unmarshal retrieved documents: %v", err)
        }
        return results
    }
  4. Test retrieving the data.

    1. In the rag-mongodb project directory, create a new file named retrieve-documents-test.go. In this step, you check that the function you just defined returns relevant results.

    2. Paste this code into your file:

      retrieve-documents-test.go
      package main

      import (
          "fmt"

          "rag-mongodb/common" // Module that contains the GetQueryResults function
      )

      func main() {
          query := "AI Technology"
          documents := common.GetQueryResults(query)
          for _, doc := range documents {
              fmt.Printf("Text: %s \nScore: %v \n\n", doc.PageContent, doc.Score)
          }
      }
    3. Run the following command to execute the code:

      go run retrieve-documents-test.go
      Text: for the variety and scale of data required by AI-powered applications. We are confident MongoDB will be a substantial beneficiary of this next wave of application development.&#34;
      Score: 0.835033655166626
      Text: &#34;As we look ahead, we continue to be incredibly excited by our large market opportunity, the potential to increase share, and become a standard within more of our customers. We also see a tremendous opportunity to win more legacy workloads, as AI has now become a catalyst to modernize these applications. MongoDB&#39;s document-based architecture is particularly well-suited for the variety and
      Score: 0.8280757665634155
      Text: to the use of new and evolving technologies, such as artificial intelligence, in our offerings or partnerships; the growth and expansion of the market for database products and our ability to penetrate that market; our ability to integrate acquired businesses and technologies successfully or achieve the expected benefits of such acquisitions; our ability to maintain the security of our software
      Score: 0.8165900111198425
      Text: MongoDB continues to expand its AI ecosystem with the announcement of the MongoDB AI Applications Program (MAAP), which provides customers with reference architectures, pre-built partner integrations, and professional services to help them quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects, and is the first global systems
      Score: 0.8023912906646729
      Text: Bendigo and Adelaide Bank partnered with MongoDB to modernize their core banking technology. With the help of MongoDB Relational Migrator and generative AI-powered modernization tools, Bendigo and Adelaide Bank decomposed an outdated consumer-servicing application into microservices and migrated off its underlying legacy relational database technology significantly faster and more easily than
      Score: 0.7959681749343872
5

In this section, you generate responses by prompting an LLM to use the retrieved documents as context. This example uses the function you just defined to retrieve matching documents from the database, and additionally:

  • Accesses the Mistral 7B Instruct model from Hugging Face's model hub.

  • Instructs the LLM to include the user's question and the retrieved documents in the prompt.

  • Prompts the LLM about MongoDB's latest AI announcements.

  1. Create a new file named generate-responses.go, and paste the following code into it:

    generate-responses.go
    package main

    import (
        "context"
        "fmt"
        "log"
        "strings"

        "rag-mongodb/common" // Module that contains the GetQueryResults function

        "github.com/tmc/langchaingo/llms"
        "github.com/tmc/langchaingo/llms/huggingface"
        "github.com/tmc/langchaingo/prompts"
    )

    func main() {
        ctx := context.Background()
        query := "AI Technology"
        documents := common.GetQueryResults(query)
        var textDocuments strings.Builder
        for _, doc := range documents {
            textDocuments.WriteString(doc.PageContent)
        }
        question := "In a few sentences, what are MongoDB's latest AI announcements?"

        template := prompts.NewPromptTemplate(
            `Answer the following question based on the given context.
    Question: {{.question}}
    Context: {{.context}}`,
            []string{"question", "context"},
        )
        prompt, err := template.Format(map[string]any{
            "question": question,
            "context":  textDocuments.String(),
        })
        if err != nil {
            log.Fatalf("failed to format the prompt template: %v", err)
        }

        opts := llms.CallOptions{
            Model:       "mistralai/Mistral-7B-Instruct-v0.3",
            MaxTokens:   150,
            Temperature: 0.1,
        }
        llm, err := huggingface.New(huggingface.WithModel("mistralai/Mistral-7B-Instruct-v0.3"))
        if err != nil {
            log.Fatalf("failed to initialize a Hugging Face LLM: %v", err)
        }
        completion, err := llms.GenerateFromSinglePrompt(ctx, llm, prompt, llms.WithOptions(opts))
        if err != nil {
            log.Fatalf("failed to generate a response from the prompt: %v", err)
        }

        response := strings.Split(completion, "\n\n")
        if len(response) == 2 {
            fmt.Printf("Prompt: %v\n\n", response[0])
            fmt.Printf("Response: %v\n", response[1])
        }
    }
  2. Run this command to execute the code. The generated response might vary.

    go run generate-responses.go
    Prompt: Answer the following question based on the given context.
    Question: In a few sentences, what are MongoDB's latest AI announcements?
    Context: for the variety and scale of data required by AI-powered applications. We are confident MongoDB will be a substantial beneficiary of this next wave of application development.&#34;&#34;As we look ahead, we continue to be incredibly excited by our large market opportunity, the potential to increase share, and become a standard within more of our customers. We also see a tremendous opportunity to win more legacy workloads, as AI has now become a catalyst to modernize these applications. MongoDB&#39;s document-based architecture is particularly well-suited for the variety andto the use of new and evolving technologies, such as artificial intelligence, in our offerings or partnerships; the growth and expansion of the market for database products and our ability to penetrate that market; our ability to integrate acquired businesses and technologies successfully or achieve the expected benefits of such acquisitions; our ability to maintain the security of our softwareMongoDB continues to expand its AI ecosystem with the announcement of the MongoDB AI Applications Program (MAAP), which provides customers with reference architectures, pre-built partner integrations, and professional services to help them quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects, and is the first global systemsBendigo and Adelaide Bank partnered with MongoDB to modernize their core banking technology. With the help of MongoDB Relational Migrator and generative AI-powered modernization tools, Bendigo and Adelaide Bank decomposed an outdated consumer-servicing application into microservices and migrated off its underlying legacy relational database technology significantly faster and more easily than expected.
    Response: MongoDB's latest AI announcements include the launch of the MongoDB AI Applications Program (MAAP) and a partnership with Accenture to establish a center of excellence focused on MongoDB projects. Additionally, Bendigo and Adelaide Bank have partnered with MongoDB to modernize their core banking technology using MongoDB's AI-powered modernization tools.
1
  1. In your IDE, create a Java project using Maven or Gradle.

  2. Add the following dependencies, depending on your package manager:

    If you are using Maven, add the following dependencies to the dependencies array in your project's pom.xml file, and add the Bill of Materials (BOM) to the dependencyManagement array:

    pom.xml
    <dependencies>
        <!-- MongoDB Java Sync Driver v5.2.0 or later -->
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongodb-driver-sync</artifactId>
            <version>[5.2.0,)</version>
        </dependency>
        <!-- Java library for Hugging Face models -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-hugging-face</artifactId>
        </dependency>
        <!-- Java library for URL Document Loader -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j</artifactId>
        </dependency>
        <!-- Java library for Apache PDFBox Document Parser -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-document-parser-apache-pdfbox</artifactId>
        </dependency>
    </dependencies>
    <dependencyManagement>
        <dependencies>
            <!-- Bill of Materials (BOM) to manage Java library versions -->
            <dependency>
                <groupId>dev.langchain4j</groupId>
                <artifactId>langchain4j-bom</artifactId>
                <version>0.36.2</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    If you are using Gradle, add the following Bill of Materials (BOM) and dependencies to the dependencies array in your project's build.gradle file:

    build.gradle
    dependencies {
        // Bill of Materials (BOM) to manage Java library versions
        implementation platform('dev.langchain4j:langchain4j-bom:0.36.2')
        // MongoDB Java Sync Driver v5.2.0 or later
        implementation 'org.mongodb:mongodb-driver-sync:5.2.0'
        // Java library for Hugging Face models
        implementation 'dev.langchain4j:langchain4j-hugging-face'
        // Java library for URL Document Loader
        implementation 'dev.langchain4j:langchain4j'
        // Java library for Apache PDFBox Document Parser
        implementation 'dev.langchain4j:langchain4j-document-parser-apache-pdfbox'
    }
  3. Run your package manager to install the dependencies to your project.

2

Note

This example sets the variables for the project in your IDE. Production applications might manage environment variables through a deployment configuration, CI/CD pipeline, or secrets manager, but you can adapt the provided code to fit your use case.

In your IDE, create a new configuration template and add the following variables to your project:

  • If you are using IntelliJ IDEA, create a new Application run configuration template, then add the variables as semicolon-separated values in the Environment variables field (for example, FOO=123;BAR=456). Apply the changes and click OK.

    To learn more, see the Create a run/debug configuration from a template section of the IntelliJ IDEA documentation.

  • If you are using Eclipse, create a new Java Application launch configuration, then add each variable as a new key-value pair in the Environment tab. Apply the changes and click OK.

    To learn more, see the Creating a Java application launch configuration section of the Eclipse IDE documentation.

Environment variables
HUGGING_FACE_ACCESS_TOKEN=<access-token>
ATLAS_CONNECTION_STRING=<connection-string>

Update the placeholders with the following values:

  • Replace the <access-token> placeholder value with your Hugging Face access token.

  • Replace the <connection-string> placeholder value with the SRV connection string for your Atlas cluster.

    Your connection string should use the following format:

    mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
3

Create a file named PDFProcessor.java and paste the following code.

This code defines the following methods:

  • The parsePDFDocument method parses a PDF document from the specified URL and returns a langchain4j Document object.

  • The splitDocument method splits a parsed document into text segments based on the specified chunking parameters.

PDFProcessor.java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.DocumentParser;
import dev.langchain4j.data.document.DocumentSplitter;
import dev.langchain4j.data.document.loader.UrlDocumentLoader;
import dev.langchain4j.data.document.parser.apache.pdfbox.ApachePdfBoxDocumentParser;
import dev.langchain4j.data.document.splitter.DocumentByCharacterSplitter;
import dev.langchain4j.data.segment.TextSegment;

import java.util.List;

public class PDFProcessor {
    /** Parses a PDF document from the specified URL, and returns a
     * langchain4j Document object.
     * */
    public static Document parsePDFDocument(String url) {
        DocumentParser parser = new ApachePdfBoxDocumentParser();
        return UrlDocumentLoader.load(url, parser);
    }

    /** Splits a parsed langchain4j Document based on the specified chunking
     * parameters, and returns an array of text segments.
     */
    public static List<TextSegment> splitDocument(Document document) {
        int maxChunkSize = 400; // number of characters
        int maxChunkOverlap = 20; // number of overlapping characters between consecutive chunks
        DocumentSplitter splitter = new DocumentByCharacterSplitter(maxChunkSize, maxChunkOverlap);
        return splitter.split(document);
    }
}
4

Create a file named EmbeddingProvider.java and paste the following code.

This code defines two methods to generate embeddings for a given input using the mxbai-embed-large-v1 open-source embedding model:

  • Multiple inputs: The getEmbeddings method accepts an array of text segment inputs (List<TextSegment>), allowing you to create multiple embeddings in a single API call. The method converts the arrays of floats provided by the API to BSON arrays of doubles for storage in your Atlas cluster.

  • Single input: The getEmbedding method accepts a single String, which represents a query you want to make against your vector data. The method converts the array of floats provided by the API to a BSON array of doubles to use when querying your collection.

EmbeddingProvider.java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.huggingface.HuggingFaceChatModel;
import dev.langchain4j.model.huggingface.HuggingFaceEmbeddingModel;
import dev.langchain4j.model.output.Response;
import org.bson.BsonArray;
import org.bson.BsonDouble;

import java.util.List;

import static java.time.Duration.ofSeconds;

public class EmbeddingProvider {

    private static HuggingFaceEmbeddingModel embeddingModel;

    private static HuggingFaceEmbeddingModel getEmbeddingModel() {
        if (embeddingModel == null) {
            String accessToken = System.getenv("HUGGING_FACE_ACCESS_TOKEN");
            if (accessToken == null || accessToken.isEmpty()) {
                throw new RuntimeException("HUGGING_FACE_ACCESS_TOKEN env variable is not set or is empty.");
            }
            embeddingModel = HuggingFaceEmbeddingModel.builder()
                    .accessToken(accessToken)
                    .modelId("mixedbread-ai/mxbai-embed-large-v1")
                    .waitForModel(true)
                    .timeout(ofSeconds(60))
                    .build();
        }
        return embeddingModel;
    }

    /**
     * Returns the Hugging Face chat model interface used by the createPrompt() method
     * to process queries and generate responses.
     */
    private static HuggingFaceChatModel chatModel;

    public static HuggingFaceChatModel getChatModel() {
        String accessToken = System.getenv("HUGGING_FACE_ACCESS_TOKEN");
        if (accessToken == null || accessToken.isEmpty()) {
            throw new IllegalStateException("HUGGING_FACE_ACCESS_TOKEN env variable is not set or is empty.");
        }
        if (chatModel == null) {
            chatModel = HuggingFaceChatModel.builder()
                    .timeout(ofSeconds(25))
                    .modelId("mistralai/Mistral-7B-Instruct-v0.3")
                    .temperature(0.1)
                    .maxNewTokens(150)
                    .accessToken(accessToken)
                    .waitForModel(true)
                    .build();
        }
        return chatModel;
    }

    /**
     * Takes an array of text segments and returns a BSON array of embeddings to
     * store in the database.
     */
    public List<BsonArray> getEmbeddings(List<TextSegment> texts) {
        List<TextSegment> textSegments = texts.stream()
                .toList();
        Response<List<Embedding>> response = getEmbeddingModel().embedAll(textSegments);
        return response.content().stream()
                .map(e -> new BsonArray(
                        e.vectorAsList().stream()
                                .map(BsonDouble::new)
                                .toList()))
                .toList();
    }

    /**
     * Takes a single string and returns a BSON array embedding to
     * use in a vector query.
     */
    public static BsonArray getEmbedding(String text) {
        Response<Embedding> response = getEmbeddingModel().embed(text);
        return new BsonArray(
                response.content().vectorAsList().stream()
                        .map(BsonDouble::new)
                        .toList());
    }
}
5

Create a file named DataIngest.java and paste the following code.

This code uses the LangChain4j library and the MongoDB Java Sync Driver to ingest sample data into Atlas that LLMs don't have access to.

Specifically, this code does the following:

  1. Connects to your Atlas cluster.

  2. Loads and parses the MongoDB earnings report PDF file from the URL by using the parsePDFDocument method that you previously defined.

  3. Splits the data into chunks by using the splitDocument method that you previously defined.

  4. Creates vector embeddings from the chunked data by using the getEmbeddings method that you previously defined.

  5. Stores these embeddings alongside the chunked data in the rag_db.test collection in your Atlas cluster.

    DataIngest.java
    import com.mongodb.MongoException;
    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.MongoDatabase;
    import com.mongodb.client.result.InsertManyResult;
    import dev.langchain4j.data.segment.TextSegment;
    import org.bson.BsonArray;
    import org.bson.Document;

    import java.util.ArrayList;
    import java.util.List;

    public class DataIngest {
        public static void main(String[] args) {
            String uri = System.getenv("ATLAS_CONNECTION_STRING");
            if (uri == null || uri.isEmpty()) {
                throw new RuntimeException("ATLAS_CONNECTION_STRING env variable is not set or is empty.");
            }

            // establish connection and set namespace
            try (MongoClient mongoClient = MongoClients.create(uri)) {
                MongoDatabase database = mongoClient.getDatabase("rag_db");
                MongoCollection<Document> collection = database.getCollection("test");

                // parse the PDF file at the specified URL
                String url = "https://investors.mongodb.com/node/12236/pdf";
                String fileName = "mongodb_annual_report.pdf";
                System.out.println("Parsing the [" + fileName + "] file from url: " + url);
                dev.langchain4j.data.document.Document parsedDoc = PDFProcessor.parsePDFDocument(url);

                // split (or "chunk") the parsed document into text segments
                List<TextSegment> segments = PDFProcessor.splitDocument(parsedDoc);
                System.out.println(segments.size() + " text segments created successfully.");

                // create vector embeddings from the chunked data (i.e. text segments)
                System.out.println("Creating vector embeddings from the parsed data segments. This may take a few moments.");
                List<Document> documents = embedText(segments);

                // insert the embeddings into the Atlas collection
                try {
                    System.out.println("Ingesting data into the " + collection.getNamespace() + " collection.");
                    insertDocuments(documents, collection);
                }
                catch (MongoException me) {
                    throw new RuntimeException("Failed to insert documents", me);
                }
            } catch (MongoException me) {
                throw new RuntimeException("Failed to connect to MongoDB", me);
            } catch (Exception e) {
                throw new RuntimeException("Operation failed: ", e);
            }
        }

        /**
         * Embeds text segments into vector embeddings using the EmbeddingProvider
         * class and returns a list of BSON documents containing the text and
         * generated embeddings.
         */
        private static List<Document> embedText(List<TextSegment> segments) {
            EmbeddingProvider embeddingProvider = new EmbeddingProvider();
            List<BsonArray> embeddings = embeddingProvider.getEmbeddings(segments);

            List<Document> documents = new ArrayList<>();
            int i = 0;
            for (TextSegment segment : segments) {
                Document doc = new Document("text", segment.text()).append("embedding", embeddings.get(i));
                documents.add(doc);
                i++;
            }
            return documents;
        }

        /**
         * Inserts a list of BSON documents into the specified MongoDB collection.
         */
        private static void insertDocuments(List<Document> documents, MongoCollection<Document> collection) {
            List<String> insertedIds = new ArrayList<>();
            InsertManyResult result = collection.insertMany(documents);
            result.getInsertedIds().values()
                    .forEach(doc -> insertedIds.add(doc.toString()));
            System.out.println(insertedIds.size() + " documents inserted into the " + collection.getNamespace() + " collection successfully.");
        }
    }
6

Note

503 errors when calling Hugging Face models

You might occasionally encounter 503 errors when calling Hugging Face model hub models. To resolve the issue, retry after a short wait.

Save and run the DataIngest.java file. The output resembles the following:

Parsing the [mongodb_annual_report.pdf] file from url: https://investors.mongodb.com/node/12236/pdf
72 text segments created successfully.
Creating vector embeddings from the parsed data segments. This may take a few moments...
Ingesting data into the rag_db.test collection.
72 documents inserted into the rag_db.test collection successfully.
7

In this section, you set up Atlas Vector Search to retrieve documents from your vector database.

  1. Create a file named VectorIndex.java and paste the following code.

    This code creates an Atlas Vector Search index on your collection by using the following index definition:

    • Indexes the embedding field in a vector index type for the rag_db.test collection. This field contains the embeddings created by using the embedding model.

    • Enforces 1024 vector dimensions and measures similarity between vectors by using cosine.

    VectorIndex.java
    import com.mongodb.MongoException;
    import com.mongodb.client.ListSearchIndexesIterable;
    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.MongoCursor;
    import com.mongodb.client.MongoDatabase;
    import com.mongodb.client.model.SearchIndexModel;
    import com.mongodb.client.model.SearchIndexType;
    import org.bson.Document;
    import org.bson.conversions.Bson;
    import java.util.Collections;
    import java.util.List;
    public class VectorIndex {
    public static void main(String[] args) {
    String uri = System.getenv("ATLAS_CONNECTION_STRING");
    if (uri == null || uri.isEmpty()) {
    throw new IllegalStateException("ATLAS_CONNECTION_STRING env variable is not set or is empty.");
    }
    // establish connection and set namespace
    try (MongoClient mongoClient = MongoClients.create(uri)) {
    MongoDatabase database = mongoClient.getDatabase("rag_db");
    MongoCollection<Document> collection = database.getCollection("test");
    // define the index details for the index model
    String indexName = "vector_index";
    Bson definition = new Document(
    "fields",
    Collections.singletonList(
    new Document("type", "vector")
    .append("path", "embedding")
    .append("numDimensions", 1024)
    .append("similarity", "cosine")));
    SearchIndexModel indexModel = new SearchIndexModel(
    indexName,
    definition,
    SearchIndexType.vectorSearch());
    // create the index using the defined model
    try {
    List<String> result = collection.createSearchIndexes(Collections.singletonList(indexModel));
    System.out.println("Successfully created a vector index named: " + result);
    System.out.println("It may take up to a minute for the index to build before you can query using it.");
    } catch (Exception e) {
    throw new RuntimeException(e);
    }
    // wait for Atlas to build the index and make it queryable
    System.out.println("Polling to confirm the index has completed building.");
    waitForIndexReady(collection, indexName);
    } catch (MongoException me) {
    throw new RuntimeException("Failed to connect to MongoDB", me);
    } catch (Exception e) {
    throw new RuntimeException("Operation failed: ", e);
    }
    }
    /**
    * Polls the collection to check whether the specified index is ready to query.
    */
    public static void waitForIndexReady(MongoCollection<Document> collection, String indexName) throws InterruptedException {
    ListSearchIndexesIterable<Document> searchIndexes = collection.listSearchIndexes();
    while (true) {
    try (MongoCursor<Document> cursor = searchIndexes.iterator()) {
    if (!cursor.hasNext()) {
    break;
    }
    Document current = cursor.next();
    String name = current.getString("name");
    boolean queryable = current.getBoolean("queryable");
    if (name.equals(indexName) && queryable) {
    System.out.println(indexName + " index is ready to query");
    return;
    } else {
    Thread.sleep(500);
    }
    }
    }
    }
    }
  2. Create the Atlas Vector Search index.

    Save and run the file. The output is similar to:

    Successfully created a vector index named: [vector_index]
    Polling to confirm the index has completed building.
    It may take up to a minute for the index to build before you can query using it.
    vector_index index is ready to query
8

In this section, you generate responses by prompting an LLM to use the retrieved documents as context.

Create a new file named LLMPrompt.java, and paste the following code into it.

This code does the following:

  1. Queries the rag_db.test collection for any matching documents by using the retrieveDocuments method.

    This method uses the getEmbedding method that you created earlier to generate an embedding from the search query, then runs the query to return semantically similar documents.

    To learn more, see Run Vector Search Queries.

  2. Accesses the Mistral 7B Instruct model from Hugging Face's model hub, and creates a templated prompt by using the createPrompt method.

    The method instructs the LLM to include the user's question and the retrieved documents in the defined prompt.

  3. Prompts the LLM about MongoDB's latest AI announcements, then returns the generated response.

    LLMPrompt.java
    import com.mongodb.MongoException;
    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.MongoDatabase;
    import com.mongodb.client.model.search.FieldSearchPath;
    import dev.langchain4j.data.message.AiMessage;
    import dev.langchain4j.model.huggingface.HuggingFaceChatModel;
    import dev.langchain4j.model.input.Prompt;
    import dev.langchain4j.model.input.PromptTemplate;
    import org.bson.BsonArray;
    import org.bson.BsonValue;
    import org.bson.Document;
    import org.bson.conversions.Bson;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import static com.mongodb.client.model.Aggregates.project;
    import static com.mongodb.client.model.Aggregates.vectorSearch;
    import static com.mongodb.client.model.Projections.exclude;
    import static com.mongodb.client.model.Projections.fields;
    import static com.mongodb.client.model.Projections.include;
    import static com.mongodb.client.model.Projections.metaVectorSearchScore;
    import static com.mongodb.client.model.search.SearchPath.fieldPath;
    import static com.mongodb.client.model.search.VectorSearchOptions.exactVectorSearchOptions;
    import static java.util.Arrays.asList;
    public class LLMPrompt {
    // User input: the question to answer
    static String question = "In a few sentences, what are MongoDB's latest AI announcements?";
    public static void main(String[] args) {
    String uri = System.getenv("ATLAS_CONNECTION_STRING");
    if (uri == null || uri.isEmpty()) {
    throw new IllegalStateException("ATLAS_CONNECTION_STRING env variable is not set or is empty.");
    }
    // establish connection and set namespace
    try (MongoClient mongoClient = MongoClients.create(uri)) {
    MongoDatabase database = mongoClient.getDatabase("rag_db");
    MongoCollection<Document> collection = database.getCollection("test");
    // generate a response to the user question
    try {
    createPrompt(question, collection);
    } catch (Exception e) {
    throw new RuntimeException("An error occurred while generating the response: ", e);
    }
    } catch (MongoException me) {
    throw new RuntimeException("Failed to connect to MongoDB ", me);
    } catch (Exception e) {
    throw new RuntimeException("Operation failed: ", e);
    }
    }
    /**
    * Returns a list of documents from the specified MongoDB collection that
    * match the user's question.
    * NOTE: Update or omit the projection stage to change the desired fields in the response
    */
    public static List<Document> retrieveDocuments(String question, MongoCollection<Document> collection) {
    try {
    // generate the query embedding to use in the vector search
    BsonArray queryEmbeddingBsonArray = EmbeddingProvider.getEmbedding(question);
    List<Double> queryEmbedding = new ArrayList<>();
    for (BsonValue value : queryEmbeddingBsonArray.stream().toList()) {
    queryEmbedding.add(value.asDouble().getValue());
    }
    // define the pipeline stages for the vector search index
    String indexName = "vector_index";
    FieldSearchPath fieldSearchPath = fieldPath("embedding");
    int limit = 5;
    List<Bson> pipeline = asList(
    vectorSearch(
    fieldSearchPath,
    queryEmbedding,
    indexName,
    limit,
    exactVectorSearchOptions()),
    project(
    fields(
    exclude("_id"),
    include("text"),
    metaVectorSearchScore("score"))));
    // run the query and return the matching documents
    List<Document> matchingDocuments = new ArrayList<>();
    collection.aggregate(pipeline).forEach(matchingDocuments::add);
    return matchingDocuments;
    } catch (Exception e) {
    System.err.println("Error occurred while retrieving documents: " + e.getMessage());
    return new ArrayList<>();
    }
    }
    /**
    * Creates a templated prompt from a submitted question string and any retrieved documents,
    * then generates a response using the Hugging Face chat model.
    */
    public static void createPrompt(String question, MongoCollection<Document> collection) {
    // retrieve documents matching the user's question
    List<Document> retrievedDocuments = retrieveDocuments(question, collection);
    if (retrievedDocuments.isEmpty()) {
    System.out.println("No relevant documents found. Unable to generate a response.");
    return;
    } else
    System.out.println("Generating a response from the retrieved documents. This may take a few moments.");
    // define a prompt template
    HuggingFaceChatModel huggingFaceChatModel = EmbeddingProvider.getChatModel();
    PromptTemplate promptBuilder = PromptTemplate.from("""
    Answer the following question based on the given context:
    Question: {{question}}
    Context: {{information}}
    -------
    """);
    // build the information string from the retrieved documents
    StringBuilder informationBuilder = new StringBuilder();
    for (Document doc : retrievedDocuments) {
    String text = doc.getString("text");
    informationBuilder.append(text).append("\n");
    }
    Map<String, Object> variables = new HashMap<>();
    variables.put("question", question);
    variables.put("information", informationBuilder);
    // generate and output the response from the chat model
    Prompt prompt = promptBuilder.apply(variables);
    AiMessage response = huggingFaceChatModel.generate(prompt.toUserMessage()).content();
    // extract the generated text to output a formatted response
    String responseText = response.text();
    String marker = "-------";
    int markerIndex = responseText.indexOf(marker);
    String generatedResponse;
    if (markerIndex != -1) {
    generatedResponse = responseText.substring(markerIndex + marker.length()).trim();
    } else {
    generatedResponse = responseText; // else fallback to the full response
    }
    // output the question and formatted response
    System.out.println("Question:\n " + question);
    System.out.println("Response:\n " + generatedResponse);
    // output the filled-in prompt and context information for demonstration purposes
    System.out.println("\n" + "---- Prompt Sent to LLM ----");
    System.out.println(prompt.text() + "\n");
    }
    }
9

Save and run the file. The output resembles the following, but note that the generated response might vary.

Generating a response from the retrieved documents. This may take a few moments.
Question:
In a few sentences, what are MongoDB's latest AI announcements?
Response:
MongoDB's latest AI announcements include the MongoDB AI Applications Program (MAAP), which provides customers with reference architectures, pre-built partner integrations, and professional services to help them quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects. These announcements highlight MongoDB's growing focus on AI application development and its potential to modernize legacy workloads.
---- Prompt Sent to LLM ----
Answer the following question based on the given context:
Question: In a few sentences, what are MongoDB's latest AI announcements?
Context: time data.
MongoDB continues to expand its AI ecosystem with the announcement of the MongoDB AI Applications Program (MAAP),
which provides customers with reference architectures, pre-built partner integrations, and professional services to help
them quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects,
and is the first global systems i
ighlights
MongoDB announced a number of new products and capabilities at MongoDB.local NYC. Highlights included the preview
of MongoDB 8.0—with significant performance improvements such as faster reads and updates, along with significantly
faster bulk inserts and time series queries—and the general availability of Atlas Stream Processing to build sophisticated,
event-driven applications with real-
ble future as well as the criticality of MongoDB to artificial intelligence application development. These forward-looking
statements include, but are not limited to, plans, objectives, expectations and intentions and other statements contained in this press release that are
not historical facts and statements identified by words such as "anticipate," "believe," "continue," "could," "estimate," "e
ve Officer of MongoDB.
"As we look ahead, we continue to be incredibly excited by our large market opportunity, the potential to increase share, and become a standard within
more of our customers. We also see a tremendous opportunity to win more legacy workloads, as AI has now become a catalyst to modernize these
applications. MongoDB's document-based architecture is particularly well-suited for t
ictable, impact on its future GAAP financial results.
Conference Call Information
MongoDB will host a conference call today, May 30, 2024, at 5:00 p.m. (Eastern Time) to discuss its financial results and business outlook. A live
webcast of the call will be available on the "Investor Relations" page of MongoDB's website at https://investors.mongodb.com. To access the call by
phone, please go to thi
1
  1. Initialize your Node.js project.

    Run the following commands in your terminal to create a new directory named rag-mongodb and initialize your project:

    mkdir rag-mongodb
    cd rag-mongodb
    npm init -y
  2. Install and import dependencies.

    Run the following command:

    npm install mongodb langchain @langchain/community @xenova/transformers @huggingface/inference pdf-parse
  3. Update your package.json file.

    In your project's package.json file, specify the type field as shown in the following example, then save the file.

    {
    "name": "rag-mongodb",
    "type": "module",
    ...
  4. Create a .env file.

    In your project, create a .env file to store your Atlas connection string and Hugging Face access token.

    HUGGING_FACE_ACCESS_TOKEN = "<access-token>"
    ATLAS_CONNECTION_STRING = "<connection-string>"
    Replace the <access-token> placeholder value with your Hugging Face access token, and the <connection-string> placeholder value with your Atlas connection string.

    Note

    Minimum Node.js version requirements

    Node.js v20.x introduced the --env-file option. If you use an older version of Node.js, add the dotenv package to your project, or use a different method to manage your environment variables.

2

In this section, you create a function that:

  • Loads the nomic-embed-text-v1 embedding model from Hugging Face's model hub.

  • Creates vector embeddings from the input data.

Create a file named get-embeddings.js in your project, and paste the following code into it:

import { pipeline } from '@xenova/transformers';
// Function to generate embeddings for a given data source
export async function getEmbeddings(data) {
const embedder = await pipeline(
'feature-extraction',
'Xenova/nomic-embed-text-v1');
const results = await embedder(data, { pooling: 'mean', normalize: true });
return Array.from(results.data);
}
3

In this section, you ingest sample data into Atlas that LLMs don't have access to. The following code uses the LangChain integration and the Node.js driver to do the following:

  • Load a PDF that contains a MongoDB earnings report.

  • Split the data into chunks, specifying the chunk size (number of characters) and the chunk overlap (number of overlapping characters between consecutive chunks).

  • Create vector embeddings from the chunked data by using the getEmbeddings function that you defined.

  • Store these embeddings alongside the chunked data in the rag_db.test collection in your Atlas cluster.

Create a file named ingest-data.js in your project, and paste the following code into it:

import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MongoClient } from 'mongodb';
import { getEmbeddings } from './get-embeddings.js';
import * as fs from 'fs';
async function run() {
const client = new MongoClient(process.env.ATLAS_CONNECTION_STRING);
try {
// Save online PDF as a file
const rawData = await fetch("https://investors.mongodb.com/node/12236/pdf");
const pdfBuffer = await rawData.arrayBuffer();
const pdfData = Buffer.from(pdfBuffer);
fs.writeFileSync("investor-report.pdf", pdfData);
const loader = new PDFLoader(`investor-report.pdf`);
const data = await loader.load();
// Chunk the text from the PDF
const textSplitter = new RecursiveCharacterTextSplitter({
chunkSize: 400,
chunkOverlap: 20,
});
const docs = await textSplitter.splitDocuments(data);
console.log(`Successfully chunked the PDF into ${docs.length} documents.`);
// Connect to your Atlas cluster
await client.connect();
const db = client.db("rag_db");
const collection = db.collection("test");
console.log("Generating embeddings and inserting documents.");
let docCount = 0;
await Promise.all(docs.map(async doc => {
const embeddings = await getEmbeddings(doc.pageContent);
// Insert the embeddings and the chunked PDF data into Atlas
await collection.insertOne({
document: doc,
embedding: embeddings,
});
docCount += 1;
}))
console.log(`Successfully inserted ${docCount} documents.`);
} catch (err) {
console.log(err.stack);
}
finally {
await client.close();
}
}
run().catch(console.dir);

Then, run the following command to execute the code:

node --env-file=.env ingest-data.js

Tip

This code takes some time to run. You can view your inserted vector embeddings by navigating to the rag_db.test collection in the Atlas UI.

4

In this section, you set up Atlas Vector Search to retrieve documents from your vector database. Complete the following steps:

  1. Create an Atlas Vector Search index on your vector embeddings.

    Create a new file named rag-vector-index.js and paste the following code. This code connects to your Atlas cluster and creates an index of the vectorSearch type on the rag_db.test collection.

    import { MongoClient } from 'mongodb';
    // Connect to your Atlas cluster
    const client = new MongoClient(process.env.ATLAS_CONNECTION_STRING);
    async function run() {
    try {
    const database = client.db("rag_db");
    const collection = database.collection("test");
    // Define your Atlas Vector Search index
    const index = {
    name: "vector_index",
    type: "vectorSearch",
    definition: {
    "fields": [
    {
    "type": "vector",
    "numDimensions": 768,
    "path": "embedding",
    "similarity": "cosine"
    }
    ]
    }
    }
    // Call the method to create the index
    const result = await collection.createSearchIndex(index);
    console.log(result);
    } finally {
    await client.close();
    }
    }
    run().catch(console.dir);

    Then, run the following command to execute the code:

    node --env-file=.env rag-vector-index.js
  2. Define a function to retrieve relevant data.

    Create a new file named retrieve-documents.js.

    In this step, you create a retrieval function named getQueryResults that runs a query to retrieve relevant documents. It uses the getEmbeddings function to create embeddings from the search query. Then, it runs the query to return semantically similar documents.

    To learn more, see Run Vector Search Queries.

    Paste this code into your file:

    import { MongoClient } from 'mongodb';
    import { getEmbeddings } from './get-embeddings.js';
    // Function to get the results of a vector query
    export async function getQueryResults(query) {
    // Connect to your Atlas cluster
    const client = new MongoClient(process.env.ATLAS_CONNECTION_STRING);
    try {
    // Get embeddings for a query
    const queryEmbeddings = await getEmbeddings(query);
    await client.connect();
    const db = client.db("rag_db");
    const collection = db.collection("test");
    const pipeline = [
    {
    $vectorSearch: {
    index: "vector_index",
    queryVector: queryEmbeddings,
    path: "embedding",
    exact: true,
    limit: 5
    }
    },
    {
    $project: {
    _id: 0,
    document: 1,
    }
    }
    ];
    // Retrieve documents from Atlas using this Vector Search query
    const result = collection.aggregate(pipeline);
    const arrayOfQueryDocs = [];
    for await (const doc of result) {
    arrayOfQueryDocs.push(doc);
    }
    return arrayOfQueryDocs;
    } catch (err) {
    console.log(err.stack);
    }
    finally {
    await client.close();
    }
    }
  3. Test retrieving the data.

    Create a new file named retrieve-documents-test.js. In this step, you check that the function you just defined returns relevant results.

    Paste this code into your file:

    import { getQueryResults } from './retrieve-documents.js';
    async function run() {
    try {
    const query = "AI Technology";
    const documents = await getQueryResults(query);
    documents.forEach( doc => {
    console.log(doc);
    });
    } catch (err) {
    console.log(err.stack);
    }
    }
    run().catch(console.dir);

    Then, run the following command to execute the code:

    node --env-file=.env retrieve-documents-test.js
    {
    document: {
    pageContent: 'MongoDB continues to expand its AI ecosystem with the announcement of the MongoDB AI Applications Program (MAAP),',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
    {
    document: {
    pageContent: 'artificial intelligence, in our offerings or partnerships; the growth and expansion of the market for database products and our ability to penetrate that\n' +
    'market; our ability to integrate acquired businesses and technologies successfully or achieve the expected benefits of such acquisitions; our ability to',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
    {
    document: {
    pageContent: 'more of our customers. We also see a tremendous opportunity to win more legacy workloads, as AI has now become a catalyst to modernize these\n' +
    "applications. MongoDB's document-based architecture is particularly well-suited for the variety and scale of data required by AI-powered applications. \n" +
    'We are confident MongoDB will be a substantial beneficiary of this next wave of application development."',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
    {
    document: {
    pageContent: 'which provides customers with reference architectures, pre-built partner integrations, and professional services to help\n' +
    'them quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects,\n' +
    'and is the first global systems integrator to join MAAP.',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
    {
    document: {
    pageContent: 'Bendigo and Adelaide Bank partnered with MongoDB to modernize their core banking technology. With the help of\n' +
    'MongoDB Relational Migrator and generative AI-powered modernization tools, Bendigo and Adelaide Bank decomposed an\n' +
    'outdated consumer-servicing application into microservices and migrated off its underlying legacy relational database',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
5

In this section, you generate responses by prompting an LLM to use the retrieved documents as context. This example uses the function you just defined to retrieve matching documents from the database, and additionally:

  • Accesses the Mistral 7B Instruct model from Hugging Face's model hub.

  • Instructs the LLM to include the user's question and the retrieved documents in the prompt.

  • Prompts the LLM about MongoDB's latest AI announcements.

Create a new file named generate-responses.js, and paste the following code into it:

import { getQueryResults } from './retrieve-documents.js';
import { HfInference } from '@huggingface/inference'
async function run() {
try {
// Specify search query and retrieve relevant documents
const query = "AI Technology";
const documents = await getQueryResults(query);
// Build a string representation of the retrieved documents to use in the prompt
let textDocuments = "";
documents.forEach(doc => {
textDocuments += doc.document.pageContent;
});
const question = "In a few sentences, what are MongoDB's latest AI announcements?";
// Create a prompt consisting of the question and context to pass to the LLM
const prompt = `Answer the following question based on the given context.
Question: {${question}}
Context: {${textDocuments}}
`;
// Connect to Hugging Face, using the access token from the environment file
const hf = new HfInference(process.env.HUGGING_FACE_ACCESS_TOKEN);
const llm = hf.endpoint(
"https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3"
);
// Prompt the LLM to answer the question using the
// retrieved documents as the context
const output = await llm.chatCompletion({
model: "mistralai/Mistral-7B-Instruct-v0.3",
messages: [{ role: "user", content: prompt }],
max_tokens: 150,
});
// Output the LLM's response as text.
console.log(output.choices[0].message.content);
} catch (err) {
console.log(err.stack);
}
}
run().catch(console.dir);

Then, run this command to execute the code. The generated response might vary.

node --env-file=.env generate-responses.js
MongoDB's latest AI announcements include the launch of the MongoDB
AI Applications Program (MAAP), which provides customers with
reference architectures, pre-built partner integrations, and
professional services to help them build AI-powered applications
quickly. Accenture has joined MAAP as the first global systems
integrator, establishing a center of excellence focused on MongoDB
projects. Additionally, Bendigo and Adelaide Bank have partnered
with MongoDB to modernize their core banking technology using
MongoDB's Relational Migrator and generative AI-powered
modernization tools.
1

Create an interactive Python notebook by saving a file with the .ipynb extension. This notebook allows you to run Python code snippets individually. In your notebook, run the following code to install the dependencies for this tutorial:

pip install --quiet --upgrade pymongo sentence_transformers einops langchain langchain_community pypdf huggingface_hub
2

In this section, you ingest sample data into Atlas that LLMs don't have access to. Paste and run each of the following code snippets in your notebook:

  1. Define a function to generate vector embeddings.

    Run this code to create a function that generates vector embeddings by using an open-source embedding model. Specifically, this code loads the nomic-embed-text-v1 model from Sentence Transformers and defines a get_embedding function that returns the embedding for a given piece of data:

    from sentence_transformers import SentenceTransformer
    # Load the embedding model (https://huggingface.co/nomic-ai/nomic-embed-text-v1")
    model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)
    # Define a function to generate embeddings
    def get_embedding(data):
    """Generates vector embeddings for the given data."""
    embedding = model.encode(data)
    return embedding.tolist()
  2. Load and split the data.

    Run this code to load and split sample data by using the LangChain integration. Specifically, this code does the following:

    • Loads a PDF that contains a MongoDB earnings report.

    • Splits the data into chunks, specifying the chunk size (number of characters) and the chunk overlap (number of overlapping characters between consecutive chunks).

    from langchain_community.document_loaders import PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    # Load the PDF
    loader = PyPDFLoader("https://investors.mongodb.com/node/12236/pdf")
    data = loader.load()
    # Split the data into chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=20)
    documents = text_splitter.split_documents(data)
  3. Convert the data to vector embeddings.

    Run this code to prepare the chunked documents for ingestion by creating a list of documents with their corresponding vector embeddings. You generate these embeddings by using the get_embedding function that you just defined.

    # Prepare documents for insertion
    docs_to_insert = [{
    "text": doc.page_content,
    "embedding": get_embedding(doc.page_content)
    } for doc in documents]
  4. Store the data and embeddings in Atlas.

    Run this code to insert the documents that contain the embeddings into the rag_db.test collection in your Atlas cluster. Before you run the code, replace <connection-string> with your Atlas connection string.

    from pymongo import MongoClient
    # Connect to your Atlas cluster
    client = MongoClient("<connection-string>")
    collection = client["rag_db"]["test"]
    # Insert documents into the collection
    result = collection.insert_many(docs_to_insert)

    Tip

    After you run the code, you can view your vector embeddings in the Atlas UI by navigating to the rag_db.test collection in your cluster.
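The chunk_size and chunk_overlap parameters used in step 2 can be illustrated with a minimal character-based splitter. This is a simplified sketch of the idea only, not LangChain's actual RecursiveCharacterTextSplitter algorithm, which splits recursively on separators such as paragraphs and sentences:

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Naively split text into fixed-size character chunks, where each
    chunk repeats the last chunk_overlap characters of the previous one."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

The overlap preserves context that would otherwise be lost at chunk boundaries, at the cost of storing some characters twice.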

3

In this section, you create a retrieval system with Atlas Vector Search to get relevant documents from your vector database. Paste and run each of the following code snippets in your notebook:

  1. Create an Atlas Vector Search index on your vector embeddings.

    Run the following code to create the index directly from your application by using the PyMongo driver. This code also includes a polling mechanism that checks whether the index is ready to use.

    To learn more, see How to Index Fields for Vector Search.

    from pymongo.operations import SearchIndexModel
    import time
    # Create your index model, then create the search index
    index_name="vector_index"
    search_index_model = SearchIndexModel(
    definition = {
    "fields": [
    {
    "type": "vector",
    "numDimensions": 768,
    "path": "embedding",
    "similarity": "cosine"
    }
    ]
    },
    name = index_name,
    type = "vectorSearch"
    )
    collection.create_search_index(model=search_index_model)
    # Wait for initial sync to complete
    print("Polling to check if the index is ready. This may take up to a minute.")
    predicate=None
    if predicate is None:
    predicate = lambda index: index.get("queryable") is True
    while True:
    indices = list(collection.list_search_indexes(index_name))
    if len(indices) and predicate(indices[0]):
    break
    time.sleep(5)
    print(index_name + " is ready for querying.")
  2. Define a function to run vector search queries.

    Run this code to create a retrieval function named get_query_results that runs a basic vector search query. It uses the get_embedding function to create embeddings from the search query. Then, it runs the query to return semantically similar documents.

    To learn more, see Run Vector Search Queries.

    # Define a function to run vector search queries
    def get_query_results(query):
    """Gets results from a vector search query."""
    query_embedding = get_embedding(query)
    pipeline = [
    {
    "$vectorSearch": {
    "index": "vector_index",
    "queryVector": query_embedding,
    "path": "embedding",
    "exact": True,
    "limit": 5
    }
    }, {
    "$project": {
    "_id": 0,
    "text": 1
    }
    }
    ]
    results = collection.aggregate(pipeline)
    array_of_results = []
    for doc in results:
    array_of_results.append(doc)
    return array_of_results
    # Test the function with a sample query
    import pprint
    pprint.pprint(get_query_results("AI technology"))
    [{'text': 'more of our customers. We also see a tremendous opportunity to win '
    'more legacy workloads, as AI has now become a catalyst to modernize '
    'these\n'
    "applications. MongoDB's document-based architecture is "
    'particularly well-suited for the variety and scale of data required '
    'by AI-powered applications.'},
    {'text': 'artificial intelligence, in our offerings or partnerships; the '
    'growth and expansion of the market for database products and our '
    'ability to penetrate that\n'
    'market; our ability to integrate acquired businesses and '
    'technologies successfully or achieve the expected benefits of such '
    'acquisitions; our ability to'},
    {'text': 'MongoDB continues to expand its AI ecosystem with the announcement '
    'of the MongoDB AI Applications Program (MAAP),'},
    {'text': 'which provides customers with reference architectures, pre-built '
    'partner integrations, and professional services to help\n'
    'them quickly build AI-powered applications. Accenture will '
    'establish a center of excellence focused on MongoDB projects,\n'
    'and is the first global systems integrator to join MAAP.'},
    {'text': 'Bendigo and Adelaide Bank partnered with MongoDB to modernize '
    'their core banking technology. With the help of\n'
    'MongoDB Relational Migrator and generative AI-powered modernization '
    'tools, Bendigo and Adelaide Bank decomposed an\n'
    'outdated consumer-servicing application into microservices and '
    'migrated off its underlying legacy relational database'}]
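The cosine similarity metric specified in the index definition above scores the angle between the query embedding and each stored embedding, ignoring vector magnitude. A minimal sketch of the computation (illustrative only, not how Atlas computes it internally):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: the dot product of the two vectors divided by
    the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [1.0, 0.0]
print(cosine_similarity(query, [2.0, 0.0]))  # 1.0 (same direction, magnitude ignored)
print(cosine_similarity(query, [0.0, 3.0]))  # 0.0 (orthogonal)
```

Because the embedding model in this tutorial produces normalized vectors, cosine similarity reduces to a plain dot product over them.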
4

In this section, you generate responses by prompting an LLM to use the retrieved documents as context.

Replace <token> in the following code with your Hugging Face access token, then run the code in your notebook. This code does the following:

  • Uses the get_query_results function that you defined to retrieve relevant documents from Atlas.

  • Creates a prompt by using the user's question and the retrieved documents as context.

  • Accesses the Mistral 7B Instruct model from Hugging Face's model hub.

  • Prompts the LLM about MongoDB's latest AI announcements. The generated response might vary.

import os
from huggingface_hub import InferenceClient
# Specify search query, retrieve relevant documents, and convert to string
query = "What are MongoDB's latest AI announcements?"
context_docs = get_query_results(query)
context_string = " ".join([doc["text"] for doc in context_docs])
# Construct prompt for the LLM using the retrieved documents as the context
prompt = f"""Use the following pieces of context to answer the question at the end.
{context_string}
Question: {query}
"""
# Authenticate to Hugging Face and access the model
os.environ["HF_TOKEN"] = "<token>"
llm = InferenceClient(
"mistralai/Mistral-7B-Instruct-v0.3",
token = os.getenv("HF_TOKEN"))
# Prompt the LLM (this code varies depending on the model you use)
output = llm.chat_completion(
messages=[{"role": "user", "content": prompt}],
max_tokens=150
)
print(output.choices[0].message.content)
MongoDB's latest AI announcements include the
MongoDB AI Applications Program (MAAP), a program designed
to help customers build AI-powered applications more efficiently.
Additionally, they have announced significant performance
improvements in MongoDB 8.0, featuring faster reads, updates,
bulk inserts, and time series queries. Another announcement is the
general availability of Atlas Stream Processing to build sophisticated,
event-driven applications with real-time data.

For more detailed RAG tutorials, see the following resources:

To start building production-ready chatbots with Atlas Vector Search, you can use the MongoDB Chatbot Framework. This framework provides a set of libraries that enable you to quickly build AI chatbot applications.

To optimize and fine-tune your RAG applications, see How to Measure the Accuracy of Your Query Results and Improve Vector Search Performance.

You can also experiment with different embedding models, chunking strategies, and LLMs. To learn more, see the following resources:

Additionally, Atlas Vector Search supports advanced retrieval systems. Because you can seamlessly index vector data along with your other data in Atlas, you can fine-tune your retrieval results by pre-filtering on other fields in your collection, or by performing hybrid search that combines results from semantic search and full-text search.
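As a sketch of the pre-filtering idea mentioned above, a $vectorSearch stage can include a filter clause on fields that are indexed as the filter type in the vector index definition. The year field here is a hypothetical example, not part of this tutorial's data, and this snippet only builds the pipeline document rather than running a query:

```python
# Hypothetical pre-filtered vector search pipeline. The "year" field is an
# assumed filter field; it would need to be indexed as type "filter" in the
# vector index definition alongside the "embedding" field.
def build_prefiltered_pipeline(query_embedding, min_year):
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",
                "queryVector": query_embedding,
                "path": "embedding",
                "filter": {"year": {"$gte": min_year}},
                "numCandidates": 100,
                "limit": 5,
            }
        },
        {"$project": {"_id": 0, "text": 1}},
    ]

pipeline = build_prefiltered_pipeline([0.1, 0.2, 0.3], 2024)
print(pipeline[0]["$vectorSearch"]["filter"])  # {'year': {'$gte': 2024}}
```

You would pass this pipeline to collection.aggregate() the same way as the unfiltered query in the tutorial; the filter narrows the candidate set before the vector comparison runs.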
