Assign Weights to Text Search Results on Self-Managed Deployments

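When MongoDB returns text search results, it assigns a score to each returned document. The score indicates how relevant the document is to the given search query. You can sort the returned documents by score so that the most relevant documents appear first in the result set.

If you have a compound index with multiple text index keys, you can specify a different weight for each indexed field. The weight of an indexed field denotes its significance relative to the other indexed fields; higher weights result in higher text search scores.

For example, you can emphasize search matches on a title field if you know that users are likely to search for titles, or if title contains more relevant search terms than other document fields.

The default weight of an indexed field is 1. To adjust the weights of the indexed fields, include the weights option in the db.collection.createIndex() method, as in the following example: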

db.<collection>.createIndex(
   {
     <field1>: "text",
     <field2>: "text",
     ...
   },
   {
     weights: {
       <field1>: <weight>,
       <field2>: <weight>,
       ...
     },
     name: <indexName>
   }
 )
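The key document and the weights option name the same fields, so both can be derived from one map. The following is a minimal sketch in plain JavaScript. The helper textIndexArgs, the body field, and the index name ExampleTextIndex are hypothetical names used only for illustration; they are not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): derive both createIndex()
// arguments from a single { field: weight } map.
function textIndexArgs(weights, indexName) {
  const keys = {};
  for (const field of Object.keys(weights)) {
    keys[field] = "text"; // every weighted field becomes a text index key
  }
  return { keys, options: { weights: { ...weights }, name: indexName } };
}

// Example field names and index name are assumptions for illustration.
const { keys, options } = textIndexArgs({ title: 10, body: 1 }, "ExampleTextIndex");

// In mongosh, the two values could then be passed as:
//   db.<collection>.createIndex(keys, options)
```

Building both arguments from one map keeps the key document and the weights from drifting apart when a field is added or removed.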

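Important

If you change the weights of an index after creating it, MongoDB must reindex the collection. Reindexing can negatively impact performance, especially on large collections. For more information, see Index Builds on Populated Collections.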
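One way to apply new weights is to drop the text index and create it again with the new values. The sketch below assumes that db is the database handle that mongosh provides. The function name reweightTextIndex and its parameters are hypothetical.

```javascript
// Sketch only: replace the weights of an existing text index by
// dropping the index and creating it again with new weights.
// `db` is assumed to be the mongosh database handle; the function
// name and its parameters are hypothetical.
function reweightTextIndex(db, collectionName, indexName, keys, weights) {
  const coll = db.getCollection(collectionName);
  coll.dropIndex(indexName); // remove the index that has the old weights
  return coll.createIndex(keys, { weights: weights, name: indexName });
}

// Possible use in mongosh (argument values are placeholders):
//   reweightTextIndex(db, "<collection>", "<indexName>",
//                     { <field1>: "text" }, { <field1>: <weight> })
```

Because the index is rebuilt from scratch, consider running a change like this during a period of low traffic on large collections.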

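About this task

You have a blog collection that contains a document for each blog post. Each document contains:

The content of the post.
The topic that the post covers.
A list of keywords related to the post.

You want to create a text index so that users can run text searches on blog posts. Your application supports searching on content, topics, and keywords.

You want to prioritize matches on the content field over the other document fields. Use index weights to give matches on content greater significance, and sort the query results so that content matches appear first.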

Before you begin

Create the blog collection with the following documents:

db.blog.insertMany( [
   {
     _id: 1,
     content: "This morning I had a cup of coffee.",
     about: "beverage",
     keywords: [ "coffee" ]
   },
   {
     _id: 2,
     content: "Who likes chocolate ice cream for dessert?",
     about: "food",
     keywords: [ "poll" ]
   },
   {
     _id: 3,
     content: "My favorite flavors are strawberry and coffee",
     about: "ice cream",
     keywords: [ "food", "dessert" ]
   }
] )

Steps

Create a text index with a different weight for each indexed field:

db.blog.createIndex(
   {
     content: "text",
     keywords: "text",
     about: "text"
   },
   {
     weights: {
       content: 10,
       keywords: 5
     },
     name: "BlogTextIndex"
   }
 )

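The text index has the following fields and weights:

content has a weight of 10.
keywords has a weight of 5.
about has the default weight of 1.

These weights denote the relative significance of the indexed fields to each other.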
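To confirm the weights that were stored with the index, you can inspect the index definitions that getIndexes() returns. The following is a minimal sketch that assumes db is the mongosh database handle. The helper name getTextIndexWeights is hypothetical.

```javascript
// Hypothetical helper: return the weights recorded for a named index,
// or null if the index is missing or has no weights.
// `db` is assumed to be the mongosh database handle.
function getTextIndexWeights(db, collectionName, indexName) {
  const indexes = db.getCollection(collectionName).getIndexes();
  const match = indexes.find((ix) => ix.name === indexName);
  return match && match.weights ? match.weights : null;
}

// Possible use in mongosh:
//   getTextIndexWeights(db, "blog", "BlogTextIndex")
```

For the index created in this procedure, the result should list content, keywords, and about with their respective weights.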

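Results

The following examples show how different weights for indexed fields affect result scores. Each example sorts the results by each document's textScore. To access a document's textScore property, use the $meta operator.

Matches in the content and about fields

The following query searches the documents in the blog collection for the string ice cream: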

db.blog.find(
   {
      $text: { $search: "ice cream" }
   },
   {
      score: { $meta: "textScore" }
   }
).sort( { score: { $meta: "textScore" } } )

Output:

[
  {
    _id: 2,
    content: 'Who likes chocolate ice cream for dessert?',
    about: 'food',
    keywords: [ 'poll' ],
    score: 12
  },
  {
    _id: 3,
    content: 'My favorite flavors are strawberry and coffee',
    about: 'ice cream',
    keywords: [ 'food', 'dessert' ],
    score: 1.5
  }
]

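The search string ice cream matches:

The content field in the document with _id: 2.
The about field in the document with _id: 3.

A term match in the content field has 10 times the impact (10:1 weight) of a term match in the about field.

Matches in the keywords and about fields

The following query searches the documents in the blog collection for the string food: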

db.blog.find(
   {
      $text: { $search: "food" }
   },
   {
      score: { $meta: "textScore" }
   }
).sort( { score: { $meta: "textScore" } } )

Output:

[
  {
    _id: 3,
    content: 'My favorite flavors are strawberry and coffee',
    about: 'ice cream',
    keywords: [ 'food', 'dessert' ],
    score: 5.5
  },
  {
    _id: 2,
    content: 'Who likes chocolate ice cream for dessert?',
    about: 'food',
    keywords: [ 'poll' ],
    score: 1.1
  }
]

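The search string food matches:

The keywords field in the document with _id: 3.
The about field in the document with _id: 2.

A term match in the keywords field has 5 times the impact (5:1 weight) of a term match in the about field.

Multiple matches in a single document

The following query searches the documents in the blog collection for the string coffee: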

db.blog.find(
   {
      $text: { $search: "coffee" }
   },
   {
      score: { $meta: "textScore" }
   }
).sort( { score: { $meta: "textScore" } } )

Output:

[
  {
    _id: 1,
    content: 'This morning I had a cup of coffee.',
    about: 'beverage',
    keywords: [ 'coffee' ],
    score: 11.666666666666666
  },
  {
    _id: 3,
    content: 'My favorite flavors are strawberry and coffee',
    about: 'ice cream',
    keywords: [ 'food', 'dessert' ],
    score: 6
  }
]

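The search string coffee matches:

The content and keywords fields in the document with _id: 1.
The content field in the document with _id: 3.

To calculate the score when a search string matches multiple fields, MongoDB multiplies the number of matches in each field by that field's weight and sums the results.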
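The multiply-and-sum rule can be illustrated with a simplified model in plain JavaScript. This is a sketch of the rule as stated, not MongoDB's implementation. The function weightedMatchSum and the match counts passed to it are assumptions made for illustration.

```javascript
// Simplified model of the multiply-and-sum rule; not MongoDB's implementation.
const weights = { content: 10, keywords: 5, about: 1 };

// matchCounts maps a field name to the number of term matches in that field.
function weightedMatchSum(matchCounts, fieldWeights) {
  let total = 0;
  for (const [field, count] of Object.entries(matchCounts)) {
    // Fields without an explicit weight use the default weight of 1.
    const weight = field in fieldWeights ? fieldWeights[field] : 1;
    total += count * weight;
  }
  return total;
}

// _id: 1 matches "coffee" once in content and once in keywords.
const doc1 = weightedMatchSum({ content: 1, keywords: 1 }, weights); // 15
// _id: 3 matches "coffee" once in content only.
const doc3 = weightedMatchSum({ content: 1 }, weights); // 10
```

The model reproduces the ordering of the two documents, but not the scores in the output above (11.666666666666666 and 6), so the textScore that MongoDB reports evidently reflects factors beyond this simple sum.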

Learn more

To learn more about text search in MongoDB, see:

Note

Atlas Search

For data hosted on MongoDB Atlas, Atlas Search provides more robust custom scoring than text indexes. To learn more, see the Atlas Search scoring documentation.
