Highlight Search Terms in Results

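The Atlas Search highlight option adds fields to the result set that display search terms in their original context. You can use it with all $search operators to display the search terms as they appear in the returned documents, along with the adjacent text content (if any). highlight results are returned as part of the $meta field.

highlight has the following syntax: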

{
  $search: {
    "index": "<index name>", // optional, defaults to "default"
    "<operator>": { // such as "text", "compound", or "phrase"
      <operator-specification>
    },
    "highlight": {
      "path": "<field-to-search>",
      "maxCharsToExamine": "<number-of-chars-to-examine>", // optional, defaults to 500,000
      "maxNumPassages": "<number-of-passages>" // optional, defaults to 5
    }
  }
},
{
  $project: {
    "highlights": { "$meta": "searchHighlights" }
  }
}

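
The syntax above can also be assembled programmatically. The following is a minimal sketch, not part of the Atlas Search API: the function name buildHighlightPipeline and its parameter names are hypothetical. It leaves out the optional settings unless the caller supplies them, so that the documented defaults apply.

```javascript
// Hypothetical helper: assembles a $search stage with a highlight option,
// followed by a $project stage that exposes the highlights.
function buildHighlightPipeline({ operator, spec, path, index, maxCharsToExamine, maxNumPassages }) {
  const highlight = { path };
  // Add the optional settings only when they are supplied, so that the
  // documented defaults (500,000 and 5) apply otherwise.
  if (maxCharsToExamine !== undefined) highlight.maxCharsToExamine = maxCharsToExamine;
  if (maxNumPassages !== undefined) highlight.maxNumPassages = maxNumPassages;

  // highlight sits beside the operator, directly under $search.
  const search = { [operator]: spec, highlight };
  if (index !== undefined) search.index = index; // when omitted, "default" is used

  return [
    { $search: search },
    { $project: { highlights: { $meta: "searchHighlights" } } }
  ];
}

const pipeline = buildHighlightPipeline({
  operator: "text",
  spec: { path: "description", query: "variety" },
  path: "description",
  maxNumPassages: 1
});
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh, the returned array could be passed to the aggregate() method of a collection.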
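| Field | Type | Description | Required? |
|---|---|---|---|
| path | string | Document field to search. The path field may contain: a string; an array of strings; a multi analyzer specification; an array containing a combination of strings and multi analyzer specifications; or a wildcard character *. For more information, see Construct a Query Path. | yes |
| maxCharsToExamine | int | Maximum number of characters to examine in a document when performing highlighting for a field. If omitted, defaults to 500,000, which means that Atlas Search examines only the first 500,000 characters of the search field in each document for highlighting. | no |
| maxNumPassages | int | Number of high-scoring passages to return per document in the highlights results for each field. A passage is roughly the length of a sentence. If omitted, defaults to 5, which means that for each document, Atlas Search returns the 5 highest-scoring passages that match the search text. | no |

The "$meta": "searchHighlights" field contains the highlighted results. That field is not part of the original document, so you must use a $project pipeline stage to add it to the query output.

The highlights field is an array containing the following output fields: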

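| Field | Type | Description |
|---|---|---|
| path | string | Document field that returned a match. |
| texts | array of documents | Each search match returns one or more objects containing the matching text and the surrounding text (if any). |
| texts.value | string | Text from the field that returned a match. |
| texts.type | string | Type of result. The value is one of the following: hit (the result contains a term that matches the query) or text (the result contains text content adjacent to a matching term). |
| score | float | Score assigned to the matching result. The highlights score measures how relevant the highlights object is to the query. If multiple highlights objects are returned, the most relevant highlights object has the highest score. |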
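
To make the output fields concrete, the following sketch turns a highlights array into display strings on the client side. It is an illustration only: renderHighlights and the <mark> markers are assumptions, not something Atlas Search provides. The sample data is copied from the example output on this page.

```javascript
// Hypothetical helper: renders each passage as a string, highest score first,
// wrapping every "hit" segment in the given markers.
function renderHighlights(highlights, open = "<mark>", close = "</mark>") {
  return [...highlights] // copy, so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .map((passage) =>
      passage.texts
        .map(({ value, type }) => (type === "hit" ? open + value + close : value))
        .join("")
    );
}

// Two passages returned for one document.
const appleHighlights = [
  {
    path: "description",
    texts: [
      { value: "Apples come in several ", type: "text" },
      { value: "varieties", type: "hit" },
      { value: ", including Fuji, Granny Smith, and Honeycrisp. ", type: "text" }
    ],
    score: 1.0330637693405151
  },
  {
    path: "description",
    texts: [
      { value: "The most popular ", type: "text" },
      { value: "varieties", type: "hit" },
      { value: " are McIntosh, Gala, and Granny Smith. ", type: "text" }
    ],
    score: 1.0940992832183838
  }
];

const rendered = renderHighlights(appleHighlights);
console.log(rendered.join("\n"));
```

The values are concatenated as-is. If the strings are inserted into an HTML page, each value should be escaped first.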

You must index the field that you want to highlight as an Atlas Search string type with indexOptions set to offsets (the default).
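
As an illustration of this prerequisite, an index definition with a static mapping might look like the sketch below. The field name description is taken from the sample collection on this page. Whether to use a static mapping at all is an assumption made for the example, since the sample index on this page uses dynamic mappings.

```javascript
// Sketch of an index definition that maps one field as a string type.
const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      description: {
        type: "string",
        indexOptions: "offsets" // the default value, written out for clarity
      }
    }
  }
};

console.log(JSON.stringify(indexDefinition, null, 2));
```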

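You can't use the Atlas Search highlight option with the embeddedDocument operator.

You can try the following examples in the Atlas Search Playground or on your Atlas cluster.

The examples on this page use a collection named fruit that contains the following documents: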

{
  "_id" : 1,
  "type" : "fruit",
  "summary" : "Apple varieties",
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "category": "organic"
},
{
  "_id" : 2,
  "type" : "fruit",
  "summary" : "Banana",
  "description" : "Bananas are usually sold in bunches of five or six.",
  "category": "nonorganic"
},
{
  "_id" : 3,
  "type" : "fruit",
  "summary" : "Pear varieties",
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "category": "nonorganic"
}

The fruit collection also has an index definition that uses the English analyzer and dynamic field mappings:

{
  "analyzer": "lucene.english",
  "searchAnalyzer": "lucene.english",
  "mappings": {
    "dynamic": true
  }
}

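Note

A useful aspect of highlighting is that it reveals the original text returned by the search query, which may not be exactly the same as the search term. For example, if you use a language-specific analyzer, text searches return all stemmed variations of the search term.

Another useful aspect of highlighting is that it can highlight any field, whether inside or outside the query path. For example, when you search for a term, you can highlight the query term in the query field and in any other field that you specify in the highlight option. To learn more, see the multi-field example.

The following queries demonstrate the $search highlight option in Atlas Search queries.

The following query searches for variety and bunch in the description field of the fruit collection, with the highlight option enabled.

The $project pipeline stage restricts the output to the description field and adds a new field named highlights, which contains the highlighting information.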

db.fruit.aggregate([
  {
    $search: {
      "text": {
        "path": "description",
        "query": ["variety", "bunch"]
      },
      "highlight": {
        "path": "description"
      }
    }
  },
  {
    $project: {
      "description": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

{
  "description" : "Bananas are usually sold in bunches of five or six. ",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bananas are usually sold in ", "type" : "text" },
        { "value" : "bunches", "type" : "hit" },
        { "value" : " of five or six. ", "type" : "text" }
      ],
      "score" : 1.2841906547546387
    }
  ]
}
{
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bosc and Bartlett are the most common ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " of pears.", "type" : "text" }
      ],
      "score" : 1.2691514492034912
    }
  ]
}
{
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith. ",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Apples come in several ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : ", including Fuji, Granny Smith, and Honeycrisp. ", "type" : "text" }
      ],
      "score" : 1.0330637693405151
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "The most popular ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " are McIntosh, Gala, and Granny Smith. ", "type" : "text" }
      ],
      "score" : 1.0940992832183838
    }
  ]
}

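The search term bunch returns a match on the document with _id: 2, because the description field contains the word bunches. The search term variety returns a match on the documents with _id: 3 and _id: 1, because the description field contains the word varieties.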
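
To see which stemmed variants actually matched, you can collect the hit values from a result document. The sketch below is a hypothetical client-side helper (hitTerms is not an Atlas Search function), applied to a shortened copy of the first result above.

```javascript
// Hypothetical helper: lists the distinct matched terms (type "hit")
// found in the highlights of one result document, in lowercase.
function hitTerms(resultDoc) {
  const terms = new Set();
  for (const passage of resultDoc.highlights) {
    for (const { value, type } of passage.texts) {
      if (type === "hit") terms.add(value.toLowerCase());
    }
  }
  return [...terms];
}

// First result of the query above: the query term was "bunch".
const bananaResult = {
  description: "Bananas are usually sold in bunches of five or six. ",
  highlights: [
    {
      path: "description",
      texts: [
        { value: "Bananas are usually sold in ", type: "text" },
        { value: "bunches", type: "hit" },
        { value: " of five or six. ", type: "text" }
      ],
      score: 1.2841906547546387
    }
  ]
};

console.log(hitTerms(bananaResult)); // the matched word differs from the query term
```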

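The following query searches for variety and bunch in the description field of the fruit collection, with the highlight option enabled, the maximum number of characters to examine set to 40, and only 1 high-scoring passage returned per document.

The $project pipeline stage restricts the output to the description field and adds a new field named highlights, which contains the highlighting information.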

db.fruit.aggregate([
  {
    $search: {
      "text": {
        "path": "description",
        "query": ["variety", "bunch"]
      },
      "highlight": {
        "path": "description",
        "maxNumPassages": 1,
        "maxCharsToExamine": 40
      }
    }
  },
  {
    $project: {
      "description": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

{
  "description" : "Bananas are usually sold in bunches of five or six. ",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bananas are usually sold in ", "type" : "text" },
        { "value" : "bunches", "type" : "hit" },
        { "value" : " of f", "type" : "text" }
      ],
      "score" : 1.313065767288208
    }
  ]
}
{
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "highlights" : [ ]
}
{
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Apples come in several ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : ", includ", "type" : "text" }
      ],
      "score" : 0.9093900918960571
    }
  ]
}

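The second document in the preceding results contains an empty highlights array, even though the search field contains the term varieties, because Atlas Search examined only 40 characters for highlighting. Similarly, the word including is truncated to includ because Atlas Search examined only 40 characters of the search field. In the third document, although multiple passages contain the search term, Atlas Search returns only one passage in the highlights results because the query requested only 1 passage per document.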
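
The arithmetic behind these results can be checked with a short sketch. It approximates the examined window with String.prototype.slice on the sample descriptions. This is an assumption for illustration only and says nothing about how Atlas Search counts characters internally.

```javascript
// Approximate the window that a maxCharsToExamine of 40 covers
// by taking the first 40 characters of each sample description.
const MAX_CHARS = 40;

const descriptions = [
  "Bananas are usually sold in bunches of five or six.",
  "Bosc and Bartlett are the most common varieties of pears.",
  "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith."
];

const examined = descriptions.map((text) => text.slice(0, MAX_CHARS));

for (const windowText of examined) {
  console.log(JSON.stringify(windowText));
}
// "Bananas are usually sold in bunches of f"
// "Bosc and Bartlett are the most common va"
// "Apples come in several varieties, includ"
```

In the second window the word varieties is cut off after va, which is consistent with the empty highlights array for that document. The first and third windows end at " of f" and ", includ", which matches the truncated text values in the output.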

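The following query searches for varieties in the description field of the fruit collection, with the highlight option enabled for both the description and summary fields.

The $project pipeline stage adds a new field named highlights, which contains the highlighting information for the query term across all fields listed in the highlight option.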

db.fruit.aggregate([
  {
    $search: {
      "text": {
        "path": "description",
        "query": "varieties"
      },
      "highlight": {
        "path": ["description", "summary"]
      }
    }
  },
  {
    $project: {
      "description": 1,
      "summary": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

{
  "summary" : "Pear varieties",
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "highlights" : [
    {
      "path" : "summary",
      "texts" : [
        { "value" : "Pear ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" }
      ],
      "score" : 1.3891443014144897
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bosc and Bartlett are the most common ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " of pears.", "type" : "text" }
      ],
      "score" : 1.2691514492034912
    }
  ]
}
{
  "summary" : "Apple varieties",
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "highlights" : [
    {
      "path" : "summary",
      "texts" : [
        { "value" : "Apple ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" }
      ],
      "score" : 1.3859853744506836
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "Apples come in several ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : ", including Fuji, Granny Smith, and Honeycrisp. ", "type" : "text" }
      ],
      "score" : 1.0330637693405151
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "The most popular ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " are McIntosh, Gala, and Granny Smith.", "type" : "text" }
      ],
      "score" : 1.0940992832183838
    }
  ]
}

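The search term varieties returns a match on the documents with _id: 1 and _id: 3, because the query field description in both documents contains the query term varieties. In addition, the highlights array includes the summary field, because that field also contains the query term varieties.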
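
When several fields are highlighted, all passages arrive in a single highlights array, so a client usually has to separate them by path. The following is a minimal sketch of that step. The helper name groupByPath is hypothetical, and the data is a shortened copy of the first result above.

```javascript
// Hypothetical helper: groups highlight passages by the field they came from.
function groupByPath(highlights) {
  const byPath = {};
  for (const passage of highlights) {
    if (!byPath[passage.path]) byPath[passage.path] = [];
    byPath[passage.path].push(passage);
  }
  return byPath;
}

const pearHighlights = [
  {
    path: "summary",
    texts: [
      { value: "Pear ", type: "text" },
      { value: "varieties", type: "hit" }
    ],
    score: 1.3891443014144897
  },
  {
    path: "description",
    texts: [
      { value: "Bosc and Bartlett are the most common ", type: "text" },
      { value: "varieties", type: "hit" },
      { value: " of pears.", type: "text" }
    ],
    score: 1.2691514492034912
  }
];

const grouped = groupByPath(pearHighlights);
console.log(Object.keys(grouped)); // one key per highlighted field
```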

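The following query searches for the term variety in the fields of the fruit collection that begin with des, with the highlight option enabled for the fields that begin with des.

The $project pipeline stage adds a new field named highlights, which contains the highlighting information.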

db.fruit.aggregate([
  {
    "$search": {
      "text": {
        "path": {"wildcard": "des*"},
        "query": ["variety"]
      },
      "highlight": {
        "path": {"wildcard": "des*"}
      }
    }
  },
  {
    "$project": {
      "description": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

{
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bosc and Bartlett are the most common ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " of pears.", "type" : "text" }
      ],
      "score" : 1.2691514492034912
    }
  ]
},
{
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Apples come in several ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : ", including Fuji, Granny Smith, and Honeycrisp. ", "type" : "text" }
      ],
      "score" : 1.0330637693405151
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "The most popular ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " are McIntosh, Gala, and Granny Smith.", "type" : "text" }
      ],
      "score" : 1.0940992832183838
    }
  ]
}

In the Atlas Search results, the fields that begin with des are highlighted.

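The following query searches for the term organic in the category field and for the term variety in the description field. The highlight option in the $search compound query requests highlighting information only for the text query against the description field. Note that the highlight option must be a direct child of the $search stage, not a child of any operator inside the $search stage.

The $project pipeline stage adds a new field named highlights, which contains the highlighting information.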

db.fruit.aggregate([
  {
    "$search": {
      "compound": {
        "should": [{
          "text": {
            "path": "category",
            "query": "organic"
          }
        },
        {
          "text": {
            "path": "description",
            "query": "variety"
          }
        }]
      },
      "highlight": {
        "path": "description"
      }
    }
  },
  {
    "$project": {
      "description": 1,
      "category": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

[
  {
    description: 'Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.',
    category: 'organic',
    highlights: [
      {
        score: 1.0330637693405151,
        path: 'description',
        texts: [
          { value: 'Apples come in several ', type: 'text' },
          { value: 'varieties', type: 'hit' },
          {
            value: ', including Fuji, Granny Smith, and Honeycrisp. ',
            type: 'text'
          }
        ]
      },
      {
        score: 1.0940992832183838,
        path: 'description',
        texts: [
          { value: 'The most popular ', type: 'text' },
          { value: 'varieties', type: 'hit' },
          {
            value: ' are McIntosh, Gala, and Granny Smith.',
            type: 'text'
          }
        ]
      }
    ]
  },
  {
    description: 'Bosc and Bartlett are the most common varieties of pears.',
    category: 'nonorganic',
    highlights: [
      {
        score: 1.2691514492034912,
        path: 'description',
        texts: [
          {
            value: 'Bosc and Bartlett are the most common ',
            type: 'text'
          },
          { value: 'varieties', type: 'hit' },
          { value: ' of pears.', type: 'text' }
        ]
      }
    ]
  }
]
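
Because the highlight option must sit directly under $search, a misplaced highlight inside an operator is an easy mistake in compound queries. The sketch below is a hypothetical client-side check, not an Atlas Search feature. It walks the body of a $search stage and reports any highlight key found below the top level.

```javascript
// Hypothetical lint: returns the dotted paths of "highlight" keys that are
// nested inside an operator instead of sitting directly under $search.
function findMisplacedHighlights(searchBody) {
  const found = [];
  const visit = (node, trail) => {
    if (Array.isArray(node)) {
      node.forEach((child, i) => visit(child, trail.concat(i)));
      return;
    }
    if (node === null || typeof node !== "object") return;
    for (const [key, value] of Object.entries(node)) {
      const here = trail.concat(key);
      if (key === "highlight") {
        // A top-level highlight (trail length 1) is the correct placement.
        if (here.length > 1) found.push(here.join("."));
        continue;
      }
      visit(value, here);
    }
  };
  visit(searchBody, []);
  return found;
}

// Correct placement, as in the query above.
const goodSearch = {
  compound: {
    should: [
      { text: { path: "category", query: "organic" } },
      { text: { path: "description", query: "variety" } }
    ]
  },
  highlight: { path: "description" }
};

// Incorrect placement: highlight nested inside a text operator.
const badSearch = {
  compound: {
    should: [
      { text: { path: "category", query: "organic" } },
      { text: { path: "description", query: "variety", highlight: { path: "description" } } }
    ]
  }
};

console.log(findMisplacedHighlights(goodSearch));
console.log(findMisplacedHighlights(badSearch));
```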

For this example, the fruit collection also has the following index definition.

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "description": [
        {
          "type": "autocomplete",
          "tokenization": "edgeGram",
          "minGrams": 2,
          "maxGrams": 15,
          "foldDiacritics": true
        }
      ]
    }
  }
}

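The following query searches for the characters var in the description field of the fruit collection, with the highlight option enabled for the description field.

The $project pipeline stage adds a new field named highlights, which contains the highlighting information.

Important

To highlight the autocomplete indexed version of a path, the autocomplete operator must be the only operator that uses that path in the query.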

db.fruit.aggregate([
  {
    "$search": {
      "autocomplete": {
        "path": "description",
        "query": ["var"]
      },
      "highlight": {
        "path": "description"
      }
    }
  },
  {
    "$project": {
      "description": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

{
  "description": "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "highlights": [
    {
      "score": 0.774385392665863,
      "path": "description",
      "texts": [
        { "value": "Apples come in several ", "type": "text" },
        { "value": "varieties, including Fuji", "type": "hit" },
        { "value": ", Granny Smith, and Honeycrisp. ", "type": "text" }
      ]
    },
    {
      "score": 0.7879307270050049,
      "path": "description",
      "texts": [
        { "value": "The most popular ", "type": "text" },
        { "value": "varieties are McIntosh", "type": "hit" },
        { "value": ", Gala, and Granny Smith.", "type": "text" }
      ]
    }
  ]
},
{
  "description": "Bosc and Bartlett are the most common varieties of pears.",
  "highlights": [
    {
      "score": 0.9964432120323181,
      "path": "description",
      "texts": [
        {
          "value": "Bosc and Bartlett are the most common ",
          "type": "text"
        },
        { "value": "varieties of pears", "type": "hit" },
        { "value": ".", "type": "text" }
      ]
    }
  ]
}

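Atlas Search returns a match on the documents with _id: 1 and _id: 3 for the query string var, because the description field in those documents contains the characters var at the beginning of a word. When a highlighted path is referenced only in the autocomplete operators of the highlighted query, Atlas Search matches highlight hits to your query terms more coarsely.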
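
Because autocomplete hits can span several words (for example, "varieties, including Fuji"), a client may want to narrow each hit to the word that actually starts with the typed prefix. The following is a minimal sketch of such post-processing. The helper narrowHit is hypothetical, and the word-splitting rule is an assumption that may not match the tokenization Atlas Search applies.

```javascript
// Hypothetical helper: returns the first word in an autocomplete hit that
// starts with the query prefix (case-insensitive), or null if none does.
function narrowHit(hitValue, query) {
  const prefix = query.toLowerCase();
  // Split on anything that is not a letter or a digit.
  const words = hitValue.split(/[^\p{L}\p{N}]+/u).filter(Boolean);
  return words.find((word) => word.toLowerCase().startsWith(prefix)) || null;
}

const coarseHits = [
  "varieties, including Fuji",
  "varieties are McIntosh",
  "varieties of pears"
];

for (const hit of coarseHits) {
  console.log(narrowHit(hit, "var"));
}
```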
