Highlight Search Terms in Results
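The Atlas Search highlight option adds fields to the result set that display search terms in their original context. You can use it with all $search operators to display search terms as they appear in the returned documents, along with the adjacent text content (if any). highlight results are returned as part of the $meta field.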
Syntax
highlight has the following syntax:
{
  $search: {
    "index": "<index name>", // optional, defaults to "default"
    "<operator>": { // such as "text", "compound", or "phrase"
      <operator-specification>
    },
    "highlight": {
      "path": "<field-to-search>",
      "maxCharsToExamine": "<number-of-chars-to-examine>", // optional, defaults to 500,000
      "maxNumPassages": "<number-of-passages>" // optional, defaults to 5
    }
  }
},
{
  $project: {
    "highlights": { "$meta": "searchHighlights" }
  }
}
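Options
Field | Type | Description | Required? |
---|---|---|---|
path | string | Document field to search. As the examples on this page show, the value can be a single field name, an array of field names, or a wildcard specification such as {"wildcard": "des*"}. | yes |
maxCharsToExamine | int | Maximum number of characters to examine on a document when performing highlighting for a field. If omitted, defaults to 500,000, which means that Atlas Search examines only the first 500,000 characters of the search field in each document for highlighting. | no |
maxNumPassages | int | Number of high-scoring passages to return per document in the highlights results for each field. A passage is roughly the length of a sentence. If omitted, defaults to 5, which means that for each document, Atlas Search returns the 5 highest-scoring passages that match the search text. | no |
The "$meta": "searchHighlights" field contains the highlighted results. That field isn't part of the original document, so you must use a $project pipeline stage to add it to the query output.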
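The two stages in the syntax above can be assembled as plain objects before they are passed to aggregate(). The following is a minimal sketch; buildHighlightPipeline is a hypothetical helper written for this page, not part of mongosh or any MongoDB driver.

```javascript
// Hypothetical helper (not a MongoDB API): builds the $search stage with a
// top-level "highlight" option, followed by the $project stage that exposes
// the highlights through "$meta": "searchHighlights".
function buildHighlightPipeline(operatorName, operatorSpec, highlight, indexName) {
  const search = { [operatorName]: operatorSpec, highlight };
  // "index" is optional; when omitted, the index named "default" is used.
  if (indexName !== undefined) search.index = indexName;
  return [
    { $search: search },
    { $project: { highlights: { $meta: "searchHighlights" } } },
  ];
}

const pipeline = buildHighlightPipeline(
  "text",
  { path: "description", query: ["variety", "bunch"] },
  { path: "description", maxNumPassages: 1 }
);
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh, the resulting array could then be passed to db.fruit.aggregate(pipeline).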
Output
The highlights field is an array that contains the following output fields:
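Field | Type | Description |
---|---|---|
path | string | Document field that returned a match. |
texts | array of documents | Each search match returns one or more objects that contain the matching text and the surrounding text (if any). |
texts.value | string | Text from the field that returned a match. |
texts.type | string | Type of result. The value is one of the following: hit (a term that matches the query) or text (adjacent text that gives context to the match). |
score | float | Score assigned to the matching result. |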
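One way to consume this output on the client is to concatenate the texts entries of a passage and wrap each hit in a marker. The sketch below assumes HTML output and a <mark> tag; renderHighlight and escapeHtml are hypothetical helpers, and the sample passage is copied from the basic example on this page.

```javascript
// Escape the characters that are significant in HTML text content.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Join one highlights entry into a string, wrapping "hit" segments in <mark>.
function renderHighlight(highlight) {
  return highlight.texts
    .map((t) => (t.type === "hit" ? `<mark>${escapeHtml(t.value)}</mark>` : escapeHtml(t.value)))
    .join("");
}

const passage = {
  path: "description",
  texts: [
    { value: "Bananas are usually sold in ", type: "text" },
    { value: "bunches", type: "hit" },
    { value: " of five or six. ", type: "text" },
  ],
  score: 1.2841906547546387,
};

console.log(renderHighlight(passage));
```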
Prerequisites
You must index the field that you want to highlight as an Atlas Search string type with indexOptions set to offsets (the default).
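As a sketch of this prerequisite, the following static mapping indexes the description field used later on this page as a string type with offsets. The meetsStringPrerequisite function is a hypothetical check that mirrors only the rule as stated here; it does not cover other cases on this page, such as the autocomplete example.

```javascript
// Static mapping sketch: "description" is indexed as a string with offsets.
const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      description: { type: "string", indexOptions: "offsets" },
    },
  },
};

// Hypothetical check: a string field qualifies when indexOptions is
// "offsets" or is omitted (offsets is the default).
function meetsStringPrerequisite(fieldDef) {
  return (
    fieldDef.type === "string" &&
    (fieldDef.indexOptions === undefined || fieldDef.indexOptions === "offsets")
  );
}

console.log(meetsStringPrerequisite(indexDefinition.mappings.fields.description));
```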
Limitations
You can't use the Atlas Search highlight option with the embeddedDocument operator.
Examples
You can try the following examples in the Atlas Search Playground or on your Atlas cluster.
Sample Collection
The examples on this page use a collection named fruit that contains the following documents:
{
  "_id" : 1,
  "type" : "fruit",
  "summary" : "Apple varieties",
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "category": "organic"
},
{
  "_id" : 2,
  "type" : "fruit",
  "summary" : "Banana",
  "description" : "Bananas are usually sold in bunches of five or six.",
  "category": "nonorganic"
},
{
  "_id" : 3,
  "type" : "fruit",
  "summary" : "Pear varieties",
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "category": "nonorganic"
}
Sample Index
The fruit collection also has an index definition that uses the english analyzer and dynamic field mappings.
{
  "analyzer": "lucene.english",
  "searchAnalyzer": "lucene.english",
  "mappings": {
    "dynamic": true
  }
}
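Sample Queries
The following queries demonstrate the $search highlight option in Atlas Search queries.
Basic Example
The following query searches for variety and bunch in the description field of the fruit collection, with the highlight option enabled.
The $project pipeline stage restricts the output to the description field and adds a new field called highlights, which contains the highlighting information.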
db.fruit.aggregate([
  {
    $search: {
      "text": {
        "path": "description",
        "query": ["variety", "bunch"]
      },
      "highlight": {
        "path": "description"
      }
    }
  },
  {
    $project: {
      "description": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])
{
  "description" : "Bananas are usually sold in bunches of five or six. ",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bananas are usually sold in ", "type" : "text" },
        { "value" : "bunches", "type" : "hit" },
        { "value" : " of five or six. ", "type" : "text" }
      ],
      "score" : 1.2841906547546387
    }
  ]
}
{
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bosc and Bartlett are the most common ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " of pears.", "type" : "text" }
      ],
      "score" : 1.2691514492034912
    }
  ]
}
{
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith. ",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Apples come in several ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : ", including Fuji, Granny Smith, and Honeycrisp. ", "type" : "text" }
      ],
      "score" : 1.0330637693405151
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "The most popular ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " are McIntosh, Gala, and Granny Smith. ", "type" : "text" }
      ],
      "score" : 1.0940992832183838
    }
  ]
}
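The search term bunch returns a match on the document with _id: 2 because its description field contains the word bunches. The search term variety returns matches on the documents with _id: 3 and _id: 1 because their description fields contain the word varieties. The singular query terms match the plural forms because the index uses the lucene.english analyzer, which applies stemming.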
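To see which terms were actually highlighted for each returned document, the hit entries can be collected from the highlights array. The following sketch uses a hypothetical hitTerms helper and abbreviated copies of two result documents from the output above.

```javascript
// Collect the text of every "hit" segment across all passages of a document.
function hitTerms(doc) {
  return doc.highlights.flatMap((h) =>
    h.texts.filter((t) => t.type === "hit").map((t) => t.value)
  );
}

// Abbreviated result documents (only the fields needed here).
const bananaResult = {
  highlights: [
    { path: "description", texts: [
      { value: "Bananas are usually sold in ", type: "text" },
      { value: "bunches", type: "hit" },
      { value: " of five or six. ", type: "text" },
    ] },
  ],
};
const appleResult = {
  highlights: [
    { path: "description", texts: [
      { value: "Apples come in several ", type: "text" },
      { value: "varieties", type: "hit" },
    ] },
    { path: "description", texts: [
      { value: "The most popular ", type: "text" },
      { value: "varieties", type: "hit" },
    ] },
  ],
};

console.log(hitTerms(bananaResult)); // terms highlighted for the query term "bunch"
console.log(hitTerms(appleResult));  // terms highlighted for the query term "variety"
```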
➤ Try this example in the Atlas Search Playground.
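Advanced Example
The following query searches for variety and bunch in the description field of the fruit collection, with the highlight option enabled, the maximum number of characters to examine set to 40, and only 1 high-scoring passage returned per document.
The $project pipeline stage restricts the output to the description field and adds a new field called highlights, which contains the highlighting information.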
db.fruit.aggregate([
  {
    $search: {
      "text": {
        "path": "description",
        "query": ["variety", "bunch"]
      },
      "highlight": {
        "path": "description",
        "maxNumPassages": 1,
        "maxCharsToExamine": 40
      }
    }
  },
  {
    $project: {
      "description": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])
{
  "description" : "Bananas are usually sold in bunches of five or six. ",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bananas are usually sold in ", "type" : "text" },
        { "value" : "bunches", "type" : "hit" },
        { "value" : " of f", "type" : "text" }
      ],
      "score" : 1.313065767288208
    }
  ]
}
{
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "highlights" : [ ]
}
{
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Apples come in several ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : ", includ", "type" : "text" }
      ],
      "score" : 0.9093900918960571
    }
  ]
}
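The second document in the results above contains an empty highlights array, even though its search field contains the search term varieties, because Atlas Search examined only 40 characters for highlighting and the term begins after that limit. Similarly, the word including is truncated to includ because Atlas Search examined only the first 40 characters of the search field for highlighting. In the third document, multiple passages contain the search term, but Atlas Search returns only one passage in the highlights results because the query requested only 1 passage per document.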
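The effect of the 40-character limit can be reproduced by looking at the first 40 characters of each description. This is only an illustration of the arithmetic, not a description of how Atlas Search implements highlighting; examinedPrefix is a hypothetical helper.

```javascript
// Return the part of a field value that falls within maxCharsToExamine.
function examinedPrefix(text, maxCharsToExamine) {
  return text.slice(0, maxCharsToExamine);
}

const descriptions = [
  "Bananas are usually sold in bunches of five or six.",
  "Bosc and Bartlett are the most common varieties of pears.",
  "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
];

for (const description of descriptions) {
  console.log(JSON.stringify(examinedPrefix(description, 40)));
}
// "Bananas are usually sold in bunches of f"
// "Bosc and Bartlett are the most common va"
// "Apples come in several varieties, includ"
```

In the pear description, varieties starts at character 39, so only "va" falls inside the examined window and no hit is produced. The other two prefixes end exactly where the returned passages end.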
➤ Try this example in the Atlas Search Playground.
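Multi-Field Example
The following query searches for varieties in the description field of the fruit collection, with the highlight option enabled for both the description and summary fields.
The $project pipeline stage adds a new field called highlights, which contains highlighting information for the query term across all fields listed in the highlight option.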
db.fruit.aggregate([
  {
    $search: {
      "text": {
        "path": "description",
        "query": "varieties"
      },
      "highlight": {
        "path": ["description", "summary" ]
      }
    }
  },
  {
    $project: {
      "description": 1,
      "summary": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])
{
  "summary" : "Pear varieties",
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "highlights" : [
    {
      "path" : "summary",
      "texts" : [
        { "value" : "Pear ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" }
      ],
      "score" : 1.3891443014144897
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bosc and Bartlett are the most common ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " of pears.", "type" : "text" }
      ],
      "score" : 1.2691514492034912
    }
  ]
}
{
  "summary" : "Apple varieties",
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "highlights" : [
    {
      "path" : "summary",
      "texts" : [
        { "value" : "Apple ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" }
      ],
      "score" : 1.3859853744506836
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "Apples come in several ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : ", including Fuji, Granny Smith, and Honeycrisp. ", "type" : "text" }
      ],
      "score" : 1.0330637693405151
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "The most popular ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " are McIntosh, Gala, and Granny Smith.", "type" : "text" }
      ],
      "score" : 1.0940992832183838
    }
  ]
}
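The search term varieties returns matches on the documents with _id: 1 and _id: 3 because the queried field, description, contains the query term varieties in both documents. In addition, the highlights array includes the summary field because that field also contains the query term varieties.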
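When highlights come back for several fields, it can be convenient to group the passages by their path before displaying them. The following is a minimal sketch with a hypothetical groupByPath helper; the input is an abbreviated copy of the highlights array of the apple document above.

```javascript
// Group highlight passages by the field (path) they came from.
function groupByPath(highlights) {
  const grouped = {};
  for (const h of highlights) {
    if (!grouped[h.path]) grouped[h.path] = [];
    grouped[h.path].push(h);
  }
  return grouped;
}

// Abbreviated highlights array (texts omitted for brevity).
const appleHighlights = [
  { path: "summary", score: 1.3859853744506836 },
  { path: "description", score: 1.0330637693405151 },
  { path: "description", score: 1.0940992832183838 },
];

const grouped = groupByPath(appleHighlights);
for (const [path, passages] of Object.entries(grouped)) {
  console.log(`${path}: ${passages.length} passage(s)`);
}
```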
➤ Try this example in the Atlas Search Playground.
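Wildcard Example
The following query searches for the term variety in fields of the fruit collection that begin with des, with the highlight option enabled for fields that begin with des.
The $project pipeline stage adds a new field called highlights, which contains the highlighting information.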
db.fruit.aggregate([
  {
    "$search": {
      "text": {
        "path": {"wildcard": "des*"},
        "query": ["variety"]
      },
      "highlight": {
        "path": {"wildcard": "des*"}
      }
    }
  },
  {
    "$project": {
      "description": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])
{
  "description" : "Bosc and Bartlett are the most common varieties of pears.",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Bosc and Bartlett are the most common ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " of pears.", "type" : "text" }
      ],
      "score" : 1.2691514492034912
    }
  ]
},
{
  "description" : "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "highlights" : [
    {
      "path" : "description",
      "texts" : [
        { "value" : "Apples come in several ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : ", including Fuji, Granny Smith, and Honeycrisp. ", "type" : "text" }
      ],
      "score" : 1.0330637693405151
    },
    {
      "path" : "description",
      "texts" : [
        { "value" : "The most popular ", "type" : "text" },
        { "value" : "varieties", "type" : "hit" },
        { "value" : " are McIntosh, Gala, and Granny Smith.", "type" : "text" }
      ],
      "score" : 1.0940992832183838
    }
  ]
}
In the Atlas Search results, the fields that begin with des are highlighted.
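To see which fields of the sample documents a pattern such as des* selects, the wildcard can be approximated with a regular expression. This is a client-side approximation for illustration only, assuming * stands for any run of characters; Atlas Search resolves wildcard paths itself, and wildcardToRegExp is a hypothetical helper.

```javascript
// Convert a wildcard pattern to an anchored regular expression.
// Regex metacharacters other than "*" are escaped; "*" becomes ".*".
function wildcardToRegExp(pattern) {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Field names of the documents in the fruit collection.
const fieldNames = ["_id", "type", "summary", "description", "category"];

const matcher = wildcardToRegExp("des*");
const selected = fieldNames.filter((name) => matcher.test(name));
console.log(selected);
```

Of the sample fields, only description begins with des, which is consistent with every passage in the output above having the path description.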
➤ Try this example in the Atlas Search Playground.
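Compound Example
The following query searches for the term organic in the category field and the term variety in the description field. The highlight option in the $search compound query requests highlighting information only for the text query against the description field. Note that the highlight option must be a child of the $search stage itself, not a child of any operator inside the $search stage.
The $project pipeline stage adds a new field called highlights, which contains the highlighting information.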
db.fruit.aggregate([
  {
    "$search": {
      "compound": {
        "should": [{
          "text": {
            "path": "category",
            "query": "organic"
          }
        },
        {
          "text": {
            "path": "description",
            "query": "variety"
          }
        }]
      },
      "highlight": {
        "path": "description"
      }
    }
  },
  {
    "$project": {
      "description": 1,
      "category": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])
[
  {
    description: 'Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.',
    category: 'organic',
    highlights: [
      {
        score: 1.0330637693405151,
        path: 'description',
        texts: [
          { value: 'Apples come in several ', type: 'text' },
          { value: 'varieties', type: 'hit' },
          {
            value: ', including Fuji, Granny Smith, and Honeycrisp. ',
            type: 'text'
          }
        ]
      },
      {
        score: 1.0940992832183838,
        path: 'description',
        texts: [
          { value: 'The most popular ', type: 'text' },
          { value: 'varieties', type: 'hit' },
          {
            value: ' are McIntosh, Gala, and Granny Smith.',
            type: 'text'
          }
        ]
      }
    ]
  },
  {
    description: 'Bosc and Bartlett are the most common varieties of pears.',
    category: 'nonorganic',
    highlights: [
      {
        score: 1.2691514492034912,
        path: 'description',
        texts: [
          {
            value: 'Bosc and Bartlett are the most common ',
            type: 'text'
          },
          { value: 'varieties', type: 'hit' },
          { value: ' of pears.', type: 'text' }
        ]
      }
    ]
  }
]
➤ Try this example in the Atlas Search Playground.
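The placement rule for highlight in compound queries is easy to get wrong when stages are built programmatically. The sketch below walks a $search stage object and reports any highlight key that is nested inside an operator instead of sitting at the top level. findMisplacedHighlights is a hypothetical client-side check, not a MongoDB API.

```javascript
// Return the paths of any "highlight" keys found below the top level of a
// $search stage body. A top-level "highlight" is the correct placement.
function findMisplacedHighlights(searchStage) {
  const misplaced = [];
  const walk = (node, path) => {
    if (Array.isArray(node)) {
      node.forEach((child, i) => walk(child, `${path}[${i}]`));
    } else if (node !== null && typeof node === "object") {
      for (const [key, value] of Object.entries(node)) {
        const here = path === "" ? key : `${path}.${key}`;
        if (key === "highlight") {
          if (path !== "") misplaced.push(here);
        } else {
          walk(value, here);
        }
      }
    }
  };
  walk(searchStage, "");
  return misplaced;
}

// Correct: highlight is a sibling of the compound operator.
const goodStage = {
  compound: { should: [{ text: { path: "description", query: "variety" } }] },
  highlight: { path: "description" },
};

// Incorrect: highlight is nested inside a text operator.
const badStage = {
  compound: {
    should: [{ text: { path: "description", query: "variety", highlight: { path: "description" } } }],
  },
};

console.log(findMisplacedHighlights(goodStage));
console.log(findMisplacedHighlights(badStage));
```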
Autocomplete Example
For this example, the fruit collection also has the following index definition.
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "description": [
        {
          "type": "autocomplete",
          "tokenization": "edgeGram",
          "minGrams": 2,
          "maxGrams": 15,
          "foldDiacritics": true
        }
      ]
    }
  }
}
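The following query searches for the characters var in the description field of the fruit collection, with the highlight option enabled for the description field.
The $project pipeline stage adds a new field called highlights, which contains the highlighting information.
Important
To highlight the autocomplete-indexed version of a path, the autocomplete operator must be the only operator that uses that path in the query.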
db.fruit.aggregate([
  {
    "$search": {
      "autocomplete": {
        "path": "description",
        "query": ["var"]
      },
      "highlight": {
        "path": "description"
      }
    }
  },
  {
    "$project": {
      "description": 1,
      "_id": 0,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])
{
  "description": "Apples come in several varieties, including Fuji, Granny Smith, and Honeycrisp. The most popular varieties are McIntosh, Gala, and Granny Smith.",
  "highlights": [
    {
      "score": 0.774385392665863,
      "path": "description",
      "texts": [
        { "value": "Apples come in several ", "type": "text" },
        { "value": "varieties, including Fuji", "type": "hit" },
        { "value": ", Granny Smith, and Honeycrisp. ", "type": "text" }
      ]
    },
    {
      "score": 0.7879307270050049,
      "path": "description",
      "texts": [
        { "value": "The most popular ", "type": "text" },
        { "value": "varieties are McIntosh", "type": "hit" },
        { "value": ", Gala, and Granny Smith.", "type": "text" }
      ]
    }
  ]
},
{
  "description": "Bosc and Bartlett are the most common varieties of pears.",
  "highlights": [
    {
      "score": 0.9964432120323181,
      "path": "description",
      "texts": [
        {
          "value": "Bosc and Bartlett are the most common ",
          "type": "text"
        },
        { "value": "varieties of pears", "type": "hit" },
        { "value": ".", "type": "text" }
      ]
    }
  ]
}
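Atlas Search returns matches on the documents with _id: 1 and _id: 3 for the query string var because the description field in those documents contains words that begin with the characters var. When a highlighted path is referenced only in the autocomplete operators of the highlighted query, Atlas Search matches highlight hits to your query terms more coarsely, which is why the hits above span several words.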
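If a display needs to emphasize only the word that the user is typing, a coarse autocomplete hit can be narrowed on the client. The following sketch is one possible post-processing step, not something Atlas Search does; narrowHit is a hypothetical helper that picks the first word in the hit that starts with the query prefix.

```javascript
// Given a coarse hit and the typed prefix, return the first word in the hit
// that starts with the prefix (case-insensitive). Falls back to the full hit.
function narrowHit(hitText, prefix) {
  const words = hitText.match(/[\p{L}\p{N}]+/gu) || [];
  const wanted = prefix.toLowerCase();
  const match = words.find((w) => w.toLowerCase().startsWith(wanted));
  return match === undefined ? hitText : match;
}

// Hit values copied from the autocomplete output above.
const coarseHits = [
  "varieties, including Fuji",
  "varieties are McIntosh",
  "varieties of pears",
];

for (const hit of coarseHits) {
  console.log(`${JSON.stringify(hit)} -> ${JSON.stringify(narrowHit(hit, "var"))}`);
}
```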
➤ Try this example in the Atlas Search Playground.