How to Define a Custom Analyzer and Run an Atlas Search Diacritic-Insensitive Query

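This tutorial describes how to create an index that uses a custom analyzer and how to run a diacritic-insensitive query against the sample_mflix.movies collection. It takes you through the following steps:

Set up an Atlas Search index on the title and genres fields in the sample_mflix.movies collection.
Run an Atlas Search query against the title and genres fields in the sample_mflix.movies collection using the wildcard and text operators.

Before you begin, ensure that your Atlas cluster meets the requirements described in the Prerequisites.

To create an Atlas Search index, you must have Project Data Access Admin or higher access to the project.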

Create the Atlas Search Index

In this section, you create an Atlas Search index on the title and genres fields in the sample_mflix.movies collection.

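In Atlas, go to the Clusters page for your project.

If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it's not already displayed, select your desired project from the Projects menu in the navigation bar.
If it's not already displayed, click Clusters in the sidebar.
The Clusters page displays.

Go to the Atlas Search page for your cluster.

You can go to the Atlas Search page from the sidebar, the Data Explorer, or your cluster details page.

In the sidebar, click Atlas Search under the Services heading.
Note
If you have no clusters, click Create cluster to create one. To learn more, see Create a Cluster.
From the Select data source dropdown, select your cluster and click Go to Atlas Search.
The Atlas Search page displays.

Click the Browse Collections button for your cluster.
Expand the database and select the collection.
Click the Search Indexes tab for the collection.
The Atlas Search page displays.

Click your cluster's name.
Click the Atlas Search tab.
The Atlas Search page displays.

Click Create Search Index.

Start your index configuration.

Make the following selections on the page and then click Next.

Search Type	Select the Atlas Search index type.
Index Name and Data Source	Specify the following information: Index Name: `diacritic-insensitive-tutorial` Database and Collection: `sample_mflix` database, `movies` collection
Configuration Method	For a guided experience, select Visual Editor. To edit the raw index definition, select JSON Editor.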

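Specify an index definition.

The index definition for the genres and title fields specifies a custom analyzer, diacriticFolder, that uses the following:

The keyword tokenizer, which tokenizes the entire input as a single token.
The icuFolding token filter, which applies character foldings such as accent removal and case folding.

The index definition specifies the string type for the genres and title fields. It also applies the custom analyzer named diacriticFolder to the title field.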
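
To get a feel for what character folding does, the following Python sketch approximates it with the standard library. It is only an illustration: `fold` is a hypothetical helper, and Unicode NFKD decomposition plus `casefold()` covers accents and case but not every mapping that the ICU-based icuFolding filter applies.

```python
import unicodedata


def fold(text: str) -> str:
    """Approximate diacritic and case folding (not the real icuFolding filter)."""
    # Decompose characters such as "è" into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case so that "Allè" and "alle" compare equal.
    return stripped.casefold()


print(fold("Allè"))          # alle
print(fold("Allez, Eddy!"))  # allez, eddy!
```

Because the same folding applies to both the indexed titles and the query term, a search for allè and a search for alle end up comparing the same folded text.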

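Click Refine Your Index.
In the Custom Analyzers section, click Add Custom Analyzer.
Select the Create Your Own radio button and click Next.
Type diacriticFolder in the Analyzer Name field.
Expand Tokenizer if it's collapsed and select keyword from the dropdown.
Expand Token Filters and click Add token filter.
Select icuFolding from the dropdown and click Add token filter to add the token filter to your custom analyzer.
Click Add to add the custom analyzer to your index.
In the Field Mappings section, click Add Field Mapping to apply the custom analyzer on the title field in the Customized Configuration tab.
Select title from the Field Name dropdown and String from the Data Type dropdown.
In the properties section for the data type, select diacriticFolder from the Index Analyzer and Search Analyzer dropdowns.
Click Add.
Click Add Field Mapping again to index the genres field.
Select genres from the Field Name dropdown and String from the Data Type dropdown.
Click Add, then click Save Changes.

Replace the default definition with the following: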

1 {
2   "mappings": {
3     "fields": {
4       "genres": {
5         "type": "string"
6       },
7       "title": {
8         "analyzer": "diacriticFolder",
9         "type": "string"
10       }
11     }
12   },
13   "analyzers": [{
14     "charFilters": [],
15     "name": "diacriticFolder",
16     "tokenizer": {
17       "type": "keyword"
18     },
19     "tokenFilters": [{
20       "type": "icuFolding"
21     }]
22   }]
23 }
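
The same definition can also be assembled programmatically, which makes it easy to swap the tokenizer when you experiment. The following is a minimal Python sketch, assuming a hypothetical helper named `build_index_definition`; it only builds and prints the JSON and doesn't contact Atlas.

```python
import json


def build_index_definition(tokenizer: str = "keyword") -> dict:
    """Return the tutorial's index definition with the chosen tokenizer type."""
    return {
        "mappings": {
            "fields": {
                "genres": {"type": "string"},
                # Only the title field uses the custom analyzer.
                "title": {"analyzer": "diacriticFolder", "type": "string"},
            }
        },
        "analyzers": [
            {
                "charFilters": [],
                "name": "diacriticFolder",
                "tokenizer": {"type": tokenizer},
                "tokenFilters": [{"type": "icuFolding"}],
            }
        ],
    }


definition = build_index_definition()
# The printed JSON can be pasted into the JSON Editor.
print(json.dumps(definition, indent=2))
```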

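Click Next.

Click Create Search Index.

Close the You're All Set! modal window.

A modal window displays to let you know that your index is building. Click the Close button.

Wait for the index to finish building.

The index should take about one minute to build. While it builds, the Status column reads Build in Progress. When it finishes building, the Status column reads Active.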
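
If you script your setup, you could poll for the status instead of watching the page. The sketch below is a generic polling loop in Python. `get_status` is a hypothetical callable that you would have to supply yourself (this tutorial doesn't define one), and the status strings are the ones shown in the Status column.

```python
import time
from typing import Callable


def wait_until_active(get_status: Callable[[], str],
                      timeout_s: float = 120.0,
                      interval_s: float = 5.0) -> bool:
    """Poll get_status() until it returns "Active" or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "Active":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in status source for demonstration: two in-progress reports, then Active.
_statuses = iter(["Build in Progress", "Build in Progress", "Active"])
print(wait_until_active(lambda: next(_statuses), timeout_s=5, interval_s=0.01))  # True
```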

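Search the Collection

➤ Use the Select your language drop-down menu to set the language of the examples in this section.

You can use the compound operator to combine two or more operators into a single query. The sample queries in this section use the compound operator to query the title and genres fields in the movies collection with multiple operators.

In this section, you connect to your Atlas cluster and run the sample query against the sample_mflix.movies collection using the compound operator.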
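
The shape of such a compound query can be written out as a plain Python data structure. This is a sketch only: `build_pipeline` is a hypothetical helper, and it returns the pipeline without running it against a cluster.

```python
def build_pipeline(prefix: str, preferred_genre: str) -> list:
    """Build a $search + $project pipeline like the ones in this section."""
    search_stage = {
        "$search": {
            "index": "diacritic-insensitive-tutorial",
            "compound": {
                # must: only titles matching the wildcard pattern are returned.
                "must": [{
                    "wildcard": {
                        "query": f"{prefix}*",
                        "path": "title",
                        "allowAnalyzedField": True,
                    }
                }],
                # should: matching documents score higher but aren't required.
                "should": [{
                    "text": {"query": preferred_genre, "path": "genres"}
                }],
            },
        }
    }
    project_stage = {
        "$project": {
            "_id": 0,
            "title": 1,
            "genres": 1,
            "score": {"$meta": "searchScore"},
        }
    }
    return [search_stage, project_stage]


pipeline = build_pipeline("allè", "Drama")
print(pipeline[0]["$search"]["compound"]["must"][0]["wildcard"]["query"])  # allè*
```

Separating the required clause (must) from the preferred clause (should) is what lets Drama titles rank first without excluding the other matches.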

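In Atlas, go to the Clusters page for your project.

If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it's not already displayed, select your desired project from the Projects menu in the navigation bar.
If it's not already displayed, click Clusters in the sidebar.
The Clusters page displays.

Go to the Atlas Search page for your cluster.

You can go to the Atlas Search page from the sidebar, the Data Explorer, or your cluster details page.

In the sidebar, click Atlas Search under the Services heading.
Note
If you have no clusters, click Create cluster to create one. To learn more, see Create a Cluster.
From the Select data source dropdown, select your cluster and click Go to Atlas Search.
The Atlas Search page displays.

Click the Browse Collections button for your cluster.
Expand the database and select the collection.
Click the Search Indexes tab for the collection.
The Atlas Search page displays.

Click your cluster's name.
Click the Atlas Search tab.
The Atlas Search page displays.

Go to the Search Tester.

Click the Query button to the right of the index to query.

View and edit the query syntax.

Click Edit Query to view a default query syntax sample in JSON format.

Run an Atlas Search diacritic-insensitive query.

This query uses the $search stage to query the collection using the compound operator. The compound operator uses the following clauses:

A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
A should clause that uses the text operator to specify a preference for the Drama genre

Copy and paste the following query into the Query Editor, and then click the Search button in the Query Editor.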

[
  {
    "$search": {
      "index": "diacritic-insensitive-tutorial",
      "compound": {
        "must": [{
          "wildcard": {
            "query": "allè*",
            "path": "title",
            "allowAnalyzedField": true
          }
        }],
        "should": [{
          "text": {
            "query": "Drama",
            "path": "genres"
          }
        }]
      }
    }
  }
]

SCORE: 1.2084882259368896  _id:  "573a13a1f29313caabd07bb6"
  plot: "A group of hip retro teenage outsiders become involved in an interscho…"
  genres:
    0: "Drama"
    1: "Family"
    2: "Sport"
  runtime: 103
  title: "Alley Cats Strike"
SCORE: 1.179288625717163  _id:  "573a13b1f29313caabd382a2"
  plot: "Famous pianist Zetterstrèm returns home to his native Denmark, to give…"
  genres:
    0: "Drama"
    1: "Romance"
    2: "Sci-Fi"
  runtime: 88
  title: "Allegro"
SCORE: 1  _id:  "573a1397f29313caabce5f15"
  plot: "An enthusiastic filmmaker thinks he's come up with a totally original …"
  genres:
    0: "Animation"
    1: "Comedy"
    2: "Fantasy"
  runtime: 75
  title: "Allegro non troppo"
SCORE: 1  _id:  "573a13d1f29313caabd8f84b"
  plot: "The eleven year old cycling talent Freddy is the son of a butcher in a…"
  genres:
    0: "Comedy"
  runtime: 100
  title: "Allez, Eddy!"

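Expand your query results.

The Search Tester might not display all the fields in the documents it returns. To view all the fields, including the field that you specify in the query path, expand the document in the results.

The wildcard search for allè returns documents whose title field begins with alle, even when the title contains no diacritics, because the diacriticFolder custom analyzer on the title field applies character folding to its values. Atlas Search returns only documents whose title begins with the query term allè because the keyword tokenizer tokenizes the entire string (or phrase) as a single token.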
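
This behavior can be simulated locally. The following Python sketch is an approximation under stated assumptions, not a description of how Atlas Search is implemented: `fold` stands in for icuFolding using only Unicode decomposition and case folding, the whole title is treated as one token to mimic the keyword tokenizer, and `fnmatchcase` stands in for the wildcard operator. The titles are the four from the results plus "Desde allè", a title this tutorial mentions as an example.

```python
import unicodedata
from fnmatch import fnmatchcase


def fold(text: str) -> str:
    """Approximate icuFolding: strip combining marks and fold case."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def keyword_wildcard_match(title: str, pattern: str) -> bool:
    """Match the folded pattern against the whole folded title (one token)."""
    return fnmatchcase(fold(title), fold(pattern))


titles = ["Alley Cats Strike", "Allegro", "Allegro non troppo",
          "Allez, Eddy!", "Desde allè"]
matches = [t for t in titles if keyword_wildcard_match(t, "allè*")]
print(matches)
```

In this simulation, "Desde allè" is not matched because the single token for that title starts with "desde", not "alle".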

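Alternatively, you can specify the standard tokenizer instead of the keyword tokenizer in the custom analyzer for the title field. With the standard tokenizer, the Atlas Search results contain documents in which the query term allè begins the title or begins any word in the title, such as “Desde allè”. To test this, edit your index definition to replace the keyword tokenizer on line 17 with the standard tokenizer, save the index definition, and run the sample query.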
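
The difference between the two tokenizers can be sketched in Python as well. As before, this is a rough local model and not the Atlas Search implementation: splitting on non-word characters is only a stand-in for the standard tokenizer, `fold` approximates icuFolding, and "Ballet" is a made-up string included to show a non-match.

```python
import re
import unicodedata
from fnmatch import fnmatchcase


def fold(text: str) -> str:
    """Approximate icuFolding: strip combining marks and fold case."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def standard_wildcard_match(title: str, pattern: str) -> bool:
    """Match the folded pattern against each word token of the folded title."""
    # Rough stand-in for the standard tokenizer: split on non-word characters.
    tokens = [tok for tok in re.split(r"\W+", fold(title)) if tok]
    folded_pattern = fold(pattern)
    return any(fnmatchcase(tok, folded_pattern) for tok in tokens)


# "Ballet" is a hypothetical title used only to show a non-match.
for title in ["Allegro", "Desde allè", "Ballet"]:
    print(title, standard_wildcard_match(title, "allè*"))
```

Here "Desde allè" matches because its second token starts with "alle", while "Ballet" does not because the pattern has to match from the start of a token.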

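Connect to your cluster in `mongosh`.

Open mongosh in a terminal window and connect to your cluster. For detailed instructions on connecting, see Connect via mongosh.

Use the `sample_mflix` database.

Run the following command at the mongosh prompt:

use sample_mflix

Run an Atlas Search diacritic-insensitive query.

This query uses the $search stage to query the collection using the compound operator. The compound operator uses the following clauses:

A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
A should clause that uses the text operator to specify a preference for the Drama genre

The query uses the $project stage to do the following:

Exclude all fields except title and genres
Add a field named score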

db.movies.aggregate([
  {
    "$search": {
      "index": "diacritic-insensitive-tutorial",
      "compound": {
        "must": [{
          "wildcard": {
            "query": "allè*",
            "path": "title",
            "allowAnalyzedField": true
          }
        }],
        "should": [{
          "text": {
            "query": "Drama",
            "path": "genres"
          }
        }]
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "score": { "$meta": "searchScore" }
    }
  }
])

{
  genres: [ 'Drama', 'Family', 'Sport' ],
  title: 'Alley Cats Strike',
  score: 1.2084882259368896
},
{
  genres: [ 'Drama', 'Romance', 'Sci-Fi' ],
  title: 'Allegro',
  score: 1.179288625717163
},
{
  genres: [ 'Animation', 'Comedy', 'Fantasy' ],
  title: 'Allegro non troppo',
  score: 1
},
{
  genres: [ 'Comedy' ],
  title: 'Allez, Eddy!',
  score: 1
}

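Connect to your cluster in MongoDB Compass.

Open MongoDB Compass and connect to your cluster. For detailed instructions on connecting, see Connect via Compass.

Use the `movies` collection in the `sample_mflix` database.

On the Database screen, click the sample_mflix database, then click the movies collection.

Run an Atlas Search diacritic-insensitive query.

This query uses the following compound operator clauses to query the collection:

A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
A should clause that uses the text operator to specify a preference for the Drama genre

The query uses the $project stage to do the following:

Exclude all fields except title and genres
Add a field named score

To run this query in MongoDB Compass:

Click the Aggregations tab.
Click Select..., then configure each of the following pipeline stages by selecting the stage from the dropdown and adding the query for that stage. Click Add Stage to add additional stages.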

Pipeline Stage

Query

$search

{
  "index": "diacritic-insensitive-tutorial",
  "compound": {
    "must": [{
      "wildcard": {
        "path": "title",
        "query": "allè*",
        "allowAnalyzedField": true
      }
    }],
    "should": [{
      "text": {
        "query": "Drama",
        "path": "genres"
      }
    }]
  }
}

$project

{
  "_id": 0,
  "title": 1,
  "genres": 1,
  "score": {
    "$meta": "searchScore"
  }
}

If you enabled Auto Preview, MongoDB Compass displays the following documents next to the $project pipeline stage:

{
  genres: [ 'Drama', 'Family', 'Sport' ],
  title: 'Alley Cats Strike',
  score: 1.2084882259368896
},
{
  genres: [ 'Drama', 'Romance', 'Sci-Fi' ],
  title: 'Allegro',
  score: 1.179288625717163
},
{
  genres: [ 'Animation', 'Comedy', 'Fantasy' ],
  title: 'Allegro non troppo',
  score: 1
},
{
  genres: [ 'Comedy' ],
  title: 'Allez, Eddy!',
  score: 1
}

Set up and initialize the .NET/C# project for the query.

Create a new directory called diacritic-insensitive-example and initialize your project with the dotnet new command.
```
mkdir diacritic-insensitive-example
cd diacritic-insensitive-example
dotnet new console
```
Add the .NET/C# Driver to your project as a dependency.
```
dotnet add package MongoDB.Driver
```

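Create the query in your `Program.cs` file.

Replace the contents of the Program.cs file with the following code.

The code example performs the following tasks:

Imports mongodb packages and dependencies.
Establishes a connection to your Atlas cluster.
Uses the following compound operator clauses to query the collection:
- A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
- A should clause that uses the text operator to specify a preference for the Drama genre
Uses the $project stage to do the following:
- Exclude all fields except title and genres
- Add a field named score
Iterates over the cursor to print the documents that match the query.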

using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Bson.Serialization.Conventions;
using MongoDB.Driver;
using MongoDB.Driver.Search;

public class DiacriticInsensitiveExample
{
    private const string MongoConnectionString = "<connection-string>";

    public static void Main(string[] args)
    {
        // allow automapping of the camelCase database fields to our MovieDocument
        var camelCaseConvention = new ConventionPack { new CamelCaseElementNameConvention() };
        ConventionRegistry.Register("CamelCase", camelCaseConvention, type => true);

        // connect to your Atlas cluster
        var mongoClient = new MongoClient(MongoConnectionString);
        var mflixDatabase = mongoClient.GetDatabase("sample_mflix");
        var moviesCollection = mflixDatabase.GetCollection<MovieDocument>("movies");

        // define and run pipeline
        var results = moviesCollection.Aggregate()
            .Search(Builders<MovieDocument>.Search.Compound()
                .Must(Builders<MovieDocument>.Search.Wildcard(movie => movie.Title, "allè*", true))
                .Should(Builders<MovieDocument>.Search.Text(movie => movie.Genres, "Drama")),
                indexName: "diacritic-insensitive-tutorial")
            .Project<MovieDocument>(Builders<MovieDocument>.Projection
                .Include(movie => movie.Title)
                .Include(movie => movie.Genres)
                .Exclude(movie => movie.Id)
                .MetaSearchScore(movie => movie.Score))
            .ToList();

        // print results
        foreach (var movie in results)
        {
            Console.WriteLine(movie.ToJson());
        }
    }
}

[BsonIgnoreExtraElements]
public class MovieDocument
{
    [BsonIgnoreIfDefault]
    public ObjectId Id { get; set; }
    public string[] Genres { get; set; }
    public string Title { get; set; }
    public double Score { get; set; }
}

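Before you run the sample, replace <connection-string> with your Atlas connection string. Ensure that your connection string includes your database user's credentials. To learn more, see Connect via Drivers.

Compile and run the `Program.cs` file.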

dotnet run diacritic-insensitive-example.csproj

{ "genres" : ["Drama", "Family", "Sport"], "title" : "Alley Cats Strike", "score" : 1.2084882259368896 }
{ "genres" : ["Drama", "Romance", "Sci-Fi"], "title" : "Allegro", "score" : 1.1792886257171631 }
{ "genres" : ["Animation", "Comedy", "Fantasy"], "title" : "Allegro non troppo", "score" : 1.0 }
{ "genres" : ["Comedy"], "title" : "Allez, Eddy!", "score" : 1.0 }

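Run an Atlas Search diacritic-insensitive query.

Create a file named diacritic-insensitive.go.

Copy and paste the following code into the diacritic-insensitive.go file.

The code example performs the following tasks:

Imports mongodb packages and dependencies.
Establishes a connection to your Atlas cluster.
Uses the following compound operator clauses to query the collection:
- A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
- A should clause that uses the text operator to specify a preference for the Drama genre
Uses the $project stage to do the following:
- Exclude all fields except title and genres
- Add a field named score
Iterates over the cursor to print the documents that match the query.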

package main

import (
	"context"
	"fmt"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	// connect to your Atlas cluster
	client, err := mongo.Connect(context.TODO(), options.Client().ApplyURI("<connection-string>"))
	if err != nil {
		panic(err)
	}
	defer client.Disconnect(context.TODO())

	// set namespace
	collection := client.Database("sample_mflix").Collection("movies")

	// define pipeline stages
	searchStage := bson.D{{"$search", bson.M{
		"index": "diacritic-insensitive-tutorial",
		"compound": bson.M{
			"must": bson.M{
				"wildcard": bson.M{
					"path":               "title",
					"query":              "allè*",
					"allowAnalyzedField": true,
				},
			},
			"should": bson.D{
				{"text", bson.M{
					"path":  "genres",
					"query": "Drama"}}},
		},
	}}}
	projectStage := bson.D{{"$project", bson.D{{"title", 1}, {"genres", 1}, {"_id", 0}, {"score", bson.D{{"$meta", "searchScore"}}}}}}

	// run pipeline
	cursor, err := collection.Aggregate(context.TODO(), mongo.Pipeline{searchStage, projectStage})
	if err != nil {
		panic(err)
	}

	// print results
	var results []bson.D
	if err = cursor.All(context.TODO(), &results); err != nil {
		panic(err)
	}
	for _, result := range results {
		fmt.Println(result)
	}
}

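Before you run the sample, replace <connection-string> with your Atlas connection string. Ensure that your connection string includes your database user's credentials. To learn more, see Connect via Drivers.

Run the following command to query your collection: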

go run diacritic-insensitive.go

[{genres [Drama Family Sport]} {title Alley Cats Strike} {score 1.2084882259368896}]
[{genres [Drama Romance Sci-Fi]} {title Allegro} {score 1.179288625717163}]
[{genres [Animation Comedy Fantasy]} {title Allegro non troppo} {score 1}]
[{genres [Comedy]} {title Allez, Eddy!} {score 1}]

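Ensure that your `CLASSPATH` contains the following libraries.

`junit`	4.11 or higher version
`mongodb-driver-sync`	4.3.0 or higher version
`slf4j-log4j12`	1.7.30 or higher version

Run an Atlas Search diacritic-insensitive query.

Create a file named DiacriticInsensitive.java.

Copy and paste the following code into the DiacriticInsensitive.java file.

The code example performs the following tasks:

Imports mongodb packages and dependencies.
Establishes a connection to your Atlas cluster.
Uses the following compound operator clauses to query the collection:
- A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
- A should clause that uses the text operator to specify a preference for the Drama genre
Uses the $project stage to do the following:
- Exclude all fields except title and genres
- Add a field named score
Iterates over the cursor to print the documents that match the query.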

import static com.mongodb.client.model.Aggregates.project;
import static com.mongodb.client.model.Projections.*;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
import java.util.Arrays;
import java.util.List;

// The public class name must match the file name, DiacriticInsensitive.java.
public class DiacriticInsensitive {
    public static void main(String[] args) {
        // define clauses
        List<Document> mustClauses =
            List.of(new Document("wildcard",
                new Document("path", "title")
                .append("query", "allè*")
                .append("allowAnalyzedField", true)));
        List<Document> shouldClauses =
            List.of(new Document("text",
                new Document("query", "Drama")
                .append("path", "genres")));
        // define pipeline
        Document agg = new Document("$search",
            new Document("index", "diacritic-insensitive-tutorial")
            .append("compound",
                new Document("must", mustClauses)
                .append("should", shouldClauses)));

        // connect to your Atlas cluster
        String uri = "<connection-string>";

        try (MongoClient mongoClient = MongoClients.create(uri)) {
            // set namespace
            MongoDatabase database = mongoClient.getDatabase("sample_mflix");
            MongoCollection<Document> collection = database.getCollection("movies");

            // run pipeline and print results
            collection.aggregate(Arrays.asList(agg,
                project(fields(
                    excludeId(),
                    include("title"),
                    include("genres"),
                    computed("score", new Document("$meta", "searchScore"))))))
                .forEach(doc -> System.out.println(doc.toJson()));
        }
    }
}

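Note

To run the sample code in your Maven environment, add the following above the import statements in your file.

package com.mongodb.drivers;

Before you run the sample, replace <connection-string> with your Atlas connection string. Ensure that your connection string includes your database user's credentials. To learn more, see Connect via Drivers.

Compile and run the DiacriticInsensitive.java file.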

javac DiacriticInsensitive.java
java DiacriticInsensitive

{"genres": ["Drama", "Family", "Sport"], "title": "Alley Cats Strike", "score": 1.2084882259368896}
{"genres": ["Drama", "Romance", "Sci-Fi"], "title": "Allegro", "score": 1.179288625717163}
{"genres": ["Animation", "Comedy", "Fantasy"], "title": "Allegro non troppo", "score": 1.0}
{"genres": ["Comedy"], "title": "Allez, Eddy!", "score": 1.0}

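Ensure that you add the following dependency to your project.

`mongodb-driver-kotlin-coroutine`	4.10.0 or higher version

Run an Atlas Search diacritic-insensitive query.

Create a file named DiacriticInsensitive.kt.

Copy and paste the following code into the DiacriticInsensitive.kt file.

The code example performs the following tasks:

Imports mongodb packages and dependencies.
Establishes a connection to your Atlas cluster.
Uses the following compound operator clauses to query the collection:
- A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
- A should clause that uses the text operator to specify a preference for the Drama genre
Uses the $project stage to do the following:
- Exclude all fields except title and genres
- Add a field named score
Prints the documents that match the query from the AggregateFlow instance.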

import com.mongodb.client.model.Aggregates.project
import com.mongodb.client.model.Projections.*
import com.mongodb.kotlin.client.coroutine.MongoClient
import kotlinx.coroutines.runBlocking
import org.bson.Document

fun main() {
    // connect to your Atlas cluster
    val uri = "<connection-string>"
    val mongoClient = MongoClient.create(uri)

    // set namespace
    val database = mongoClient.getDatabase("sample_mflix")
    val collection = database.getCollection<Document>("movies")

    runBlocking {
        // define clauses
        val mustClauses = listOf(
            Document(
                "wildcard",
                Document("path", "title")
                    .append("query", "allè*")
                    .append("allowAnalyzedField", true)
            )
        )

        val shouldClauses = listOf(
            Document(
                "text",
                Document("query", "Drama")
                    .append("path", "genres")
            )
        )

        // define pipeline
        val agg = Document(
            "\$search",
            Document("index", "diacritic-insensitive-tutorial")
                .append(
                    "compound",
                    Document("must", mustClauses)
                        .append("should", shouldClauses)
                )
        )

        // run pipeline and print results
        val resultsFlow = collection.aggregate<Document>(
            listOf(
                agg,
                project(fields(
                    excludeId(),
                    include("title", "genres"),
                    computed("score", Document("\$meta", "searchScore"))))
            )
        )
        resultsFlow.collect { println(it) }
    }

    mongoClient.close()
}

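Before you run the sample, replace <connection-string> with your Atlas connection string. Ensure that your connection string includes your database user's credentials. To learn more, see Connect via Drivers.

Run the DiacriticInsensitive.kt file.

When you run the DiacriticInsensitive.kt program in your IDE, it prints the following documents: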

Document{{genres=[Drama, Family, Sport], title=Alley Cats Strike, score=1.2084882259368896}}
Document{{genres=[Drama, Romance, Sci-Fi], title=Allegro, score=1.179288625717163}}
Document{{genres=[Animation, Comedy, Fantasy], title=Allegro non troppo, score=1.0}}
Document{{genres=[Comedy], title=Allez, Eddy!, score=1.0}}

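Run an Atlas Search diacritic-insensitive query.

Create a file named diacritic-insensitive.js.

Copy and paste the following code into the diacritic-insensitive.js file.

The code example performs the following tasks:

Imports mongodb, MongoDB's Node.js driver.
Creates an instance of the MongoClient class to establish a connection to your Atlas cluster.
Uses the following compound operator clauses to query the collection:
- A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
- A should clause that uses the text operator to specify a preference for the Drama genre
Uses the $project stage to do the following:
- Exclude all fields except title and genres
- Add a field named score
Iterates over the cursor to print the documents that match the query.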

const { MongoClient } = require("mongodb");

// Replace the uri string with your MongoDB deployment's connection string.
const uri =
  "<connection-string>";

const client = new MongoClient(uri);

async function run() {
  try {
    await client.connect();

    // set namespace
    const database = client.db("sample_mflix");
    const coll = database.collection("movies");

    // define pipeline
    const agg = [
      {
        '$search': {
          'index': 'diacritic-insensitive-tutorial',
          'compound': {
            'must': [{
              'wildcard': {
                'query': "allè*",
                'path': "title",
                'allowAnalyzedField': true
              }
            }],
            'should': [{'text': {'query': 'Drama', 'path': 'genres'}}]
          }
        }
      },
      { '$project': { '_id': 0, 'title': 1, 'genres': 1, 'score': {'$meta': 'searchScore'}}}
    ];

    // run pipeline (aggregate returns a cursor)
    const result = coll.aggregate(agg);

    // print results
    await result.forEach((doc) => console.log(doc));

  } finally {
    await client.close();
  }
}
run().catch(console.dir);

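Before you run the sample, replace <connection-string> with your Atlas connection string. Ensure that your connection string includes your database user's credentials. To learn more, see Connect via Drivers.

Run the following command to query your collection: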

node diacritic-insensitive.js

{
  genres: [ 'Drama', 'Family', 'Sport' ],
  title: 'Alley Cats Strike',
  score: 1.2084882259368896
}
{
  genres: [ 'Drama', 'Romance', 'Sci-Fi' ],
  title: 'Allegro',
  score: 1.179288625717163
}
{
  genres: [ 'Animation', 'Comedy', 'Fantasy' ],
  title: 'Allegro non troppo',
  score: 1
}
{
  genres: [ 'Comedy' ],
  title: 'Allez, Eddy!',
  score: 1
}

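Run an Atlas Search diacritic-insensitive query.

Create a file named diacritic-insensitive.py.

Copy and paste the following code into the diacritic-insensitive.py file.

The following code example:

Imports pymongo, MongoDB's Python driver. The dns module is required to connect pymongo to Atlas using a DNS seed list connection string.
Creates an instance of the MongoClient class to establish a connection to your Atlas cluster.
Uses the following compound operator clauses to query the collection:
- A must clause that uses the wildcard operator to search for movie titles that begin with the term allè
- A should clause that uses the text operator to specify a preference for the Drama genre
Uses the $project stage to do the following:
- Exclude all fields except title and genres
- Add a field named score
Iterates over the cursor to print the documents that match the query.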

import pymongo

# connect to your Atlas cluster
client = pymongo.MongoClient('<connection-string>')

# define pipeline
pipeline = [
  {'$search': {
      'index': 'diacritic-insensitive-tutorial',
      'compound': {
        'must': [{'wildcard': {'path': 'title', 'query': 'allè*', 'allowAnalyzedField': True}}],
        'should': [{'text': {'query': 'Drama', 'path': 'genres'}}]}}},
  {'$project': {'_id': 0, 'title': 1, 'genres': 1, 'score': {'$meta': 'searchScore'}}}
]

# run pipeline
result = client['sample_mflix']['movies'].aggregate(pipeline)

# print results
for i in result:
    print(i)

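Before you run the sample, replace <connection-string> with your Atlas connection string. Ensure that your connection string includes your database user's credentials. To learn more, see Connect via Drivers.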
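
Because the connection string contains credentials, you might prefer not to paste it into the source file. One possible approach, sketched below, reads it from an environment variable. The helper `get_connection_string` and the variable name `MONGODB_URI` are assumptions made for this illustration and are not part of the tutorial's sample.

```python
import os


def get_connection_string(env_var: str = "MONGODB_URI") -> str:
    """Read the Atlas connection string from an environment variable."""
    uri = os.environ.get(env_var)
    if not uri:
        # Fail early with a clear message instead of a confusing driver error.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return uri
```

You would then pass `get_connection_string()` to `pymongo.MongoClient` in place of the '<connection-string>' literal.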

Run the following command to query your collection:

python diacritic-insensitive.py

{'genres': ['Drama', 'Family', 'Sport'], 'title': 'Alley Cats Strike', 'score': 1.2084882259368896}
{'genres': ['Drama', 'Romance', 'Sci-Fi'], 'title': 'Allegro', 'score': 1.179288625717163}
{'genres': ['Animation', 'Comedy', 'Fantasy'], 'title': 'Allegro non troppo', 'score': 1.0}
{'genres': ['Comedy'], 'title': 'Allez, Eddy!', 'score': 1.0}

