Update $text Queries to Atlas Search for Improved Search Performance

On this page

Benefits of Atlas Search Features
Examples
Learn More

If your queries rely heavily on the $text query operator, you can modify them to use the $search aggregation pipeline stage instead, which improves both their flexibility and their performance.

Benefits of Atlas Search Features

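The $search aggregation stage provides the following features. With the $text operator, each of these is either unavailable, available only with lower performance, or available only after significant implementation work by the user.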

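Language awareness
Case-insensitive and diacritic-insensitive search
Highlighting of result text
Geospatial-aware queries
Autocompletion of characters and words with different tokenization strategies
Fuzzy matching
Filtering on more than 10 strings using the compound operator
Customizable relevance scoring and sorting
Single compound index on arrays
Synonym search
Bucketing for faceted navigation
Custom analyzers
Partial matching
Phrase queries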
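
As a rough illustration of two items in this list, the sketch below builds $search stages for fuzzy matching (the text operator's fuzzy option) and for a phrase query (the phrase operator). The helper names fuzzyTextStage and phraseStage are hypothetical, and the sketch assumes an Atlas Search index that covers the plot field.

```javascript
// Hypothetical helpers that only build pipeline stages; they do not talk to a database.

// Fuzzy matching: tolerate up to `maxEdits` single-character edits in the query term.
function fuzzyTextStage(path, query, maxEdits = 1) {
  return { $search: { text: { path, query, fuzzy: { maxEdits } } } };
}

// Phrase query: match the terms in order; `slop` is the allowed distance between them.
function phraseStage(path, query, slop = 0) {
  return { $search: { phrase: { path, query, slop } } };
}

// "poit" is a deliberate misspelling of "poet" to exercise the fuzzy option.
const fuzzyPipeline = [fuzzyTextStage("plot", "poit"), { $limit: 5 }];
const phrasePipeline = [phraseStage("plot", "French Revolution"), { $limit: 5 }];

// In mongosh, against a collection with a suitable Atlas Search index:
//   db.movies.aggregate(fuzzyPipeline)
//   db.movies.aggregate(phrasePipeline)
console.log(JSON.stringify(fuzzyPipeline, null, 2));
```

Because the helpers return plain objects, the same stages can be combined with $limit, $project, or other stages as in the examples later on this page.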

Examples

Create the Indexes

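The examples in the following sections use queries against the sample_mflix.movies collection in the sample data to show how Atlas Search improves on $text in both flexibility and performance. You can run the queries from both examples using the following indexes.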

The first definition below creates the text index in mongosh; the second is the corresponding Atlas Search index definition.

db.movies.createIndex(
  {
    genres: "text",
    plot: "text",
    year: -1
  }
)

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "genres": {
        "type": "string"
      },
      "plot": {
        "type": "string"
      },
      "year": {
        "type": "number"
      }
    }
  }
}

Both index definitions index the genres and plot fields as text and the year field as a number. For instructions on creating a text index, see "Create a Text Index." For instructions on creating an Atlas Search index, see "Create an Atlas Search Index."
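
If you prefer to create the Atlas Search index from mongosh rather than from the Atlas UI, a sketch along these lines may work. It assumes a deployment and mongosh version that support db.collection.createSearchIndex(), and it assumes the index is named "default", which is the name $search uses when a query does not specify one.

```javascript
// The Atlas Search index definition from this section, as a plain object.
const searchIndexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      genres: { type: "string" },
      plot: { type: "string" },
      year: { type: "number" },
    },
  },
};

// `db` exists only inside mongosh, so this block is skipped in other JavaScript runtimes.
if (typeof db !== "undefined") {
  // Assumption: the connected deployment supports creating search indexes from the shell.
  db.movies.createSearchIndex("default", searchIndexDefinition);
}
```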

Improved Full-Text Query Flexibility with Atlas Search

Updating $text-based queries to use $search adds flexibility and convenience. The example in this section queries the sample_mflix.movies collection in the sample data for entries that contain the word poet in the plot field.

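The index definitions in the previous section already show one way in which $search is more flexible. To create the text index on sample_mflix.movies, you must first drop any existing text index on the sample data, because MongoDB allows only one text index per collection.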

In contrast, you can create multiple Atlas Search indexes on a single collection, so an application can run separate full-text queries in parallel.
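
To find an existing text index before dropping it, one option is to inspect the output of getIndexes(). The sketch below relies on text indexes appearing in that output with a key entry whose value is "text"; the helper name findTextIndexNames and the index names in the sample input are made up for illustration.

```javascript
// Return the names of all text indexes in a getIndexes()-style array.
function findTextIndexNames(indexes) {
  return indexes
    .filter((ix) => Object.values(ix.key).includes("text"))
    .map((ix) => ix.name);
}

// Illustrative input only; real output comes from db.movies.getIndexes().
const exampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { _fts: "text", _ftsx: 1 }, name: "example_text_index" },
];

console.log(findTextIndexNames(exampleIndexes)); // [ 'example_text_index' ]

// In mongosh, the names could then be dropped before creating the new text index:
//   findTextIndexNames(db.movies.getIndexes()).forEach((name) => db.movies.dropIndex(name));
```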

The following queries return five movies that have "poet" in the plot field, showing the title, genres, plot, and release year of each.

The first query below uses the text index; the second uses the Atlas Search index.

db.movies.find(
   {
     $text: { $search: "poet" }
   },
   {
     _id: 0,
     title: 1,
     genres: 1,
     plot: 1,
     year: 1
   }
).limit(5)

db.movies.aggregate([
   {
     "$search": {
       "text": {
         "path": "plot",
         "query": "poet"
       }
     }
   },
   {
     "$limit": 5
   },
   {
     "$project": {
       "_id": 0,
       "title": 1,
       "genres": 1,
       "plot": 1,
       "year": 1,
     }
   }
])

Both queries return the following results.

{
 plot: `It's the story of the murder of a poet, a man, a great film director: Pier Paolo Pasolini. The story begin with the arrest of "Pelosi", a young man then accused of the murder of the poet. ...`,
 genres: [ 'Crime', 'Drama' ],
 title: 'Who Killed Pasolini?',
 year: 1995
},
{
 plot: 'Friendship and betrayal between two poets during the French Revolution.',
 genres: [ 'Biography', 'Drama' ],
 title: 'Pandaemonium',
 year: 2000
},
{
 year: 2003,
 plot: 'Story of the relationship between the poets Ted Hughes and Sylvia Plath.',
 genres: [ 'Biography', 'Drama', 'Romance' ],
 title: 'Sylvia'
},
{
 year: 2003,
 plot: 'Story of the relationship between the poets Ted Hughes and Sylvia Plath.',
 genres: [ 'Biography', 'Drama', 'Romance' ],
 title: 'Sylvia'
},
{
 plot: 'A love-struck Italian poet is stuck in Iraq at the onset of an American invasion.',
 genres: [ 'Comedy', 'Drama', 'Romance' ],
 title: 'The Tiger and the Snow',
 year: 2005
}
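
The translation from the find() form to the aggregation form is mechanical enough to sketch as a small helper. The function name textFindToSearchPipeline is hypothetical; it covers only the simple single-term case shown above.

```javascript
// Build a $search pipeline equivalent to a simple $text find() with a projection and limit.
function textFindToSearchPipeline(term, path, projection, limit) {
  return [
    { $search: { text: { path, query: term } } },
    { $limit: limit },
    { $project: projection },
  ];
}

const poetPipeline = textFindToSearchPipeline(
  "poet",
  "plot",
  { _id: 0, title: 1, genres: 1, plot: 1, year: 1 },
  5
);

// In mongosh: db.movies.aggregate(poetPipeline)
console.log(JSON.stringify(poetPipeline, null, 2));
```

One difference to keep in mind when converting: $text searches every field covered by the text index (here genres and plot), whereas the text operator in $search looks only at the fields named in path.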

Unlike $text, Atlas Search lets you add highlights to the results, which show each match in the context where it was found. To do so, replace the Atlas Search query above with the following.

db.movies.aggregate([
  {
    "$search": {
      "text": {
        "path": "plot",
        "query": "poet"
      },
      "highlight": {
        "path": "plot"
      }
    }
  },
  {
    "$limit": 1
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

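The results of the query above include a highlights field, which contains the context in which each match occurred along with a relevance score for each. For example, the following shows the first document in the $search results with its highlights field.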

{
  plot: `It's the story of the murder of a poet, a man, a great film director: Pier Paolo Pasolini. The story begin with the arrest of "Pelosi", a young man then accused of the murder of the poet. ...`,
  genres: [ 'Crime', 'Drama' ],
  title: 'Who Killed Pasolini?',
  year: 1995,
  highlights: [
    {
      score: 1.0902210474014282,
      path: 'plot',
      texts: [
        { value: "It's the story of the murder of a ", type: 'text' },
        { value: 'poet', type: 'hit' },
        {
          value: ', a man, a great film director: Pier Paolo Pasolini. ',
          type: 'text'
        }
      ]
    },
    {
      score: 1.0202842950820923,
      path: 'plot',
      texts: [
        {
          value: 'The story begin with the arrest of "Pelosi", a young man then accused of the murder of the ',
          type: 'text'
        },
        { value: 'poet', type: 'hit' },
        { value: '. ...', type: 'text' }
      ]
    }
  ]
}
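
An application would typically turn each highlights entry into display text by wrapping the hit segments in markup. The following is a minimal sketch of that step; the function name renderHighlight and the choice of <mark> tags are illustrative, and the sketch does not escape HTML in the source text.

```javascript
// Join the segments of one highlight entry, wrapping each 'hit' segment in markers.
function renderHighlight(highlight, open = "<mark>", close = "</mark>") {
  return highlight.texts
    .map((t) => (t.type === "hit" ? open + t.value + close : t.value))
    .join("");
}

// First highlight entry from the result shown above.
const firstHighlight = {
  score: 1.0902210474014282,
  path: "plot",
  texts: [
    { value: "It's the story of the murder of a ", type: "text" },
    { value: "poet", type: "hit" },
    { value: ", a man, a great film director: Pier Paolo Pasolini. ", type: "text" },
  ],
};

console.log(renderHighlight(firstHighlight));
// It's the story of the murder of a <mark>poet</mark>, a man, a great film director: Pier Paolo Pasolini. 
```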

Improved Query Performance with Atlas Search

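In addition to flexibility and convenience, Atlas Search offers significant performance advantages over comparable $text queries. Consider a query against the sample_mflix.movies collection that retrieves movies released between 2000 and 2010, in the Comedy genre, with "poet" in the plot field.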

Run the following queries.

The first query below uses the text index; the second uses the Atlas Search index.

db.movies.aggregate([
  {
    $match: {
      year: {$gte: 2000, $lte: 2010},
      $text: { $search: "poet" },
      genres : { $eq: "Comedy" }
    }
  },
  { "$sort": { "year": 1 } },
  {
    "$limit": 3
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    },
  }
])

db.movies.aggregate([
  {
    "$search": {
      "compound": {
        "filter": [{
          "range": {
            "gte": 2000,
            "lte": 2010,
            "path": "year"
          }
        },
        {
          "text": {
            "path": "plot",
            "query": "poet"
          }
        },
        {
          "text": {
            "path": "genres",
            "query": "comedy"
          }
        }]
      }
    }
  },
  { "$sort": { "year": 1 } },
  {
    "$limit": 3
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    }
  }
])

Both queries return the following three documents.

{
  year: 2000,
  plot: 'A film poem inspired by the Peruvian poet Cèsar Vallejo. A story about our need for love, our confusion, greatness and smallness and, most of all, our vulnerability. It is a story with many...',
  genres: [ 'Comedy', 'Drama' ],
  title: 'Songs from the Second Floor'
},
{
  plot: 'When his mother, who has sheltered him his entire 40 years, dies, Elling, a sensitive, would-be poet, is sent to live in a state institution. There he meets Kjell Bjarne, a gentle giant and...',
  genres: [ 'Comedy', 'Drama' ],
  title: 'Elling',
  year: 2001
},
{
  plot: 'Heart-broken after several affairs, a woman finds herself torn between a Poet and a TV Host.',
  genres: [ 'Comedy', 'Romance', 'Drama' ],
  title: 'Easy',
  year: 2003
}

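$text is adequate for a narrow search like this one, but as the size of the dataset and the scope of the queries grow, the performance advantages of $search can make an application considerably more responsive. We recommend running full-text queries with Atlas Search through the $search aggregation pipeline stage.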
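
When the year range, genre, and search term come from user input, the compound stage shown earlier can be assembled programmatically. The sketch below is one possible shape for such a helper; the name compoundFilterStage and its parameter names are hypothetical. It places every condition in filter clauses, as the query above does, so each condition must match and none of them contributes to the relevance score.

```javascript
// Build a $search stage with compound.filter clauses from optional criteria.
function compoundFilterStage({ term, genre, fromYear, toYear }) {
  const filter = [{ range: { path: "year", gte: fromYear, lte: toYear } }];
  if (term) filter.push({ text: { path: "plot", query: term } });
  if (genre) filter.push({ text: { path: "genres", query: genre } });
  return { $search: { compound: { filter } } };
}

const comedyPoetPipeline = [
  compoundFilterStage({ term: "poet", genre: "comedy", fromYear: 2000, toYear: 2010 }),
  { $sort: { year: 1 } },
  { $limit: 3 },
  { $project: { _id: 0, title: 1, genres: 1, plot: 1, year: 1 } },
];

// In mongosh: db.movies.aggregate(comedyPoetPipeline)
// To compare plans and timings on your own data, one option is
//   db.movies.explain("executionStats").aggregate(comedyPoetPipeline)
console.log(JSON.stringify(comedyPoetPipeline[0], null, 2));
```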

Learn More

To learn more about Atlas Search queries, see "Create and Run Atlas Search Queries."
MongoDB University offers a free course on optimizing MongoDB performance. To learn more, see "Monitoring and Insights."
