Update $text Queries to Atlas Search for Improved Search Performance

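If your queries rely heavily on the $text query operator, you can modify them to use the $search aggregation pipeline stage instead, improving both the flexibility and the performance of your queries.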

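Benefits of Atlas Search

The $search aggregation stage provides the following features, which are either not available through the $text operator, available but less performant, or available only with significant implementation work on your part.

Language awareness
Case-insensitive and diacritic-insensitive search
Highlighting of result text
Geospatial-aware queries
Autocompletion of characters and words using various tokenization strategies
Fuzzy matching
Filtering on 10 or more strings using the compound operator
Customizable relevance scoring and sorting
Single compound index on arrays
Synonym search
Bucketing for faceted navigation
Custom analyzers
Partial matching
Phrase queries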

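Examples

Create the Indexes

The examples in the following sections use queries against the sample_mflix.movies collection in the sample data to demonstrate the flexibility and performance improvements that Atlas Search provides over $text. You can run the queries in both examples using the following indexes.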

Text Index

db.movies.createIndex(
  {
    genres: "text",
    plot: "text",
    year: -1
  }
)

Atlas Search Index

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "genres": {
        "type": "string"
      },
      "plot": {
        "type": "string"
      },
      "year": {
        "type": "number"
      }
    }
  }
}

Both index definitions index the genres and plot fields as text and the year field as numeric. For instructions on creating a $text index, see Create a Text Index. For instructions on creating an Atlas Search index, see Create an Atlas Search Index.
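As a rough illustration of how the Atlas Search definition above is structured, the following sketch builds a static-mapping definition from a plain field-to-type map. The helper name buildStaticMapping is hypothetical and not part of any MongoDB API. The commented-out mongosh call assumes that your deployment supports db.collection.createSearchIndex() and that you want the index name "default".

```javascript
// Hypothetical helper (not part of any MongoDB API): builds a static-mapping
// Atlas Search index definition from a { fieldName: type } map.
function buildStaticMapping(fieldTypes) {
  const fields = {};
  for (const [name, type] of Object.entries(fieldTypes)) {
    fields[name] = { type };
  }
  // "dynamic": false indexes only the fields listed explicitly.
  return { mappings: { dynamic: false, fields } };
}

const definition = buildStaticMapping({
  genres: "string",
  plot: "string",
  year: "number",
});

// In mongosh, a definition like this could then be passed along, for example:
//   db.movies.createSearchIndex("default", definition)
console.log(JSON.stringify(definition, null, 2));
```

The printed object should have the same shape as the Atlas Search index definition shown in this section.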

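Improve the Flexibility of Full-Text Queries with Atlas Search

You can update your $text-based queries to use $search for added flexibility and convenience. This example queries the sample_mflix.movies collection in the sample data to retrieve entries with the word 'poet' in the plot field.

The index definitions listed in the previous section illustrate one of the flexibility improvements of $search. To create the $text index on sample_mflix.movies, you must first drop any existing text index on the sample data, because MongoDB supports only one text index per collection.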
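One way to locate an existing text index before dropping it is to inspect the output of db.collection.getIndexes(), where text indexes report "_fts": "text" in their key document. The sketch below assumes that output shape; the helper name findTextIndexNames and the sample index name are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: given the array returned by db.collection.getIndexes(),
// return the names of any text indexes. Text indexes report "_fts": "text"
// in their key document.
function findTextIndexNames(indexes) {
  return indexes
    .filter((index) => Object.values(index.key).includes("text"))
    .map((index) => index.name);
}

// Sample input shaped like getIndexes() output (illustrative values only).
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  {
    v: 2,
    key: { _fts: "text", _ftsx: 1 },
    name: "example_text_index",
    weights: { plot: 1 },
  },
];

console.log(findTextIndexNames(sampleIndexes));

// In mongosh, the same helper could drive the cleanup:
//   findTextIndexNames(db.movies.getIndexes())
//     .forEach((name) => db.movies.dropIndex(name))
```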

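In contrast, you can create multiple Atlas Search indexes on a single collection, which lets your application use distinct full-text queries in parallel.

The following queries return five movies with 'poet' in the plot field and display their titles, genres, plots, and release years.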

Text Index

db.movies.find(
   {
     $text: { $search: "poet" }
   },
   {
     _id: 0,
     title: 1,
     genres: 1,
     plot: 1,
     year: 1
   }
).limit(5)

Atlas Search Index

db.movies.aggregate([
   {
     "$search": {
       "text": {
         "path": "plot",
         "query": "poet"
       }
     }
   },
   {
     "$limit": 5
   },
   {
     "$project": {
       "_id": 0,
       "title": 1,
       "genres": 1,
       "plot": 1,
       "year": 1,
     }
   }
])

Both queries return the following results.

{
 plot: `It's the story of the murder of a poet, a man, a great film director: Pier Paolo Pasolini. The story begin with the arrest of "Pelosi", a young man then accused of the murder of the poet. ...`,
 genres: [ 'Crime', 'Drama' ],
 title: 'Who Killed Pasolini?',
 year: 1995
},
{
 plot: 'Friendship and betrayal between two poets during the French Revolution.',
 genres: [ 'Biography', 'Drama' ],
 title: 'Pandaemonium',
 year: 2000
},
{
 year: 2003,
 plot: 'Story of the relationship between the poets Ted Hughes and Sylvia Plath.',
 genres: [ 'Biography', 'Drama', 'Romance' ],
 title: 'Sylvia'
},
{
 year: 2003,
 plot: 'Story of the relationship between the poets Ted Hughes and Sylvia Plath.',
 genres: [ 'Biography', 'Drama', 'Romance' ],
 title: 'Sylvia'
},
{
 plot: 'A love-struck Italian poet is stuck in Iraq at the onset of an American invasion.',
 genres: [ 'Comedy', 'Drama', 'Romance' ],
 title: 'The Tiger and the Snow',
 year: 2005
}

A feature unique to Atlas Search is highlighting, which you can add to the results to display each match in the context in which it was found. To do so, replace the Atlas Search query above with the following.

db.movies.aggregate([
  {
    "$search": {
      "text": {
        "path": "plot",
        "query": "poet"
      },
      "highlight": {
        "path": "plot"
      }
    }
  },
  {
    "$limit": 1
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1,
      "highlights": { "$meta": "searchHighlights" }
    }
  }
])

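The results of the query above include a highlights field, which contains the context in which each match occurred along with its relevance score. For example, the following shows the highlights field for the first document in the $search results.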

{
  plot: `It's the story of the murder of a poet, a man, a great film director: Pier Paolo Pasolini. The story begin with the arrest of "Pelosi", a young man then accused of the murder of the poet. ...`,
  genres: [ 'Crime', 'Drama' ],
  title: 'Who Killed Pasolini?',
  year: 1995,
  highlights: [
    {
      score: 1.0902210474014282,
      path: 'plot',
      texts: [
        { value: "It's the story of the murder of a ", type: 'text' },
        { value: 'poet', type: 'hit' },
        {
          value: ', a man, a great film director: Pier Paolo Pasolini. ',
          type: 'text'
        }
      ]
    },
    {
      score: 1.0202842950820923,
      path: 'plot',
      texts: [
        {
          value: 'The story begin with the arrest of "Pelosi", a young man then accused of the murder of the ',
          type: 'text'
        },
        { value: 'poet', type: 'hit' },
        { value: '. ...', type: 'text' }
      ]
    }
  ]
}
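An application typically turns each highlights entry into display text by concatenating its texts array and marking the entries whose type is 'hit'. The following is a minimal sketch of that idea, assuming the result shape shown above; the helper name renderHighlight and the choice of <mark> tags are hypothetical.

```javascript
// Hypothetical helper: joins the "texts" of one highlight entry into a single
// string, wrapping each "hit" in the given markers.
function renderHighlight(highlight, open = "<mark>", close = "</mark>") {
  return highlight.texts
    .map((t) => (t.type === "hit" ? open + t.value + close : t.value))
    .join("");
}

// First highlight entry from the result shown above.
const firstHighlight = {
  score: 1.0902210474014282,
  path: "plot",
  texts: [
    { value: "It's the story of the murder of a ", type: "text" },
    { value: "poet", type: "hit" },
    {
      value: ", a man, a great film director: Pier Paolo Pasolini. ",
      type: "text",
    },
  ],
};

console.log(renderHighlight(firstHighlight));
```

This sketch does not escape HTML in the surrounding text, so a real application would need to sanitize the values before rendering them in a page.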

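Improve Query Performance with Atlas Search

Beyond greater flexibility and convenience, Atlas Search offers significant performance advantages over comparable $text queries. Suppose you want to query the sample_mflix.movies collection for comedies released between 2000 and 2010 that have 'poet' in the plot field.

Run the following queries.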

Text Index

db.movies.aggregate([
  {
    $match: {
      year: {$gte: 2000, $lte: 2010},
      $text: { $search: "poet" },
      genres : { $eq: "Comedy" }
    }
  },
  { "$sort": { "year": 1 } },
  {
    "$limit": 3
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    }
  }
])

Atlas Search Index

db.movies.aggregate([
  {
    "$search": {
      "compound": {
        "filter": [{
          "range": {
            "gte": 2000,
            "lte": 2010,
            "path": "year"
          }
        },
        {
          "text": {
            "path": "plot",
            "query": "poet"
          }
        },
        {
          "text": {
            "path": "genres",
            "query": "comedy"
          }
        }]
      }
    }
  },
  { "$sort": { "year": 1 } },
  {
    "$limit": 3
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "genres": 1,
      "plot": 1,
      "year": 1
    }
  }
])

Both queries return the following three documents.

{
  year: 2000,
  plot: 'A film poem inspired by the Peruvian poet Cèsar Vallejo. A story about our need for love, our confusion, greatness and smallness and, most of all, our vulnerability. It is a story with many...',
  genres: [ 'Comedy', 'Drama' ],
  title: 'Songs from the Second Floor'
},
{
  plot: 'When his mother, who has sheltered him his entire 40 years, dies, Elling, a sensitive, would-be poet, is sent to live in a state institution. There he meets Kjell Bjarne, a gentle giant and...',
  genres: [ 'Comedy', 'Drama' ],
  title: 'Elling',
  year: 2001
},
{
  plot: 'Heart-broken after several affairs, a woman finds herself torn between a Poet and a TV Host.',
  genres: [ 'Comedy', 'Romance', 'Drama' ],
  title: 'Easy',
  year: 2003
}

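$text is adequate for simple, narrow searches like this one, but as your data sets grow and your queries broaden, the performance advantages of $search significantly improve your application's responsiveness. We recommend that you run Atlas Search queries through the $search aggregation pipeline stage.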
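When you migrate several $text queries, it can help to assemble the compound filter clauses programmatically. The following sketch mirrors the range and text clauses of the $search query in this section. The helper name buildFilterSearchStage and its argument shape are hypothetical, not part of any MongoDB API.

```javascript
// Hypothetical helper: assembles a $search stage whose compound.filter clauses
// mirror the range and text conditions used in the example in this section.
function buildFilterSearchStage({ range, texts }) {
  const filter = [];
  if (range) {
    filter.push({
      range: { path: range.path, gte: range.gte, lte: range.lte },
    });
  }
  // One text clause per { path: query } pair.
  for (const [path, query] of Object.entries(texts)) {
    filter.push({ text: { path, query } });
  }
  return { $search: { compound: { filter } } };
}

const stage = buildFilterSearchStage({
  range: { path: "year", gte: 2000, lte: 2010 },
  texts: { plot: "poet", genres: "comedy" },
});

// In mongosh, the stage could start a pipeline such as:
//   db.movies.aggregate([stage, { $sort: { year: 1 } }, { $limit: 3 }])
console.log(JSON.stringify(stage, null, 2));
```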

Learn More

To learn more about Atlas Search queries, see Create and Run Atlas Search Queries.
MongoDB University offers a free course on optimizing MongoDB performance. To learn more, see Monitoring and Insights.
