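$regexFindAll (aggregation)

Definition

$regexFindAll: Provides regular expression (regex) pattern matching capability in aggregation expressions. The operator returns an array of documents that contains information on each match. If no match is found, it returns an empty array.

Syntax

The $regexFindAll operator has the following syntax: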

{ $regexFindAll: { input: <expression> , regex: <expression>, options: <expression> } }

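Field	Description

input

The string on which to apply the regex pattern. Can be a string or any valid expression that resolves to a string.

regex

The regex pattern to apply. Can be any valid expression that resolves to either a string or a regex pattern /<pattern>/. When using the regex /<pattern>/, you can also specify the regex options i and m (but not the s or x options):

"pattern"
/<pattern>/
/<pattern>/<options>

Alternatively, you can specify the regex options with the options field. To specify the s or x options, you must use the options field.

You cannot specify options in both the regex and the options field.

options

Optional. The following <options> are available for use with regular expressions.

You cannot specify options in both the regex and the options field.

Option	Description
`i`	Case insensitivity to match both upper and lower cases. You can specify the option in the `options` field or as part of the regex field.
`m`	For patterns that include anchors (i.e. `^` for the start, `$` for the end), match at the beginning or end of each line for strings with multiline values. Without this option, these anchors match only at the beginning or end of the string. If the pattern contains no anchors or if the string value has no newline characters (e.g. `\n`), the `m` option has no effect.
`x`	"Extended" capability to ignore all white space characters in the pattern unless escaped or included in a character class. Additionally, it ignores characters in between and including an unescaped hash/pound (`#`) character and the next new line, so that you may include comments in complicated patterns. This only applies to data characters; white space characters may never appear within special character sequences in a pattern. The `x` option does not affect the handling of the VT character (i.e. code 11). You can specify the option only in the `options` field.
`s`	Allows the dot character (i.e. `.`) to match all characters including newline characters. You can specify the option only in the `options` field.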

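Returns

The operator returns an array, as follows:

If the operator does not find a match, it returns an empty array.
If the operator finds a match, it returns an array of documents that contains the following information for each match:
- the matching string in the input (match),
- the code point index (not byte index) of the matching string in the input (idx), and
- an array of the strings that correspond to the groups captured by the matching string (captures). Capturing groups are specified with unescaped parentheses () in the regex pattern.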
```
[ { "match" : <string>, "idx" : <num>, "captures" : <array of strings> }, ... ]
```
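
The return shape described above can be previewed outside the database with a rough client-side sketch in plain JavaScript. The helper name regexFindAllSketch is hypothetical and is not a MongoDB API. JavaScript's regex engine also differs from PCRE2, so treat this only as an illustration of the match, idx, and captures fields.

```javascript
// Hypothetical helper that mimics the documented return shape.
// It is NOT how MongoDB implements $regexFindAll.
function regexFindAllSketch(input, regex) {
  // String.prototype.matchAll requires the global flag; add it if missing.
  const flags = regex.flags.includes("g") ? regex.flags : regex.flags + "g";
  const globalRegex = new RegExp(regex.source, flags);
  return Array.from(input.matchAll(globalRegex), (m) => ({
    match: m[0],
    // m.index counts UTF-16 code units; count code points in the prefix instead.
    idx: Array.from(input.slice(0, m.index)).length,
    // Groups that did not participate are undefined in JavaScript; use null.
    captures: m.slice(1).map((c) => (c === undefined ? null : c)),
  }));
}

// Two matches, each with one captured group.
console.log(regexFindAllSketch("anchors, links and hyperlinks", /lin(e|k)/));
// No match: an empty array.
console.log(regexFindAllSketch("no hit here", /xyz/));
```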

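Behavior

PCRE Library

Starting in version 6.1, MongoDB uses the PCRE2 (Perl Compatible Regular Expressions) library to implement regular expression pattern matching. To learn more about PCRE2, see the PCRE documentation.

$regexFindAll and Collation

String matching for $regexFindAll is always case-sensitive and diacritic-sensitive. $regexFindAll ignores the collation specified for the collection, db.collection.aggregate(), and the index, if used.

For example, create a collection with collation strength 1, meaning the collation compares only base characters and ignores differences such as case and diacritics: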

db.createCollection( "restaurants", { collation: { locale: "fr", strength: 1 } } )

Insert the following documents:

db.restaurants.insertMany( [
   { _id: 1, category: "café", status: "Open" },
   { _id: 2, category: "cafe", status: "open" },
   { _id: 3, category: "cafE", status: "open" }
] )

The following uses the collection's collation to perform a case-insensitive and diacritic-insensitive match:

db.restaurants.aggregate( [ { $match: { category: "cafe" } } ] )

[
   { _id: 1, category: 'café', status: 'Open' },
   { _id: 2, category: 'cafe', status: 'open' },
   { _id: 3, category: 'cafE', status: 'open' }
]

However, $regexFindAll ignores collation. The following regular expression pattern matching examples are case-sensitive and diacritic-sensitive:

db.restaurants.aggregate( [
   {
      $addFields: {
         resultObject: { $regexFindAll: { input: "$category", regex: /cafe/ } }
      }
   }
] )
db.restaurants.aggregate( [
   {
      $addFields: {
         resultObject: { $regexFindAll: { input: "$category", regex: /cafe/ } }
      }
   }
],
   { collation: { locale: "fr", strength: 1 } } // Ignored in the $regexFindAll
)

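Both operations return the following:

{ "_id" : 1, "category" : "café", "status" : "Open", "resultObject" : [ ] }
{ "_id" : 2, "category" : "cafe", "status" : "open", "resultObject" : [ { "match" : "cafe", "idx" : 0, "captures" : [ ] } ] }
{ "_id" : 3, "category" : "cafE", "status" : "open", "resultObject" : [ ] }

Because the query ignores the collation, it requires an exact match on the category string (including case and accent marks), meaning only document _id: 2 is matched.

To perform a case-insensitive regex pattern match, use the i option instead. See the i Option section for an example.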
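
The contrast between a strength 1 collation and regex matching can be sketched in plain JavaScript. This is only an analogy: Intl.Collator with sensitivity "base" is used here as an assumed stand-in for collation strength 1, and JavaScript's regex engine stands in for PCRE2.

```javascript
// "caf\u00e9" is "café" written with an escape so the accent is unambiguous.
const categories = ["caf\u00e9", "cafe", "cafE"];

// sensitivity "base" ignores case and accent differences (assumed analogue
// of collation strength 1).
const collator = new Intl.Collator("fr", { sensitivity: "base" });
const collationMatches = categories.filter((c) => collator.compare(c, "cafe") === 0);

// A regex compares characters exactly unless told otherwise.
const regexMatches = categories.filter((c) => /cafe/.test(c));

console.log(collationMatches); // all three values compare equal to "cafe"
console.log(regexMatches);     // only the exact string "cafe" matches
```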

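`captures` Output Behavior

If your regex pattern contains capture groups and the pattern finds a match in the input, the captures array in the results corresponds to the groups captured by the matching string. Capture groups are specified with unescaped parentheses () in the regex pattern. The length of the captures array equals the number of capture groups in the pattern, and the order of the array matches the order in which the capture groups appear.

Create a sample collection named contacts with the following documents: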

db.contacts.insertMany([
  { "_id": 1, "fname": "Carol", "lname": "Smith", "phone": "718-555-0113" },
  { "_id": 2, "fname": "Daryl", "lname": "Doe", "phone": "212-555-8832" },
  { "_id": 3, "fname": "Polly", "lname": "Andrews", "phone": "208-555-1932" },
  { "_id": 4, "fname": "Colleen", "lname": "Duncan", "phone": "775-555-0187" },
  { "_id": 5, "fname": "Luna", "lname": "Clarke", "phone": "917-555-4414" }
])

The following pipeline applies the regex pattern /(C(ar)*)ol/ to the fname field:

db.contacts.aggregate([
  {
    $project: {
      returnObject: {
        $regexFindAll: { input: "$fname", regex: /(C(ar)*)ol/ }
      }
    }
  }
])

The regex pattern finds a match with the fname values Carol and Colleen:

{ "_id" : 1, "returnObject" : [ { "match" : "Carol", "idx" : 0, "captures" : [ "Car", "ar" ] } ] }
{ "_id" : 2, "returnObject" : [ ] }
{ "_id" : 3, "returnObject" : [ ] }
{ "_id" : 4, "returnObject" : [ { "match" : "Col", "idx" : 0, "captures" : [ "C", null ] } ] }
{ "_id" : 5, "returnObject" : [ ] }

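The pattern contains the capture group (C(ar)*), which contains the nested group (ar). The elements in the captures array correspond to the two capture groups. If a group does not capture anything in a matching document (for example, Colleen and the group (ar)), $regexFindAll replaces the group with a null placeholder.

As shown in the previous example, the captures array contains an element for each capture group (using null for non-captures). Consider the following example, which searches for phone numbers with New York City area codes by applying a logical or of capture groups to the phone field. Each group represents a New York City area code: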

db.contacts.aggregate([
  {
    $project: {
      nycContacts: {
        $regexFindAll: { input: "$phone", regex: /^(718).*|^(212).*|^(917).*/ }
      }
    }
  }
])

For documents that are matched by the regex pattern, the captures array includes the matching capture group and replaces any groups that did not capture with null:

{ "_id" : 1, "nycContacts" : [ { "match" : "718-555-0113", "idx" : 0, "captures" : [ "718", null, null ] } ] }
{ "_id" : 2, "nycContacts" : [ { "match" : "212-555-8832", "idx" : 0, "captures" : [ null, "212", null ] } ] }
{ "_id" : 3, "nycContacts" : [ ] }
{ "_id" : 4, "nycContacts" : [ ] }
{ "_id" : 5, "nycContacts" : [ { "match" : "917-555-4414", "idx" : 0, "captures" : [ null, null, "917" ] } ] }
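
To show how the null placeholders might be consumed, the following plain JavaScript sketch applies the same pattern on the client side and picks the one capture that is not null. The helper nycAreaCode is hypothetical, and JavaScript reports unmatched groups as undefined, so the sketch converts them to null to mirror the output above.

```javascript
// Same pattern as in the pipeline above.
const nycPattern = /^(718).*|^(212).*|^(917).*/;

// Hypothetical helper: returns the captures array and the matched area code,
// or null when the phone number does not match at all.
function nycAreaCode(phone) {
  const m = phone.match(nycPattern);
  if (m === null) return null;
  // Normalize groups that did not participate to null placeholders.
  const captures = m.slice(1).map((c) => (c === undefined ? null : c));
  // Only one alternative can match, so exactly one capture is non-null.
  const areaCode = captures.find((c) => c !== null);
  return { captures, areaCode };
}

console.log(nycAreaCode("212-555-8832")); // { captures: [ null, '212', null ], areaCode: '212' }
console.log(nycAreaCode("208-555-1932")); // null
```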

Examples

`$regexFindAll` and Its Options

To illustrate the behavior of the $regexFindAll operator discussed in this example, create a sample collection products with the following documents:

db.products.insertMany([
   { _id: 1, description: "Single LINE description." },
   { _id: 2, description: "First lines\nsecond line" },
   { _id: 3, description: "Many spaces before     line" },
   { _id: 4, description: "Multiple\nline descriptions" },
   { _id: 5, description: "anchors, links and hyperlinks" },
   { _id: 6, description: "métier work vocation" }
])

By default, $regexFindAll performs a case-sensitive match. For example, the following aggregation performs a case-sensitive $regexFindAll on the description field. The regex pattern /line/ does not specify any grouping:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFindAll: { input: "$description", regex: /line/ } } } }
])

The operation returns the following:

{
   "_id" : 1,
   "description" : "Single LINE description.",
   "returnObject" : [ ]
}
{
   "_id" : 2,
   "description" : "First lines\nsecond line",
   "returnObject" : [ { "match" : "line", "idx" : 6, "captures" : [ ]}, { "match" : "line", "idx" : 19, "captures" : [ ] } ]
}
{
   "_id" : 3,
   "description" : "Many spaces before     line",
   "returnObject" : [ { "match" : "line", "idx" : 23, "captures" : [ ] } ]
}
{
   "_id" : 4,
   "description" : "Multiple\nline descriptions",
   "returnObject" : [ { "match" : "line", "idx" : 9, "captures" : [ ] } ]
}
{
   "_id" : 5,
   "description" : "anchors, links and hyperlinks",
   "returnObject" : [ ]
}
{
   "_id" : 6,
   "description" : "métier work vocation",
   "returnObject" : [ ]
}

The following regex pattern /lin(e|k)/ specifies a grouping (e|k) in the pattern:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFindAll: { input: "$description", regex: /lin(e|k)/ } } } }
])

The operation returns the following:

{
   "_id" : 1,
   "description" : "Single LINE description.",
   "returnObject": [ ]
}
{
   "_id" : 2,
   "description" : "First lines\nsecond line",
   "returnObject" : [ { "match" : "line", "idx" : 6, "captures" : [ "e" ] }, { "match" : "line", "idx" : 19, "captures" : [ "e" ] } ]
}
{
   "_id" : 3,
   "description" : "Many spaces before     line",
   "returnObject" : [ { "match" : "line", "idx" : 23, "captures" : [ "e" ] } ]
}
{
   "_id" : 4,
   "description" : "Multiple\nline descriptions",
   "returnObject" : [ { "match" : "line", "idx" : 9, "captures" : [ "e" ] } ]
}
{
   "_id" : 5,
   "description" : "anchors, links and hyperlinks",
   "returnObject" : [ { "match" : "link", "idx" : 9, "captures" : [ "k" ] }, { "match" : "link", "idx" : 24, "captures" : [ "k" ] } ]
}
{
   "_id" : 6,
   "description" : "métier work vocation",
   "returnObject" : [ ]
}

In the returned results, the idx field is the code point index, not the byte index. To illustrate, consider the following example that uses the regex pattern /tier/:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFindAll: { input: "$description", regex: /tier/ } } } }
])

The operation returns the following, where only the last record matches the pattern and the returned idx is 2 (instead of 3, which a byte index would give):

{ "_id" : 1, "description" : "Single LINE description.", "returnObject" : [ ] }
{ "_id" : 2, "description" : "First lines\nsecond line", "returnObject" : [ ] }
{ "_id" : 3, "description" : "Many spaces before     line", "returnObject" : [ ] }
{ "_id" : 4, "description" : "Multiple\nline descriptions", "returnObject" : [ ] }
{ "_id" : 5, "description" : "anchors, links and hyperlinks", "returnObject" : [ ] }
{ "_id" : 6, "description" : "métier work vocation",
             "returnObject" : [ { "match" : "tier", "idx" : 2, "captures" : [ ] } ] }

`i` Option

Note

You cannot specify options in both the regex and the options field.

To perform case-insensitive pattern matching, include the i option as part of the regex field or in the options field:

// Specify i as part of the regex field
{ $regexFindAll: { input: "$description", regex: /line/i } }
// Specify i in the options field
{ $regexFindAll: { input: "$description", regex: /line/, options: "i" } }
{ $regexFindAll: { input: "$description", regex: "line", options: "i" } }

For example, the following aggregation performs a case-insensitive $regexFindAll on the description field. The regex pattern /line/ does not specify any grouping:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFindAll: { input: "$description", regex: /line/i } } } }
])

The operation returns the following documents:

{
   "_id" : 1,
   "description" : "Single LINE description.",
   "returnObject" : [ { "match" : "LINE", "idx" : 7, "captures" : [ ] } ]
}
{
   "_id" : 2,
   "description" : "First lines\nsecond line",
   "returnObject" : [ { "match" : "line", "idx" : 6, "captures" : [ ] }, { "match" : "line", "idx" : 19, "captures" : [ ] } ]
}
{
   "_id" : 3,
   "description" : "Many spaces before     line",
   "returnObject" : [ { "match" : "line", "idx" : 23, "captures" : [ ] } ]
}
{
   "_id" : 4,
   "description" : "Multiple\nline descriptions",
   "returnObject" : [ { "match" : "line", "idx" : 9, "captures" : [ ] } ]
}
{
   "_id" : 5,
   "description" : "anchors, links and hyperlinks",
   "returnObject" : [ ]
}
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : [ ] }

`m` Option

Note

You cannot specify options in both the regex and the options field.

To match the specified anchors (e.g. ^, $) for each line of a multiline string, include the m option as part of the regex field or in the options field:

// Specify m as part of the regex field
{ $regexFindAll: { input: "$description", regex: /line/m } }
// Specify m in the options field
{ $regexFindAll: { input: "$description", regex: /line/, options: "m" } }
{ $regexFindAll: { input: "$description", regex: "line", options: "m" } }

The following example includes both the i and the m options to match lines that start with either the letter s or S in multiline strings:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFindAll: { input: "$description", regex: /^s/im } } } }
])

The operation returns the following:

{
   "_id" : 1,
   "description" : "Single LINE description.",
   "returnObject" : [ { "match" : "S", "idx" : 0, "captures" : [ ] } ]
}
{
   "_id" : 2,
   "description" : "First lines\nsecond line",
   "returnObject" : [ { "match" : "s", "idx" : 12, "captures" : [ ] } ]
}
{
   "_id" : 3,
   "description" : "Many spaces before     line",
   "returnObject" : [ ]
}
{
   "_id" : 4,
   "description" : "Multiple\nline descriptions",
   "returnObject" : [ ]
}
{
   "_id" : 5,
   "description" : "anchors, links and hyperlinks",
   "returnObject" : [ ]
}
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : [ ] }
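
The effect of the m option on the ^ anchor can be previewed in plain JavaScript, whose m flag behaves similarly for this case. This is a client-side illustration only, and PCRE2 may differ from JavaScript in edge cases.

```javascript
const description = "First lines\nsecond line";

// Without m, ^ matches only at the very start of the string ("F" is not "s").
const withoutM = Array.from(description.matchAll(/^s/gi), (m) => m.index);

// With m, ^ also matches right after the newline, at the start of line two.
const withM = Array.from(description.matchAll(/^s/gim), (m) => m.index);

console.log(withoutM); // []
console.log(withM);    // [ 12 ]
```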

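`x` Option

Note

You cannot specify options in both the regex and the options field.

To ignore all unescaped white space characters and comments (denoted by the unescaped hash # character and the next new-line character) in the pattern, include the x option in the options field: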

// Specify x in the options field
{ $regexFindAll: { input: "$description", regex: /line/, options: "x" } }
{ $regexFindAll: { input: "$description", regex: "line", options: "x" } }

The following example includes the x option to skip unescaped white space and comments:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFindAll: { input: "$description", regex: /lin(e|k) # matches line or link/, options:"x" } } } }
])

The operation returns the following:

{
   "_id" : 1,
   "description" : "Single LINE description.",
   "returnObject" : [ ]
}
{
   "_id" : 2,
   "description" : "First lines\nsecond line",
   "returnObject" : [ { "match" : "line", "idx" : 6, "captures" : [ "e" ] }, { "match" : "line", "idx" : 19, "captures" : [ "e" ] } ]
}
{
   "_id" : 3,
   "description" : "Many spaces before     line",
   "returnObject" : [ { "match" : "line", "idx" : 23, "captures" : [ "e" ] } ]
}
{
   "_id" : 4,
   "description" : "Multiple\nline descriptions",
   "returnObject" : [ { "match" : "line", "idx" : 9, "captures" : [ "e" ] } ]
}
{
   "_id" : 5,
   "description" : "anchors, links and hyperlinks",
   "returnObject" : [ { "match" : "link", "idx" : 9, "captures" : [ "k" ] }, { "match" : "link", "idx" : 24, "captures" : [ "k" ] } ]
}
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : [ ] }

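`s` Option

Note

You cannot specify options in both the regex and the options field.

To allow the dot character (i.e. .) in the pattern to match all characters including the new line character, include the s option in the options field: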

// Specify s in the options field
{ $regexFindAll: { input: "$description", regex: /m.*line/, options: "s" } }
{ $regexFindAll: { input: "$description", regex: "m.*line", options: "s" } }

The following example includes the s option, which allows the dot character (i.e. .) to match all characters including new line, and the i option, which performs a case-insensitive match:

db.products.aggregate([
   { $addFields: { returnObject: { $regexFindAll: { input: "$description", regex:/m.*line/, options: "si"  } } } }
])

The operation returns the following:

{
   "_id" : 1,
   "description" : "Single LINE description.",
   "returnObject" : [ ]
}
{
   "_id" : 2,
   "description" : "First lines\nsecond line",
   "returnObject" : [ ]
}
{
   "_id" : 3,
   "description" : "Many spaces before     line",
   "returnObject" : [ { "match" : "Many spaces before     line", "idx" : 0, "captures" : [ ] } ]
}
{
   "_id" : 4,
   "description" : "Multiple\nline descriptions",
   "returnObject" : [ { "match" : "Multiple\nline", "idx" : 0, "captures" : [ ] } ]
}
{
   "_id" : 5,
   "description" : "anchors, links and hyperlinks",
   "returnObject" : [ ]
}
{ "_id" : 6, "description" : "métier work vocation", "returnObject" : [ ] }
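
JavaScript also has an s flag with a comparable meaning, so the effect can be sketched on the client side. This is an illustration only and does not run the PCRE2 engine that MongoDB uses.

```javascript
const description = "Multiple\nline descriptions";

// Without s, "." does not match the newline, so the pattern cannot
// reach "line" on the second row.
const withoutS = description.match(/m.*line/i);

// With s, "." matches the newline too, so the match spans both rows.
const withS = description.match(/m.*line/si);

console.log(withoutS);              // null
console.log(JSON.stringify(withS[0])); // "Multiple\nline"
```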

Use `$regexFindAll` to Parse Email from String

Create a sample collection feedback with the following documents:

db.feedback.insertMany([
   { "_id" : 1, comment: "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com"  },
   { "_id" : 2, comment: "I wanted to concatenate a string" },
   { "_id" : 3, comment: "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com" },
   { "_id" : 4, comment: "It's just me. I'm testing.  fred@MongoDB.com" }
])

The following aggregation uses $regexFindAll to extract all emails from the comment field (case insensitive).

db.feedback.aggregate( [
    { $addFields: {
       "email": { $regexFindAll: { input: "$comment", regex: /[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+/i } }
    } },
    { $set: { email: "$email.match"} }
] )

First Stage

The stage uses the $addFields stage to add a new field email to the document. The new field is an array that contains the result of performing $regexFindAll on the comment field:

{ "_id" : 1, "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com", "email" : [ { "match" : "aunt.arc.tica@example.com", "idx" : 38, "captures" : [ ] } ] }
{ "_id" : 2, "comment" : "I wanted to concatenate a string", "email" : [ ] }
{ "_id" : 3, "comment" : "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com", "email" : [ { "match" : "cam@mongodb.com", "idx" : 56, "captures" : [ ] }, { "match" : "c.dia@mongodb.com", "idx" : 75, "captures" : [ ] } ] }
{ "_id" : 4, "comment" : "It's just me. I'm testing.  fred@MongoDB.com", "email" : [ { "match" : "fred@MongoDB.com", "idx" : 28, "captures" : [ ] } ] }

Second Stage

The stage uses the $set stage to reset the email array elements to the "email.match" values. If the email array is empty because no match was found, the new value of email is also an empty array.

{ "_id" : 1, "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com", "email" : [ "aunt.arc.tica@example.com" ] }
{ "_id" : 2, "comment" : "I wanted to concatenate a string", "email" : [ ] }
{ "_id" : 3, "comment" : "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com", "email" : [ "cam@mongodb.com", "c.dia@mongodb.com" ] }
{ "_id" : 4, "comment" : "It's just me. I'm testing.  fred@MongoDB.com", "email" : [ "fred@MongoDB.com" ] }
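
The two stages can be mirrored in plain JavaScript to make the data flow explicit. This is a client-side sketch under the assumption that JavaScript's regex engine behaves like PCRE2 for this simple pattern; the variable names are illustrative only.

```javascript
const comment =
  "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com";
const emailPattern = /[a-z0-9_.+-]+@[a-z0-9_.+-]+\.[a-z0-9_.+-]+/gi;

// First stage analogue: one document per match.
// The input is ASCII, so m.index equals the code point index.
const email = Array.from(comment.matchAll(emailPattern), (m) => ({
  match: m[0],
  idx: m.index,
  captures: m.slice(1),
}));

// Second stage analogue: "$email.match" keeps only the match field of each element.
const matchesOnly = email.map((e) => e.match);

console.log(matchesOnly); // [ 'cam@mongodb.com', 'c.dia@mongodb.com' ]
```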

Use Captured Groupings to Parse User Name

Create a sample collection feedback with the following documents:

db.feedback.insertMany([
   { "_id" : 1, comment: "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com"  },
   { "_id" : 2, comment: "I wanted to concatenate a string" },
   { "_id" : 3, comment: "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com" },
   { "_id" : 4, comment: "It's just me. I'm testing.  fred@MongoDB.com" }
])

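To reply to the feedback, assume you want to parse the local part of each email address to use as the name in the greeting. Using the captures field returned in the $regexFindAll results, you can parse out the local part of each email address: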

db.feedback.aggregate( [
    { $addFields: {
       "names": { $regexFindAll: { input: "$comment", regex: /([a-z0-9_.+-]+)@[a-z0-9_.+-]+\.[a-z0-9_.+-]+/i } },
    } },
    { $set: { names: { $reduce: { input:  "$names.captures", initialValue: [ ], in: { $concatArrays: [ "$$value", "$$this" ] } } } } }
] )

First Stage

The stage uses the $addFields stage to add a new field names to the document. The new field contains the result of performing $regexFindAll on the comment field:

{
   "_id" : 1,
   "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com",
   "names" : [ { "match" : "aunt.arc.tica@example.com", "idx" : 38, "captures" : [ "aunt.arc.tica" ] } ]
}
{ "_id" : 2, "comment" : "I wanted to concatenate a string", "names" : [ ] }
{
   "_id" : 3,
   "comment" : "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com",
   "names" : [
      { "match" : "cam@mongodb.com", "idx" : 56, "captures" : [ "cam" ] },
      { "match" : "c.dia@mongodb.com", "idx" : 75, "captures" : [ "c.dia" ] }
    ]
}
{
   "_id" : 4,
   "comment" : "It's just me. I'm testing.  fred@MongoDB.com",
   "names" : [ { "match" : "fred@MongoDB.com", "idx" : 28, "captures" : [ "fred" ] } ]
}

Second Stage

The stage uses the $set stage with the $reduce operator to reset names to an array that contains the "$names.captures" elements.

{
   "_id" : 1,
   "comment" : "Hi, I'm just reading about MongoDB -- aunt.arc.tica@example.com",
   "names" : [ "aunt.arc.tica" ]
}
{ "_id" : 2, "comment" : "I wanted to concatenate a string", "names" : [ ] }
{
   "_id" : 3,
   "comment" : "How do I convert a date to string? Contact me at either cam@mongodb.com or c.dia@mongodb.com",
   "names" : [ "cam", "c.dia" ]
}
{
   "_id" : 4,
   "comment" : "It's just me. I'm testing.  fred@MongoDB.com",
   "names" : [ "fred" ]
}
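
The flattening performed by $reduce with $concatArrays can be sketched in plain JavaScript. The sample data below is copied from the first-stage output for _id: 3; the variable names are illustrative.

```javascript
// First-stage result for the document with _id: 3.
const names = [
  { match: "cam@mongodb.com", idx: 56, captures: ["cam"] },
  { match: "c.dia@mongodb.com", idx: 75, captures: ["c.dia"] },
];

// "$names.captures" resolves to an array of arrays, one per match.
const capturesPerMatch = names.map((n) => n.captures);

// $reduce with $concatArrays: start from [ ] and append each inner array.
const flattened = capturesPerMatch.reduce(
  (value, current) => value.concat(current),
  []
);

console.log(capturesPerMatch); // [ [ 'cam' ], [ 'c.dia' ] ]
console.log(flattened);        // [ 'cam', 'c.dia' ]
```

Starting from an empty array also means that a document without matches ends up with an empty names array, as in the output for _id: 2.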

Tip

For more information on the behavior of the captures array and additional examples, see `captures` Output Behavior.
