Aggregate Data in MongoDB Atlas - Functions

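On this page

Data Model
Snippet Setup
Run an Aggregation Pipeline
Find Data with Atlas Search
Aggregation Stages
Filter Documents
Group Documents
Project Document Fields
Add Fields to Documents
Unwind Array Values
Aggregation Framework Limitations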

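The examples on this page show how to use the MongoDB Query API in an Atlas Function to aggregate documents in your Atlas cluster.

A MongoDB aggregation pipeline runs all documents in a collection through a series of data aggregation stages. The stages let you filter and shape documents and collect summary data about groups of related documents.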
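As a rough mental model, a pipeline behaves like a chain of functions in which each stage receives the documents produced by the previous one. The following sketch mimics that flow in plain JavaScript. The helper name runStages and the sample documents are hypothetical, and the code only illustrates the data flow. It is not how MongoDB executes pipelines.

```javascript
// Hypothetical in-memory illustration: each "stage" takes an array of
// documents and returns a new array for the next stage.
function runStages(documents, stages) {
  return stages.reduce((docs, stage) => stage(docs), documents);
}

// Made-up sample documents shaped like the purchases used on this page.
const sampleDocs = [
  { customerId: "a", items: [{ name: "Baseball", quantity: 5 }] },
  { customerId: "b", items: [] },
  { customerId: "a", items: [{ name: "Baseball Bat", quantity: 1 }] },
];

const shaped = runStages(sampleDocs, [
  // Filter: keep only purchases that contain at least one item.
  (docs) => docs.filter((doc) => doc.items.length > 0),
  // Shape: keep the customer ID and replace the items with a count.
  (docs) =>
    docs.map((doc) => ({
      customerId: doc.customerId,
      numItems: doc.items.length,
    })),
]);
```

In a real pipeline, the filter step would roughly correspond to a $match stage and the shaping step to a $project stage.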

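Note

Supported Aggregation Stages

Atlas Functions support nearly every MongoDB aggregation pipeline stage and operator, but some stages and operators must run within a system function. See Aggregation Framework Limitations for more information.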

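Data Model

The examples on this page use a collection named store.purchases that contains information about historical item sales in an online store. Each document contains a list of the purchased items, including each item's name and purchased quantity, as well as a unique ID value for the customer who purchased the items.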

{
  "title": "Purchase",
  "required": ["_id", "customerId", "items"],
  "properties": {
    "_id": { "bsonType": "objectId" },
    "customerId": { "bsonType": "objectId" },
    "items": {
      "bsonType": "array",
      "items": {
        "bsonType": "object",
        "required": ["name", "quantity"],
        "properties": {
          "name": { "bsonType": "string" },
          "quantity": { "bsonType": "int" }
        }
      }
    }
  }
}

Snippet Setup

To use a code snippet in a function, you must first instantiate a MongoDB collection handle:

exports = function() {
  const mongodb = context.services.get("mongodb-atlas");
  const itemsCollection = mongodb.db("store").collection("items");
  const purchasesCollection = mongodb.db("store").collection("purchases");
  // ... paste snippet here ...
}

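Run an Aggregation Pipeline

You can run an aggregation pipeline with the collection.aggregate() method.

The following function snippet groups all documents in the purchases collection by their customerId value and counts both the number of items each customer purchased and the total number of purchases they made. After grouping the documents, the pipeline adds a new field, averageNumItemsPurchased, to each customer's document. The field holds the average number of items the customer purchases at a time.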

const pipeline = [
  { "$group": {
      "_id": "$customerId",
      "numPurchases": { "$sum": 1 },
      "numItemsPurchased": { "$sum": { "$size": "$items" } }
  } },
  { "$addFields": {
      "averageNumItemsPurchased": {
        "$divide": ["$numItemsPurchased", "$numPurchases"]
      }
  } }
]
return purchasesCollection.aggregate(pipeline).toArray()
  .then(customers => {
    console.log(`Successfully grouped purchases for ${customers.length} customers.`)
    for(const customer of customers) {
      console.log(`customer: ${customer._id}`)
      console.log(`num purchases: ${customer.numPurchases}`)
      console.log(`total items purchased: ${customer.numItemsPurchased}`)
      console.log(`average items per purchase: ${customer.averageNumItemsPurchased}`)
    }
    return customers
  })
  .catch(err => console.error(`Failed to group purchases by customer: ${err}`))
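The snippet above logs a failure and then resolves to undefined, so a caller cannot easily tell an error from a missing result. One possible variant, sketched below with async/await, rethrows the error after logging it. The helper name groupPurchasesByCustomer is hypothetical, and it assumes you pass in the purchasesCollection handle from the snippet setup.

```javascript
// Sketch: the same aggregation written with async/await.
// groupPurchasesByCustomer is a hypothetical helper name; its argument is
// assumed to be the purchasesCollection handle from the snippet setup.
async function groupPurchasesByCustomer(purchasesCollection) {
  const pipeline = [
    {
      $group: {
        _id: "$customerId",
        numPurchases: { $sum: 1 },
        numItemsPurchased: { $sum: { $size: "$items" } },
      },
    },
    {
      $addFields: {
        averageNumItemsPurchased: {
          $divide: ["$numItemsPurchased", "$numPurchases"],
        },
      },
    },
  ];
  try {
    return await purchasesCollection.aggregate(pipeline).toArray();
  } catch (err) {
    console.error(`Failed to group purchases by customer: ${err}`);
    // Rethrow so the caller can tell a failure from an empty result.
    throw err;
  }
}
```

Inside a function you might call it as `return groupPurchasesByCustomer(purchasesCollection);`.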

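Find Data with Atlas Search

You can run Atlas Search queries on a collection with collection.aggregate() and the $search aggregation stage.

Important

Atlas Functions perform $search operations as a system user and enforce field-level rules on the returned search results. This means that a user can search on a field for which they do not have read access. In this case, the search is based on the specified field, but the returned documents do not include that field.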

exports = async function searchMoviesAboutBaseball() {
  // 1. Get a reference to the collection you want to search.
  const movies = context.services
    .get("mongodb-atlas")
    .db("sample_mflix")
    .collection("movies");
  // 2. Run an aggregation with $search as the first stage.
  const baseballMovies = await movies
    .aggregate([
      {
        $search: {
          text: {
            query: "baseball",
            path: "plot",
          },
        },
      },
      {
        $limit: 5,
      },
      {
        $project: {
          _id: 0,
          title: 1,
          plot: 1,
        },
      },
    ])
    .toArray();
  return baseballMovies;
};

{
  "plot" : "A trio of guys try and make up for missed
  opportunities in childhood by forming a three-player
  baseball team to compete against standard children
  baseball squads.",
  "title" : "The Benchwarmers"
}
{
  "plot" : "A young boy is bequeathed the ownership of a
  professional baseball team.",
  "title" : "Little Big League"
}
{
  "plot" : "A trained chimpanzee plays third base for a
  minor-league baseball team.",
  "title" : "Ed"
}
{
  "plot" : "The story of the life and career of the famed
  baseball player, Lou Gehrig.",
  "title" : "The Pride of the Yankees"
}
{
  "plot" : "Babe Ruth becomes a baseball legend but is
  unheroic to those who know him.",
  "title" : "The Babe"
}

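Note

$$SEARCH_META Variable Availability

The $$SEARCH_META aggregation variable is only available to functions that run as system, or when the first role on the searched collection has its apply_when and read expressions set to true.

If neither of these two scenarios applies, $$SEARCH_META is undefined and the aggregation fails.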
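For reference, a pipeline that reads $$SEARCH_META might look like the following sketch. It assumes the Atlas Search count option on the $search stage, and the helper name buildCountedSearchPipeline and the meta output field are hypothetical. Check the Atlas Search documentation before relying on the exact option names.

```javascript
// Sketch: build a pipeline that reads $$SEARCH_META after a $search stage.
// buildCountedSearchPipeline and the "meta" field name are hypothetical.
function buildCountedSearchPipeline(query, path) {
  return [
    {
      $search: {
        text: { query, path },
        // Assumption: ask Atlas Search to compute a total result count.
        count: { type: "total" },
      },
    },
    { $limit: 5 },
    {
      $project: {
        _id: 0,
        title: 1,
        // The aggregation fails here if $$SEARCH_META is undefined
        // for the user that runs the function.
        meta: "$$SEARCH_META",
      },
    },
  ];
}

const searchMetaPipeline = buildCountedSearchPipeline("baseball", "plot");
```

You would pass searchMetaPipeline to collection.aggregate() in a function that runs as system.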

Aggregation Stages

Filter Documents

You can use the $match stage to filter incoming documents using standard MongoDB query syntax.

{
  "$match": {
    "<Field Name>": <Query Expression>,
    ...
  }
}

Example

The following $match stage filters incoming documents to include only those where the graduation_year field has a value between 2019 and 2024, inclusive.

{
  "$match": {
    "graduation_year": {
      "$gte": 2019,
      "$lte": 2024
    },
  }
}
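If the year range varies between calls, you could build the stage from parameters. The sketch below does that and also shows a plain JavaScript predicate with the same inclusive semantics. The helper names and the sample student documents are hypothetical.

```javascript
// Hypothetical helper that builds a $match stage for an inclusive year range.
function matchGraduationYears(firstYear, lastYear) {
  return {
    $match: { graduation_year: { $gte: firstYear, $lte: lastYear } },
  };
}

// Plain JavaScript equivalent of the same filter, for illustration only.
function graduatedBetween(doc, firstYear, lastYear) {
  return doc.graduation_year >= firstYear && doc.graduation_year <= lastYear;
}

const matchStage = matchGraduationYears(2019, 2024);

// Made-up sample documents.
const students = [
  { name: "A", graduation_year: 2018 },
  { name: "B", graduation_year: 2019 },
  { name: "C", graduation_year: 2024 },
  { name: "D", graduation_year: 2025 },
];
const matchedStudents = students.filter((doc) =>
  graduatedBetween(doc, 2019, 2024)
);
```

Both boundary years are kept because $gte and $lte are inclusive comparisons.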

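Group Documents

You can use the $group stage to aggregate summary data for groups of one or more documents. MongoDB groups documents based on the _id expression.

Note

You can reference a specific document field by prefixing the field name with a $.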

{
  "$group": {
    "_id": <Group By Expression>,
    "<Field Name>": <Aggregation Expression>,
    ...
  }
}

Example

The following $group stage groups documents by the value of their customerId field and counts the number of purchase documents in which each customerId appears.

{
  "$group": {
    "_id": "$customerId",
    "numPurchases": { "$sum": 1 }
  }
}
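To see what this stage computes, here is a minimal in-memory sketch of the same grouping in plain JavaScript. The function name countPurchasesByCustomer and the sample data are hypothetical. The sketch returns groups in first-seen order, which you should not assume for $group output.

```javascript
// In-memory sketch of grouping by customerId and counting documents.
// countPurchasesByCustomer is a hypothetical name used for illustration.
function countPurchasesByCustomer(purchases) {
  const counts = new Map();
  for (const purchase of purchases) {
    const current = counts.get(purchase.customerId) || 0;
    counts.set(purchase.customerId, current + 1);
  }
  // Mirror the shape of $group output: one document per distinct key.
  return [...counts].map(([_id, numPurchases]) => ({ _id, numPurchases }));
}

// Made-up sample purchases.
const groupedPurchases = countPurchasesByCustomer([
  { customerId: 24601, items: [] },
  { customerId: 9000, items: [] },
  { customerId: 24601, items: [] },
]);
```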

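Project Document Fields

You can use the $project stage to include or omit specific fields from documents or to calculate new fields using aggregation operators. To include a field, set its value to 1. To omit a field, set its value to 0.

Note

You cannot simultaneously omit and include fields other than _id. If you explicitly include a field other than _id, any fields you did not explicitly include are automatically omitted (and vice versa).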

{
  "$project": {
    "<Field Name>": <0 | 1 | Expression>,
    ...
  }
}

Example

The following $project stage omits the _id field, includes the customerId field, and creates a new field named numItems whose value is the number of documents in the items array.

{
  "$project": {
    "_id": 0,
    "customerId": 1,
    "numItems": { "$sum": { "$size": "$items" } }
  }
}
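The effect on a single document can be sketched in plain JavaScript as follows. The function name projectPurchase is hypothetical, and the sketch assumes every document has an items array.

```javascript
// In-memory sketch of the $project example: only the listed fields survive.
// projectPurchase is a hypothetical name; items is assumed to be an array.
function projectPurchase(purchase) {
  return {
    customerId: purchase.customerId,
    numItems: purchase.items.length,
  };
}

const projectedPurchase = projectPurchase({
  _id: 123,
  customerId: 24601,
  items: [
    { name: "Baseball", quantity: 5 },
    { name: "Baseball Mitt", quantity: 1 },
    { name: "Baseball Bat", quantity: 1 },
  ],
});
```

Note that neither _id nor items appears in the result, because the stage excludes _id explicitly and omits every field it does not include.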

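Add Fields to Documents

You can use the $addFields stage to add new fields with calculated values using aggregation operators.

Note

$addFields is similar to $project but does not allow you to include or omit fields.

Example

The following $addFields stage creates a new field named numItems whose value is the number of documents in the items array.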

{
  "$addFields": {
    "numItems": { "$sum": { "$size": "$items" } }
  }
}
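In contrast to $project, every existing field is kept. A plain JavaScript sketch of the same idea follows. The function name addNumItems is hypothetical, and the sketch assumes every document has an items array.

```javascript
// In-memory sketch of the $addFields example: keep all existing fields
// and add one computed field. addNumItems is a hypothetical name.
function addNumItems(purchase) {
  return { ...purchase, numItems: purchase.items.length };
}

const purchaseWithCount = addNumItems({
  _id: 123,
  customerId: 24601,
  items: [
    { name: "Baseball", quantity: 5 },
    { name: "Baseball Mitt", quantity: 1 },
  ],
});
```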

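Unwind Array Values

You can use the $unwind stage to aggregate individual elements of array fields. When you unwind an array field, MongoDB copies each document once for each element of the array field, but replaces the array value with the array element in each copy.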

{
  $unwind: {
    path: <Array Field Path>,
    includeArrayIndex: <string>,
    preserveNullAndEmptyArrays: <boolean>
  }
}

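Example

The following $unwind stage creates a new document for each element of the items array in each document. It also adds a field called itemIndex to each new document that specifies the element's position index in the original array: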

{
  "$unwind": {
    "path": "$items",
    "includeArrayIndex": "itemIndex"
  }
}

Consider the following document from the purchases collection:

{
  _id: 123,
  customerId: 24601,
  items: [
    { name: "Baseball", quantity: 5 },
    { name: "Baseball Mitt", quantity: 1 },
    { name: "Baseball Bat", quantity: 1 },
  ]
}

If you apply the example $unwind stage to this document, the stage outputs the following three documents:

{
  _id: 123,
  customerId: 24601,
  itemIndex: 0,
  items: { name: "Baseball", quantity: 5 }
}, {
  _id: 123,
  customerId: 24601,
  itemIndex: 1,
  items: { name: "Baseball Mitt", quantity: 1 }
}, {
  _id: 123,
  customerId: 24601,
  itemIndex: 2,
  items: { name: "Baseball Bat", quantity: 1 }
}
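The copy-and-replace behavior can be sketched in plain JavaScript. The function name unwindField is hypothetical, and the sketch assumes the named field is always an array. A document with an empty array produces no output, which matches the stage's behavior when preserveNullAndEmptyArrays is not set.

```javascript
// In-memory sketch of $unwind with includeArrayIndex.
// unwindField is a hypothetical name; doc[field] is assumed to be an array.
function unwindField(doc, field, indexField) {
  return doc[field].map((element, index) => ({
    ...doc, // copy every field of the original document
    [indexField]: index, // position of the element in the original array
    [field]: element, // replace the array with the single element
  }));
}

const unwoundPurchases = unwindField(
  {
    _id: 123,
    customerId: 24601,
    items: [
      { name: "Baseball", quantity: 5 },
      { name: "Baseball Mitt", quantity: 1 },
      { name: "Baseball Bat", quantity: 1 },
    ],
  },
  "items",
  "itemIndex"
);
```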

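Aggregation Framework Limitations

Aggregation Methods

Atlas Functions support aggregation at both the database and collection level.

Aggregation Pipeline Stage Availability

All aggregation pipeline stages are available to the system user except for $indexStats.

Aggregation Pipeline Operator Availability

Atlas Functions support all aggregation pipeline operators when you run an aggregation pipeline in the system user context.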
