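Bulk Write Operations

Overview

In this guide, you can learn how to use PyMongo to perform bulk operations. Bulk operations reduce the number of calls to the server by performing multiple write operations in a single method.

The Collection and MongoClient classes both provide a bulk_write() method. When you call bulk_write() on a Collection instance, you can perform multiple write operations on a single collection. When you call bulk_write() on a MongoClient instance, you can perform bulk writes across multiple namespaces. In MongoDB, a namespace consists of the database name and the collection name in the format <database>.<collection>.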
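As a rough illustration of the namespace format, the following sketch splits a namespace string into its database and collection parts. The split_namespace helper is hypothetical and not part of PyMongo. It assumes the database name is everything before the first dot, because collection names can themselves contain dots.

```python
from typing import Tuple


def split_namespace(namespace: str) -> Tuple[str, str]:
    """Split '<database>.<collection>' at the first dot (hypothetical helper)."""
    database, separator, collection = namespace.partition(".")
    if not separator or not database or not collection:
        raise ValueError(f"expected '<database>.<collection>', got {namespace!r}")
    return database, collection


print(split_namespace("sample_restaurants.restaurants"))
# ('sample_restaurants', 'restaurants')
```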

Important

To perform bulk operations on a MongoClient instance, ensure that your application meets the following requirements:

Uses PyMongo v4.9 or later
Connects to MongoDB Server v8.0 or later
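One way to guard against an unsupported setup is to compare version numbers before calling the client-level method. The sketch below is only an illustration: supports_client_bulk_write and parse_version are hypothetical helpers, and it assumes the driver version is available as a tuple (for example from pymongo.version_tuple) and the server version as a string (for example from client.server_info()["version"]).

```python
from typing import Sequence, Tuple

MIN_DRIVER_VERSION = (4, 9)  # PyMongo v4.9, per the requirements above
MIN_SERVER_VERSION = (8, 0)  # MongoDB Server v8.0, per the requirements above


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as '8.0.4-rc1' into (8, 0, 4)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def supports_client_bulk_write(driver_version: Sequence[int], server_version: str) -> bool:
    """Hypothetical helper: check both minimum versions (major and minor only)."""
    return (
        tuple(driver_version[:2]) >= MIN_DRIVER_VERSION
        and parse_version(server_version)[:2] >= MIN_SERVER_VERSION
    )


print(supports_client_bulk_write((4, 9, 1), "8.0.4"))   # True
print(supports_client_bulk_write((4, 9, 1), "7.0.12"))  # False
```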

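Sample Data

The examples in this guide use the sample_restaurants.restaurants and sample_mflix.movies collections from the Atlas sample datasets. To learn how to create a free MongoDB Atlas cluster and load the sample datasets, see the Get Started with PyMongo tutorial.

Define the Write Operations

For each write operation you want to perform, create an instance of one of the following operation classes: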

InsertOne
UpdateOne
UpdateMany
ReplaceOne
DeleteOne
DeleteMany

Then, pass a list of these instances to the bulk_write() method.

Important

Ensure that you import the write operation classes into your application file, as shown in the following code:

from pymongo import InsertOne, UpdateOne, UpdateMany, ReplaceOne, DeleteOne, DeleteMany

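The following sections show how to create instances of the preceding classes, which you can use to perform collection and client bulk operations.

Insert Operations

To perform an insert operation, create an instance of InsertOne and specify the document you want to insert. Pass the following keyword arguments to the InsertOne constructor:

namespace: The namespace in which to insert the document. This argument is optional if you perform the bulk operation on a single collection.
document: The document to insert.

The following example creates an instance of InsertOne: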

operation = InsertOne(
    namespace="sample_restaurants.restaurants",
    document={
        "name": "Mongo's Deli",
        "cuisine": "Sandwiches",
        "borough": "Manhattan",
        "restaurant_id": "1234"
    }
)

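You can also create an instance of InsertOne by passing an instance of a custom class to the constructor. This provides additional type safety if you use a type-checking tool. The instance you pass must inherit from the TypedDict class.

Note

TypedDict in Python 3.7 and Earlier

The TypedDict class is in the typing module, which is available only in Python 3.8 and later. To use the TypedDict class in earlier versions of Python, install the typing_extensions package.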
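A common way to support both cases is a guarded import, sketched below. It assumes the typing_extensions package is installed when running on an older Python version. The Restaurant class mirrors the one used in this guide.

```python
try:
    from typing import TypedDict  # Python 3.8 and later
except ImportError:
    from typing_extensions import TypedDict  # requires the typing_extensions package


class Restaurant(TypedDict):
    name: str
    cuisine: str
    borough: str
    restaurant_id: str


restaurant = Restaurant(
    name="Mongo's Deli", cuisine="Sandwiches", borough="Manhattan", restaurant_id="1234"
)

# At runtime a TypedDict instance is a plain dict; the field types matter
# only to static type checkers.
print(type(restaurant) is dict)  # True
```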

The following example constructs an InsertOne instance by using a custom class for added type safety:

from typing import TypedDict

import pymongo

class Restaurant(TypedDict):
    name: str
    cuisine: str
    borough: str
    restaurant_id: str

operation = pymongo.InsertOne(Restaurant(
    name="Mongo's Deli", cuisine="Sandwiches", borough="Manhattan", restaurant_id="1234"))

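To insert multiple documents, create an instance of InsertOne for each document.

Note

_id Field Must Be Unique

In a MongoDB collection, each document must contain an _id field with a unique value.

If you specify a value for the _id field, you must ensure that the value is unique across the collection. If you don't specify a value, the driver automatically generates a unique ObjectId value for the field.

We recommend letting the driver automatically generate _id values to ensure uniqueness. Duplicate _id values violate unique index constraints, which causes the driver to return an error.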
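If you do set _id values yourself, a quick check before building the operations can catch duplicates inside a batch. The following is a minimal sketch with a hypothetical find_duplicate_ids helper. It assumes hashable _id values and only inspects the batch itself, so it cannot detect collisions with documents already stored in the collection.

```python
from collections import Counter
from typing import Any, Dict, Hashable, List


def find_duplicate_ids(documents: List[Dict[str, Any]]) -> List[Hashable]:
    """Return explicit _id values that occur more than once in the batch."""
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [value for value, count in counts.items() if count > 1]


batch = [
    {"_id": 1, "name": "Mongo's Deli"},
    {"_id": 2, "name": "Mongo's Pizza"},
    {"_id": 1, "name": "Mongo's Tacos"},
    {"name": "No explicit _id"},  # the driver would generate an ObjectId here
]
print(find_duplicate_ids(batch))  # [1]
```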

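Update Operations

To update a document, create an instance of UpdateOne and pass in the following arguments:

namespace: The namespace in which to perform the update. This argument is optional if you perform the bulk operation on a single collection.
filter: The query filter that specifies the criteria used to match documents in your collection.
update: The update you want to perform. For more information about update operations, see the Field Update Operators guide in the MongoDB Server manual.

UpdateOne updates the first document that matches your query filter.

The following example creates an instance of UpdateOne: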

operation = UpdateOne(
    namespace="sample_restaurants.restaurants",
    filter={ "name": "Mongo's Deli" },
    update={ "$set": { "cuisine": "Sandwiches and Salads" }}
)

To update multiple documents, create an instance of UpdateMany and pass in the same arguments. UpdateMany updates all documents that match your query filter.

The following example creates an instance of UpdateMany:

operation = UpdateMany(
    namespace="sample_restaurants.restaurants",
    filter={ "name": "Mongo's Deli" },
    update={ "$set": { "cuisine": "Sandwiches and Salads" }}
)
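To make the difference between UpdateOne and UpdateMany concrete, here is a toy in-memory model. It is not how the driver or server works internally: matches and apply_set are hypothetical helpers that support only exact equality filters and a $set-style change on top-level fields.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def matches(document: Document, query: Document) -> bool:
    """Toy matcher: exact equality on top-level fields only."""
    return all(document.get(key) == value for key, value in query.items())


def apply_set(documents: List[Document], query: Document, fields: Document, many: bool) -> int:
    """Apply a $set-style change to the first match, or to every match if many=True."""
    updated = 0
    for document in documents:
        if matches(document, query):
            document.update(fields)
            updated += 1
            if not many:
                break  # UpdateOne-like behavior: stop after the first match
    return updated


first_only = [
    {"name": "Mongo's Deli", "borough": "Manhattan"},
    {"name": "Mongo's Deli", "borough": "Brooklyn"},
]
every_match = [dict(document) for document in first_only]
query = {"name": "Mongo's Deli"}
change = {"cuisine": "Sandwiches and Salads"}

print(apply_set(first_only, query, change, many=False))  # 1
print(apply_set(every_match, query, change, many=True))  # 2
```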

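Replace Operations

A replace operation removes all fields and values of a specified document and replaces them with new ones. To perform a replace operation, create an instance of ReplaceOne and pass in the following arguments:

namespace: The namespace in which to perform the replace operation. This argument is optional if you perform the bulk operation on a single collection.
filter: The query filter that specifies the criteria used to match the document to replace.
replacement: The document that includes the new fields and values you want to store in the matching document.

The following example creates an instance of ReplaceOne: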

operation = ReplaceOne(
    namespace="sample_restaurants.restaurants",
    filter={ "restaurant_id": "1234" },
    replacement={
        "name": "Mongo's Pizza",
        "cuisine": "Pizza",
        "borough": "Brooklyn",
        "restaurant_id": "5678"
    }
)
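The effect of a replacement on a stored document can be pictured with a small in-memory sketch. This is a toy model, not driver code: replace_document is a hypothetical helper, and it reflects the fact that MongoDB keeps the existing _id value because that field is immutable.

```python
from typing import Any, Dict

Document = Dict[str, Any]


def replace_document(existing: Document, replacement: Document) -> Document:
    """Toy model: keep the _id, drop every other old field, add the new fields."""
    new_document = {"_id": existing["_id"]}
    new_document.update(replacement)
    return new_document


stored = {"_id": 42, "name": "Mongo's Deli", "cuisine": "Sandwiches", "rating": 5}
result = replace_document(stored, {"name": "Mongo's Pizza", "cuisine": "Pizza"})
print(result)
# {'_id': 42, 'name': "Mongo's Pizza", 'cuisine': 'Pizza'}
```

The rating field is gone from the result because it does not appear in the replacement document.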

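You can also create an instance of ReplaceOne by passing an instance of a custom class to the constructor. This provides additional type safety if you use a type-checking tool. The instance you pass must inherit from the TypedDict class.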

The following example constructs a ReplaceOne instance by using a custom class for added type safety:

from typing import TypedDict

import pymongo

class Restaurant(TypedDict):
    name: str
    cuisine: str
    borough: str
    restaurant_id: str

operation = pymongo.ReplaceOne(
    { "restaurant_id": "1234" },
    Restaurant(name="Mongo's Pizza", cuisine="Pizza", borough="Brooklyn", restaurant_id="5678")
)

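To replace multiple documents, you must create an instance of ReplaceOne for each document.

Tip

Type-Checking Tools

To learn more about type-checking tools available for Python, see Type Checkers on the Tools page.

Delete Operations

To delete a document, create an instance of DeleteOne and pass in the following arguments: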

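namespace: The namespace in which to delete the document. This argument is optional if you perform the bulk operation on a single collection.
filter: The query filter that specifies the criteria used to match the document to delete.

DeleteOne removes only the first document that matches your query filter.

The following example creates an instance of DeleteOne: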

operation = DeleteOne(
    namespace="sample_restaurants.restaurants",
    filter={ "restaurant_id": "5678" }
)

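To delete multiple documents, create an instance of DeleteMany and pass in a namespace and a query filter that specifies the documents you want to delete. DeleteMany removes all documents that match your query filter.

The following example creates an instance of DeleteMany: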

operation = DeleteMany(
    namespace="sample_restaurants.restaurants",
    filter={ "name": "Mongo's Deli" }
)

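Call the bulk_write() Method

After you define a class instance for each operation you want to perform, pass a list of these instances to the bulk_write() method. Call the bulk_write() method on a Collection instance to write to a single collection, or on a MongoClient instance to write to multiple namespaces.

If any of the write operations called on a Collection fail, PyMongo raises a BulkWriteError and does not perform any further operations. BulkWriteError provides a details attribute that includes the operation that failed and details about the exception.

If any of the write operations called on a MongoClient fail, PyMongo raises a ClientBulkWriteException and does not perform any further operations. ClientBulkWriteException provides an error attribute that includes information about the exception.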
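When handling a BulkWriteError, it can help to turn the details dictionary into readable lines. The sketch below uses a hypothetical summarize_write_errors helper and a hand-written sample payload. It assumes each entry under writeErrors carries index, code, and errmsg keys, which is the shape a duplicate-key failure typically has.

```python
from typing import Any, Dict, List


def summarize_write_errors(details: Dict[str, Any]) -> List[str]:
    """Format each entry of details['writeErrors'] as one readable line."""
    lines = []
    for error in details.get("writeErrors", []):
        lines.append(
            f"operation {error.get('index')}: "
            f"code {error.get('code')}: {error.get('errmsg')}"
        )
    return lines


# Hypothetical payload, shaped like a duplicate-key failure.
sample_details = {
    "writeErrors": [
        {"index": 1, "code": 11000, "errmsg": "E11000 duplicate key error"}
    ],
    "writeConcernErrors": [],
    "nInserted": 1,
}
for line in summarize_write_errors(sample_details):
    print(line)

# In an application, the dictionary would come from the exception:
#     try:
#         restaurants.bulk_write(operations)
#     except BulkWriteError as error:  # imported from pymongo.errors
#         for line in summarize_write_errors(error.details):
#             print(line)
```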

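Note

When PyMongo runs a bulk operation, it uses the write_concern of the collection or client on which the operation is running. You can also set a write concern for the operation when you use the MongoClient.bulk_write() method. The driver reports all write concern errors after attempting all operations, regardless of execution order.

To learn more about write concerns, see Write Concern in the MongoDB Server manual.

Collection Bulk Write Example

The following example performs multiple write operations on the restaurants collection by using the bulk_write() method on a Collection instance: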

operations = [
    InsertOne(
        document={
            "name": "Mongo's Deli",
            "cuisine": "Sandwiches",
            "borough": "Manhattan",
            "restaurant_id": "1234"
        }
    ),
    InsertOne(
        document={
            "name": "Mongo's Deli",
            "cuisine": "Sandwiches",
            "borough": "Brooklyn",
            "restaurant_id": "5678"
        }
    ),
    UpdateMany(
        filter={ "name": "Mongo's Deli" },
        update={ "$set": { "cuisine": "Sandwiches and Salads" }}
    ),
    DeleteOne(
        filter={ "restaurant_id": "1234" }
    )
]
results = restaurants.bulk_write(operations)
print(results)

BulkWriteResult({'writeErrors': [], 'writeConcernErrors': [], 'nInserted': 2,
'nUpserted': 0, 'nMatched': 2, 'nModified': 2, 'nRemoved': 1, 'upserted': []},
acknowledged=True)

Client Bulk Write Example

The following example performs multiple write operations on the sample_restaurants.restaurants and sample_mflix.movies namespaces by using the bulk_write() method on a MongoClient instance:

operations = [
    InsertOne(
        namespace="sample_mflix.movies",
        document={
            "title": "Minari",
            "runtime": 217,
            "genres": ["Drama", "Comedy"]
        }
    ),
    UpdateOne(
        namespace="sample_mflix.movies",
        filter={ "title": "Minari" },
        update={ "$set": { "runtime": 117 }}
    ),
    DeleteMany(
        namespace="sample_restaurants.restaurants",
        filter={ "cuisine": "French" }
    )
]
results = client.bulk_write(operations)
print(results)

ClientBulkWriteResult({'anySuccessful': True, 'error': None, 'writeErrors': [],
'writeConcernErrors': [], 'nInserted': 1, 'nUpserted': 0, 'nMatched': 1,
'nModified': 1, 'nDeleted': 344, 'insertResults': {}, 'updateResults': {},
'deleteResults': {}}, acknowledged=True, verbose=False)

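Customize Bulk Write Operations

The bulk_write() method optionally accepts additional parameters, which represent options you can use to configure the bulk write operation.

Collection Bulk Write Options

The following table describes the options you can pass to the Collection.bulk_write() method:

Property	Description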
`ordered`	If `True`, the driver performs the write operations in the order provided. If an error occurs, the remaining operations are not attempted. If `False`, the driver performs the operations in an arbitrary order and attempts to perform all operations. Defaults to `True`.
`bypass_document_validation`	Specifies whether the operation bypasses document-level validation. For more information, see Schema Validation in the MongoDB Server manual. Defaults to `False`.
`session`	An instance of `ClientSession`. For more information, see the API documentation.
`comment`	A comment to attach to the operation. For more information, see the delete command fields guide in the MongoDB Server manual.
`let`	A map of parameter names and values. Values must be constant or closed expressions that don't reference document fields. For more information, see the let statement in the MongoDB Server manual.

The following example calls the bulk_write() method from the preceding collection bulk write example, but sets the ordered option to False:

results = restaurants.bulk_write(operations, ordered=False)

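If any of the write operations in an unordered bulk write fail, PyMongo reports the errors only after attempting all operations.

Note

Unordered bulk operations do not guarantee the order of execution. The order can differ from the way you list the operations, to optimize the runtime.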
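The contrast between ordered and unordered error handling can be sketched with a toy runner. This is only a model under simplifying assumptions: run_bulk, ok, and fail are hypothetical, and the toy always runs operations in list order, whereas a real unordered bulk write may run them in any order.

```python
from typing import Callable, Dict, List, Tuple


def run_bulk(
    operations: List[Callable[[], None]], ordered: bool = True
) -> Tuple[List[int], Dict[int, str]]:
    """Toy model: return (indexes that succeeded, {index: error message})."""
    succeeded: List[int] = []
    errors: Dict[int, str] = {}
    for index, operation in enumerate(operations):
        try:
            operation()
            succeeded.append(index)
        except Exception as exc:
            errors[index] = str(exc)
            if ordered:
                break  # ordered: the remaining operations are not attempted
    return succeeded, errors


def ok() -> None:
    pass


def fail() -> None:
    raise ValueError("duplicate key")


print(run_bulk([ok, fail, ok], ordered=True))   # ([0], {1: 'duplicate key'})
print(run_bulk([ok, fail, ok], ordered=False))  # ([0, 2], {1: 'duplicate key'})
```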

Client Bulk Write Options

The following table describes the options you can pass to the MongoClient.bulk_write() method:

Property	Description
`session`	An instance of `ClientSession`. For more information, see the API documentation.
`ordered`	If `True`, the driver performs the write operations in the order provided. If an error occurs, the remaining operations are not attempted. If `False`, the driver performs the operations in an arbitrary order and attempts to perform all operations. Defaults to `True`.
`verbose_results`	Specifies whether the operation returns detailed results for each successful operation. Defaults to `False`.
`bypass_document_validation`	Specifies whether the operation bypasses document-level validation. For more information, see Schema Validation in the MongoDB Server manual. Defaults to `False`.
`comment`	A comment to attach to the operation. For more information, see the delete command fields guide in the MongoDB Server manual.
`let`	A map of parameter names and values. Values must be constant or closed expressions that don't reference document fields. For more information, see the let statement in the MongoDB Server manual.
`write_concern`	Specifies the write concern to use for the bulk operation. For more information, see Write Concern in the MongoDB Server manual.

The following example calls the bulk_write() method from the preceding client bulk write example, but sets the verbose_results option to True:

results = client.bulk_write(operations, verbose_results=True)

ClientBulkWriteResult({'anySuccessful': True, 'error': None, 'writeErrors': [],
'writeConcernErrors': [], 'nInserted': 1, 'nUpserted': 0, 'nMatched': 1, 'nModified': 1,
'nDeleted': 344, 'insertResults': {0: InsertOneResult(ObjectId('...'),
acknowledged=True)}, 'updateResults': {1: UpdateResult({'ok': 1.0, 'idx': 1, 'n': 1,
'nModified': 1}, acknowledged=True)}, 'deleteResults': {2: DeleteResult({'ok': 1.0,
'idx': 2, 'n': 344}, acknowledged=True)}}, acknowledged=True, verbose=True)

Return Values

This section describes the return values of the following bulk operation methods:

Collection.bulk_write()
MongoClient.bulk_write()

Collection Bulk Write Return Value

The Collection.bulk_write() method returns a BulkWriteResult object. The BulkWriteResult object contains the following properties:

Property	Description
`acknowledged`	Indicates if the server acknowledged the write operation.
`bulk_api_result`	The raw bulk API result returned by the server.
`deleted_count`	The number of documents deleted, if any.
`inserted_count`	The number of documents inserted, if any.
`matched_count`	The number of documents matched for an update, if applicable.
`modified_count`	The number of documents modified, if any.
`upserted_count`	The number of documents upserted, if any.
`upserted_ids`	A map of the operation's index to the `_id` of the upserted documents, if applicable.
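To relate these properties to the raw result printed in the collection bulk write example, the following sketch derives the counts from a raw dictionary. The summarize_raw_result helper is hypothetical, and the key-to-property mapping (for example nRemoved to deleted_count, and upserted entries holding index and _id keys) is an assumption based on the names in the sample output, not a statement of the driver's implementation.

```python
from typing import Any, Dict


def summarize_raw_result(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Derive count-style values from a raw bulk result (assumed key mapping)."""
    return {
        "inserted_count": raw.get("nInserted", 0),
        "matched_count": raw.get("nMatched", 0),
        "modified_count": raw.get("nModified", 0),
        "deleted_count": raw.get("nRemoved", 0),
        "upserted_count": raw.get("nUpserted", 0),
        "upserted_ids": {
            entry["index"]: entry["_id"] for entry in raw.get("upserted", [])
        },
    }


# Raw result copied from the collection bulk write example output.
raw_result = {
    "writeErrors": [], "writeConcernErrors": [], "nInserted": 2,
    "nUpserted": 0, "nMatched": 2, "nModified": 2, "nRemoved": 1, "upserted": [],
}
summary = summarize_raw_result(raw_result)
print(summary["inserted_count"], summary["deleted_count"])  # 2 1
```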

Client Bulk Write Return Value

The MongoClient.bulk_write() method returns a ClientBulkWriteResult object. The ClientBulkWriteResult object contains the following properties:

Property	Description
`acknowledged`	Indicates if the server acknowledged the write operation.
`bulk_api_result`	The raw bulk API result returned by the server.
`delete_results`	A map of any successful delete operations and their results.
`deleted_count`	The number of documents deleted, if any.
`has_verbose_results`	Indicates whether the returned results are verbose.
`insert_results`	A map of any successful insert operations and their results.
`inserted_count`	The number of documents inserted, if any.
`matched_count`	The number of documents matched for an update, if applicable.
`modified_count`	The number of documents modified, if any.
`update_results`	A map of any successful update operations and their results.
`upserted_count`	The number of documents upserted, if any.

Troubleshooting

Client Type Annotations

If you don't add a type annotation for your MongoClient object, your type checker might show an error similar to the following:

from pymongo import MongoClient
client = MongoClient()  # error: Need type annotation for "client"

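The solution is to annotate the MongoClient object as client: MongoClient or client: MongoClient[Dict[str, Any]].

Incompatible Type

If you specify MongoClient as a type hint but don't include data types for the document, keys, and values, your type checker might show an error similar to the following: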

error: Dict entry 0 has incompatible type "str": "int";
expected "Mapping[str, Any]": "int"

The solution is to add the following type hint to your MongoClient object:

client: MongoClient[Dict[str, Any]]

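Additional Information

To learn how to perform individual write operations, see the following guides:

API Documentation

To learn more about any of the methods or types discussed in this guide, see the following API documentation: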
