Collations
Collations are available in MongoDB 3.4 and later.
Overview
This guide shows you how to use collations, a set of sorting rules, to run operations using string ordering for specific languages and locales (a community or region that shares common language idioms).
MongoDB sorts strings using binary collation by default. This collation method uses the ASCII standard character values to compare and order strings. Languages and locales have specific character ordering conventions that differ from the ASCII standard.
For example, in Canadian French, the right-most accented character determines the ordering for strings when the other characters are the same. Consider the following French words: cote, coté, côte, and côté.
MongoDB sorts them in the following order using the default binary collation:
cote coté côte côté
MongoDB sorts them in the following order using the Canadian French collation:
cote côte coté côté
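To see the difference outside the database, the following sketch sorts the same four words in plain JavaScript. It is an illustration only: Intl.Collator is the JavaScript runtime's collator, not MongoDB's, and the helper name binaryCompare is an assumption made for this example. The locale-aware result depends on the ICU data bundled with your runtime, so no output is claimed for it here.

```javascript
const words = ["côté", "coté", "côte", "cote"];

// Compare by UTF-16 code unit, which orders these words the same way
// a binary comparison does.
function binaryCompare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

const binaryOrder = [...words].sort(binaryCompare);
console.log(binaryOrder.join(" ")); // cote coté côte côté

// Locale-aware order using the runtime's Canadian French rules.
// The result depends on the ICU data available to the runtime.
const canadianFrench = new Intl.Collator("fr-CA");
const localeOrder = [...words].sort(canadianFrench.compare);
console.log(localeOrder.join(" "));
```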
Usage
You can specify a collation when you create a new collection or new index. You can also specify a collation for CRUD operations and aggregations.
When you create a new collection with a collation, you define the default collation for any of the operations that support collation called on that collection. You can override the collation for an operation by specifying a different one.
Note
Currently, you cannot create a collation on an existing collection. To use collations with an existing collection, create an index with the collation and specify the same collation in your operations on it.
When you create an index with a collation, you specify the sort order for operations that use that index. To use the collation in the index, you must provide a matching collation in the operation, and the operation must use the index. While most index types support collation, the following types support only binary comparison: text, 2d, and geoHaystack.
Collation Parameters
The collation object contains the following parameters:
collation: {
  locale: <string>,
  caseLevel: <bool>,
  caseFirst: <string>,
  strength: <int>,
  numericOrdering: <bool>,
  alternate: <string>,
  maxVariable: <string>,
  backwards: <bool>
}
You must specify the locale field in the collation; all other fields are optional. For a complete list of supported locales and the default values for the locale fields, see Supported Languages and Locales. For descriptions of each field, see the Collation Document MongoDB manual entry.
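Because only locale is required, a small guard can catch a missing locale before the collation document reaches the server. The sketch below is hypothetical: makeCollation is not part of the driver, and it checks only the one rule stated above.

```javascript
// Hypothetical helper, not a driver API: builds a collation document
// and rejects input that omits the required `locale` field.
function makeCollation(locale, options = {}) {
  if (typeof locale !== "string" || locale.length === 0) {
    throw new TypeError("collation requires a non-empty locale string");
  }
  // Optional fields pass through unchanged; `locale` is always set last
  // so that it cannot be overwritten by `options`.
  return { ...options, locale };
}

const frenchCanadian = makeCollation("fr_CA", { strength: 2 });
console.log(frenchCanadian); // { strength: 2, locale: 'fr_CA' }
```

The returned object can be passed as the collation option of any operation that supports collation.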
Collation Examples
Set a Default Collation on a Collection
In the following example, we create a new collection called souvenirs and assign a default collation with the "fr_CA" locale. The collation applies to all operations that support collation performed on that collection.
db.createCollection("souvenirs", { collation: { locale: "fr_CA" }, });
Any operation that supports collation automatically applies the collation defined on the collection. The query below, where myColl refers to the souvenirs collection, applies the "fr_CA" locale collation:
myColl.find({type: "photograph"});
You can specify a different collation as a parameter in an operation that supports collations. The following query specifies the "is" (Icelandic) locale and sets the optional caseFirst parameter to "upper":
myColl.find({type: "photograph"}, { collation: { locale: "is", caseFirst: "upper" } } );
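The two queries above can be combined into one sketch that also reads the results. It assumes db is an already connected Db instance from the MongoDB Node.js driver and that the souvenirs collection was created with the "fr_CA" default collation; the function name findPhotographs is made up for this example.

```javascript
// Assumes `db` is a connected Db instance from the MongoDB Node.js driver.
async function findPhotographs(db) {
  const souvenirs = db.collection("souvenirs");

  // No collation option: the collection's default collation applies.
  const withDefault = await souvenirs.find({ type: "photograph" }).toArray();

  // Collation option given: it overrides the default for this operation only.
  const withOverride = await souvenirs
    .find(
      { type: "photograph" },
      { collation: { locale: "is", caseFirst: "upper" } }
    )
    .toArray();

  return { withDefault, withOverride };
}
```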
Assign a Collation to an Index
In the following example, we create a new index on the title field of a collection with a collation set to the "en_US" locale.
myColl.createIndex( { 'title' : 1 }, { 'collation' : { 'locale' : 'en_US' } });
The following query uses the index we created:
myColl.find({"year": 1980}, {"collation" : {"locale" : "en_US" }}).sort({"title": -1});
The following queries do not use the index that we created. The first query specifies a different strength value than the collation on the index, and the second does not include a collation.
myColl.find({"year": 1980}, {"collation" : {"locale" : "en_US", "strength": 2 }}).sort({"title": -1});
myColl.find({"year": 1980}).sort({"title": -1});
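One way to check whether a query used the index is to inspect its query plan. The following sketch assumes myColl is a Collection from the MongoDB Node.js driver, that the index has the default name title_1, and that the server reports index scans as plan stages named IXSCAN. Both function names are hypothetical. The exact shape of explain output varies by server version, so the helper walks the whole plan instead of reading a fixed path.

```javascript
// Hypothetical helper: returns true if any stage nested anywhere in an
// explain plan is an index scan over the named index.
function planUsesIndex(plan, indexName) {
  if (plan === null || typeof plan !== "object") return false;
  if (plan.stage === "IXSCAN" && plan.indexName === indexName) return true;
  return Object.values(plan).some((child) => planUsesIndex(child, indexName));
}

// Assumes `myColl` is a Collection and the index is named "title_1".
// Pass a collation document, or nothing to run the query without one.
async function usesTitleIndex(myColl, collation) {
  const options = collation ? { collation } : {};
  const explanation = await myColl
    .find({ year: 1980 }, options)
    .sort({ title: -1 })
    .explain("queryPlanner");
  return planUsesIndex(explanation.queryPlanner.winningPlan, "title_1");
}
```

Calling usesTitleIndex(myColl, { locale: "en_US" }) and then usesTitleIndex(myColl) lets you compare the matching and non-matching cases on your own deployment.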
Collation Query Examples
Operations that read, update, and delete documents from a collection can use collations. This section includes examples of a selection of these. See the MongoDB manual for a full list of operations that support collation.
find() and sort() Example
The following example calls both find() and sort() on a collection that uses the default binary collation. We use the German collation by setting the value of the locale parameter to "de".
myColl.find({ city: "New York" }, { collation: { locale: "de" } }).sort({ name: 1 });
findOneAndUpdate() Example
The following example calls the findOneAndUpdate() operation on a collection that uses the default binary collation. The collection contains the following documents:
{ "_id" : 1, "first_name" : "Hans" }
{ "_id" : 2, "first_name" : "Gunter" }
{ "_id" : 3, "first_name" : "Günter" }
{ "_id" : 4, "first_name" : "Jürgen" }
Consider the following findOneAndUpdate() operation on this collection, which does not specify a collation:
myColl.findOneAndUpdate( { first_name : { $lt: "Gunter" } }, { $set: { verified: true } } );
Under binary collation, "Gunter" sorts first among these names, so no document has a first_name value that comes lexically before it and satisfies the $lt comparison operator in the query document. As a result, the operation does not update any documents.
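The binary result can be reproduced in plain JavaScript, without a database. This is only a sketch of the comparison, not of how the server evaluates $lt: it relies on the fact that JavaScript's < operator compares strings by UTF-16 code unit, which orders these four names the same way a binary comparison does.

```javascript
const docs = [
  { _id: 1, first_name: "Hans" },
  { _id: 2, first_name: "Gunter" },
  { _id: 3, first_name: "Günter" },
  { _id: 4, first_name: "Jürgen" },
];

// Keep only the names that compare as less than "Gunter" by code unit.
// "ü" has a higher code unit value than "u", so "Günter" is excluded too.
const matches = docs.filter((doc) => doc.first_name < "Gunter");
console.log(matches.length); // 0
```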
Consider the same operation with a collation specified with the locale set to de@collation=phonebook. This locale specifies the collation=phonebook option, which contains rules for prioritizing proper nouns, identified by capitalization of the first letter. The de@collation=phonebook locale and option sort characters with umlauts before the same characters without umlauts.
myColl.findOneAndUpdate( { first_name: { $lt: "Gunter" } }, { $set: { verified: true } }, { collation: { locale: "de@collation=phonebook" } }, );
Since "Günter" lexically comes before "Gunter" using the de@collation=phonebook collation specified in findOneAndUpdate(), the operation returns the following updated document:
{
  lastErrorObject: { updatedExisting: true, n: 1 },
  value: { _id: 3, first_name: 'Günter' },
  ok: 1
}
findOneAndDelete() Example
The following example calls the findOneAndDelete() operation on a collection that uses the default binary collation and contains the following documents:
{ "_id" : 1, "a" : "16" }
{ "_id" : 2, "a" : "84" }
{ "_id" : 3, "a" : "179" }
In this example, we set the numericOrdering collation parameter to true to sort numeric strings based on their numerical order instead of their lexical order.
myColl.findOneAndDelete( { a: { $gt: "100" } }, { collation: { locale: "en", numericOrdering: true } }, );
After you run the operation above, the collection contains the following documents:
{ "_id" : 1, "a" : "16" } { "_id" : 2, "a" : "84" }
If you perform the same operation without collation on the original collection of three documents, it matches documents based on the lexical value of the strings ("16", "84", and "179"), and deletes the first document it finds that matches the query criteria.
await myColl.findOneAndDelete({ a: { $gt: "100" } });
Since all the documents contain lexical values in the a field that match the criteria (greater than the lexical value of "100"), the operation removes the first result. After you run the operation above, the collection contains the following documents:
{ "_id" : 2, "a" : "84" } { "_id" : 3, "a" : "179" }
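The difference between lexical and numeric comparison of these strings can be checked in plain JavaScript. The sketch below uses the runtime's Intl.Collator with its numeric option as a stand-in for numericOrdering: true; it illustrates the comparison rule only and does not involve the MongoDB server.

```javascript
const values = ["16", "84", "179"];

// Lexical comparison: strings are compared character by character,
// so every value is "greater than" the string "100".
const lexicalMatches = values.filter((value) => value > "100");
console.log(lexicalMatches); // [ '16', '84', '179' ]

// Numeric comparison: digit sequences are compared by numeric value,
// so only "179" is greater than "100".
const numericCollator = new Intl.Collator("en", { numeric: true });
const numericMatches = values.filter(
  (value) => numericCollator.compare(value, "100") > 0
);
console.log(numericMatches); // [ '179' ]
```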
Aggregation Example
To use collation with the aggregate operation, pass the collation document in the options field, after the array of pipeline stages.
The following example shows an aggregation pipeline on a collection that uses the default binary collation. The aggregation groups documents by the first_name field, counts the total number of results in each group, and sorts the results by the German phonebook ("de@collation=phonebook" locale) order.
Note
You can specify only one collation on an aggregation.
myColl.aggregate(
  [
    { $group: { "_id": "$first_name", "nameCount": { "$sum": 1 } } },
    { $sort: { "_id": 1 } },
  ],
  { collation: { locale: "de@collation=phonebook" } },
);
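The aggregate() call returns a cursor, so the pipeline runs only when you read from it. The following sketch wraps the same pipeline in a function and collects the results. It assumes myColl is a Collection from the MongoDB Node.js driver; the function name countFirstNames is made up for this example.

```javascript
// Assumes `myColl` is a Collection from the MongoDB Node.js driver.
async function countFirstNames(myColl) {
  const pipeline = [
    { $group: { _id: "$first_name", nameCount: { $sum: 1 } } },
    { $sort: { _id: 1 } },
  ];
  // A single collation applies to the whole pipeline.
  const options = { collation: { locale: "de@collation=phonebook" } };

  const results = [];
  // Iterating the cursor sends the aggregation to the server.
  for await (const group of myColl.aggregate(pipeline, options)) {
    results.push(group);
  }
  return results;
}
```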