Improved Error Messages for Schema Validation in MongoDB 5.0

Katya Kamenieva • 10 min read • Published Jan 10, 2022 • Updated Jun 14, 2023

Intro

Many MongoDB users rely on schema validation to enforce rules governing the structure and integrity of documents in their collections. But one of the challenges they faced was quickly understanding why a document that did not match the schema couldn't be inserted or updated. This is changing in the upcoming MongoDB 5.0 release.
Schema validation ease-of-use will be significantly improved by generating descriptive error messages whenever an operation fails validation. This additional information provides valuable insight into which parts of a document in an insert/update operation failed to validate against which parts of a collection's validator, and how. From this information, you can quickly identify and remediate code errors that are causing documents to not comply with your validation rules. No more tedious debugging by slicing your document into pieces to isolate the problem!
If you would like to evaluate this feature and provide us early feedback, fill in this form to participate in the preview program.
The most popular way to express the validation rules is JSON Schema. It is a widely adopted standard that is also used within the REST API specification and validation. And in MongoDB, you can combine JSON Schema with the MongoDB Query Language (MQL) to do even more.
In this post, I would like to go over a few examples to reiterate the capabilities of schema validation and showcase the addition of new detailed error messages.

What Do the New Error Messages Look Like?

First, let's look at the new error message. It is a structured message in the BSON format, explaining which part of the document didn't match the rules and which validation rule caused this.
Consider this basic validator, which ensures that the price field does not accept negative values. In JSON Schema, a "property" is the equivalent of what we call a "field" in MongoDB.
{
  "$jsonSchema": {
    "properties": {
      "price": {
        "minimum": 0
      }
    }
  }
}
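One way to attach a validator like the one above to an existing collection is the collMod command. The sketch below is only an illustration: the helper name applyValidator, the collection name "realEstate", and the fakeDb stand-in are assumptions made so the snippet runs without a server. In mongosh, the real db object would be passed instead.

```javascript
// Hypothetical helper: attach a validator to an existing collection.
// `db` is expected to expose runCommand(), as the mongosh `db` object does.
function applyValidator(db, collectionName, validator) {
  return db.runCommand({
    collMod: collectionName,
    validator,
    validationLevel: "strict",  // validate all inserts and updates
    validationAction: "error",  // reject documents that fail validation
  });
}

const priceValidator = {
  $jsonSchema: {
    properties: {
      price: { minimum: 0 },
    },
  },
};

// Stand-in for a real connection (an assumption of this sketch);
// it only records the command it receives.
const fakeDb = {
  commands: [],
  runCommand(command) {
    this.commands.push(command);
    return { ok: 1 };
  },
};

applyValidator(fakeDb, "realEstate", priceValidator);
console.log(JSON.stringify(fakeDb.commands[0], null, 2));
```

Passing db in as a parameter keeps the helper easy to exercise in isolation, as done here with the recording stub.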
When trying to insert a document with {price: -2}, the following error message will be returned.
{
  "code": 121,
  "errmsg": "Document failed validation",
  "errInfo": {
    "failingDocumentId": ObjectId("5fe0eb9642c10f01eeca66a9"),
    "details": {
      "operatorName": "$jsonSchema",
      "schemaRulesNotSatisfied": [
        {
          "operatorName": "properties",
          "propertiesNotSatisfied": [
            {
              "propertyName": "price",
              "details": [
                {
                  "operatorName": "minimum",
                  "specifiedAs": {
                    "minimum": 0
                  },
                  "reason": "comparison failed",
                  "consideredValue": -2
                }
              ]
            }
          ]
        }
      ]
    }
  }
}
Some of the key fields in the response are:
  • failingDocumentId - the _id of the document that was evaluated
  • operatorName - the operator used in the validation rule
  • propertiesNotSatisfied - the list of fields (properties) that failed validation checks
  • propertyName - the field of the document that was evaluated
  • specifiedAs - the rule as it was expressed in the validator
  • reason - explanation of how the rule was not satisfied
  • consideredValue - value of the field in the document that was evaluated
The error may include more fields depending on the specific validation rule, but these are the most common. You will likely find the propertyName and reason to be the most useful fields in the response.
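To make the field descriptions above concrete, here is a minimal sketch of how application code might flatten the "properties" part of such an error into one entry per failed rule. The function name summarizeValidationError is hypothetical, and the sketch handles only the structure shown in the example above, not every shape the server can return.

```javascript
// Hypothetical helper: one summary entry per failed rule under "properties".
function summarizeValidationError(errInfo) {
  const failures = [];
  for (const rule of errInfo.details.schemaRulesNotSatisfied ?? []) {
    for (const property of rule.propertiesNotSatisfied ?? []) {
      for (const detail of property.details ?? []) {
        failures.push({
          propertyName: property.propertyName,
          operatorName: detail.operatorName,
          reason: detail.reason,
          consideredValue: detail.consideredValue,
        });
      }
    }
  }
  return failures;
}

// errInfo copied from the example above (failingDocumentId omitted).
const errInfo = {
  details: {
    operatorName: "$jsonSchema",
    schemaRulesNotSatisfied: [
      {
        operatorName: "properties",
        propertiesNotSatisfied: [
          {
            propertyName: "price",
            details: [
              {
                operatorName: "minimum",
                specifiedAs: { minimum: 0 },
                reason: "comparison failed",
                consideredValue: -2,
              },
            ],
          },
        ],
      },
    ],
  },
};

const failures = summarizeValidationError(errInfo);
console.log(failures);
```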
Now we can look at the examples of the different validation rules and see how the new detailed message helps us identify the reason for the validation failure.

Exploring a Sample Collection

As an example, we'll use a collection of real estate properties in NYC managed by a team of real estate agents.
Here is a sample document:
{
  "PID": "EV10010A1",
  "agents": [ { "name": "Ana Blake", "email": "anab@rcgk.com" } ],
  "description": "Spacious 2BR apartment",
  "localization": { "description_es": "Espacioso apartamento de 2 dormitorios" },
  "type": "Residential",
  "address": {
    "street1": "235 E 22nd St",
    "street2": "Apt 42",
    "city": "New York",
    "state": "NY",
    "zip": "10010"
  },
  "originalPrice": 990000,
  "discountedPrice": 980000,
  "geoLocation": [ -73.9826509, 40.737499 ],
  "listedDate": "Wed Dec 11 2020 10:05:10 GMT-0500 (EST)",
  "saleDate": "Wed Dec 21 2020 12:00:04 GMT-0500 (EST)",
  "saleDetails": {
    "price": 970000,
    "buyer": { "id": "24434" },
    "bids": [
      {
        "price": 950000,
        "isWinner": false,
        "bidder": {
          "id": "24432",
          "name": "Sam James",
          "contact": { "email": "sjames@gmail.com" }
        }
      },
      {
        "price": 970000,
        "isWinner": true,
        "bidder": {
          "id": "24434",
          "name": "Joana Miles",
          "contact": { "email": "jm@gmail.com" }
        }
      }
    ]
  }
}

Using the Value Pattern

Our real estate properties are identified by a property ID (PID) that has to follow a specific naming format: it starts with two letters followed by five digits, then one or more letters and one or more digits, like this: WS10011FG4 or EV10010A1.
We can use the JSON Schema pattern operator to express this rule as a regular expression.
Validator:
{
  "$jsonSchema": {
    "properties": {
      "PID": {
        "bsonType": "string",
        "pattern": "^[A-Z]{2}[0-9]{5}[A-Z]+[0-9]+$"
      }
    }
  }
}
If we try to insert a document with a PID field that doesn't match the pattern, for example { PID: "apt1" }, we will receive an error.
The error states that the field PID had the value of "apt1" and it did not match the regular expression, which was specified as "^[A-Z]{2}[0-9]{5}[A-Z]+[0-9]+$".
{ ...
  "schemaRulesNotSatisfied": [
    {
      "operatorName": "properties",
      "propertiesNotSatisfied": [
        {
          "propertyName": "PID",
          "details": [
            {
              "operatorName": "pattern",
              "specifiedAs": {
                "pattern": "^[A-Z]{2}[0-9]{5}[A-Z]+[0-9]+$"
              },
              "reason": "regular expression did not match",
              "consideredValue": "apt1"
            }
          ]
        }
      ]
  ...
}
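The same pattern can be tried out locally before it goes into a validator. The sketch below uses JavaScript's built-in RegExp, on the assumption that it behaves like the server's regular expression engine for an expression this simple. The helper name isValidPid is hypothetical.

```javascript
// Same pattern as in the validator above.
const PID_PATTERN = /^[A-Z]{2}[0-9]{5}[A-Z]+[0-9]+$/;

// Hypothetical helper, not part of any MongoDB API.
function isValidPid(pid) {
  return typeof pid === "string" && PID_PATTERN.test(pid);
}

console.log(isValidPid("WS10011FG4")); // true
console.log(isValidPid("EV10010A1"));  // true
console.log(isValidPid("apt1"));       // false
```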

Additional Properties and Property Pattern

The description may be localized into several languages. Currently, our application only supports Spanish, German, and French, so the localization object can only contain fields description_es, description_de, or description_fr. Other fields will not be allowed.
We can use the patternProperties operator to express this requirement as a regular expression, and set "additionalProperties": false to indicate that no other fields are allowed here.
Validator:
{
  "$jsonSchema": {
    "properties": {
      "PID": {...},
      "localization": {
        "additionalProperties": false,
        "patternProperties": {
          "^description_(es|de|fr)$": {
            "bsonType": "string"
          }
        }
      }
    }
  }
}
A document like this can be inserted successfully:
{
  "PID": "TS10018A1",
  "type": "Residential",
  "localization": {
    "description_es": "Amplio apartamento de 2 dormitorios",
    "description_de": "Geräumige 2-Zimmer-Wohnung"
  }
}
A document like this will fail the validation check:
{
  "PID": "TS10018A1",
  "type": "Residential",
  "localization": {
    "description_cz": "Prostorný byt 2 + kk"
  }
}
The error below indicates that the field localization contains the additional property description_cz. Because description_cz does not match the expected pattern, it is considered an additional property.
{ ...
  "propertiesNotSatisfied": [
    {
      "propertyName": "localization",
      "details": [
        {
          "operatorName": "additionalProperties",
          "specifiedAs": {
            "additionalProperties": false
          },
          "additionalProperties": [
            "description_cz"
          ]
        }
      ]
    }
  ]
  ...
}
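The interplay of patternProperties and "additionalProperties": false can be mimicked in a few lines of plain JavaScript. This is only a sketch of the idea, not the server's implementation. The helper name findAdditionalProperties is hypothetical, and the regular expression used here accepts exactly the three supported suffixes.

```javascript
// Hypothetical helper: list the keys of `obj` that match none of the
// allowed property patterns. With additionalProperties: false, any key
// returned here would cause the document to be rejected.
function findAdditionalProperties(obj, allowedPatterns) {
  return Object.keys(obj).filter(
    (key) => !allowedPatterns.some((pattern) => pattern.test(key))
  );
}

const allowedPatterns = [/^description_(es|de|fr)$/];

const accepted = findAdditionalProperties(
  {
    description_es: "Amplio apartamento de 2 dormitorios",
    description_de: "Geräumige 2-Zimmer-Wohnung",
  },
  allowedPatterns
);
const rejected = findAdditionalProperties(
  { description_cz: "Prostorný byt 2 + kk" },
  allowedPatterns
);

console.log(accepted); // []
console.log(rejected); // [ 'description_cz' ]
```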

Enumeration of Allowed Options

Each real estate property in our collection has a type, and we want to use one of the four types: "Residential," "Commercial," "Industrial," or "Land." This can be achieved with the operator enum.
Validator:
{
  "$jsonSchema": {
    "properties": {
      "type": {
        "enum": [ "Residential", "Commercial", "Industrial", "Land" ]
      }
    }
  }
}
The following document will be considered invalid:
{
  "PID": "TS10018A1", "type": "House"
}
The error states that field type failed validation because "value was not found in enum."
{ ...
  "propertiesNotSatisfied": [
    {
      "propertyName": "type",
      "details": [
        {
          "operatorName": "enum",
          "specifiedAs": {
            "enum": [
              "Residential",
              "Commercial",
              "Industrial",
              "Land"
            ]
          },
          "reason": "value was not found in enum",
          "consideredValue": "House"
        }
      ]
    }
  ]
  ...
}
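Because the error carries both the rejected value and the list of allowed values, an application could turn it into a message for end users. The following is a minimal sketch. The function name describeEnumFailure and the message wording are assumptions, and the input is one entry of propertiesNotSatisfied as shown above.

```javascript
// Hypothetical helper: build a readable message from one entry of
// propertiesNotSatisfied when the failing operator is enum.
function describeEnumFailure(property) {
  const detail = property.details.find((d) => d.operatorName === "enum");
  if (!detail) return null;
  const allowed = detail.specifiedAs.enum
    .map((value) => JSON.stringify(value))
    .join(", ");
  const actual = JSON.stringify(detail.consideredValue);
  return `${property.propertyName}: ${actual} is not one of ${allowed}`;
}

const typeFailure = {
  propertyName: "type",
  details: [
    {
      operatorName: "enum",
      specifiedAs: { enum: ["Residential", "Commercial", "Industrial", "Land"] },
      reason: "value was not found in enum",
      consideredValue: "House",
    },
  ],
};

const enumMessage = describeEnumFailure(typeFailure);
console.log(enumMessage);
```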

Arrays: Enforcing Number of Elements and Uniqueness

Agents who manage each real estate property are stored in the agents array. Let's make sure there are no duplicate elements in the array, and no more than three agents are working with the same property. We can use uniqueItems and maxItems for this.
{
  "$jsonSchema": {
    "properties": {
      "agents": {
        "bsonType": "array",
        "uniqueItems": true,
        "maxItems": 3
      }
    }
  }
}
The following document violates both of the validation rules.
{
  "PID": "TS10018A1",
  "agents": [
    { "name": "Ana Blake" },
    { "name": "Felix Morin" },
    { "name": "Dilan Adams" },
    { "name": "Ana Blake" }
  ]
}
The error returns information about failure for two rules: "array did not match specified length" and "found a duplicate item," and it also points to what value was a duplicate.
{
  ...
  "propertiesNotSatisfied": [
    {
      "propertyName": "agents",
      "details": [
        {
          "operatorName": "maxItems",
          "specifiedAs": { "maxItems": 3 },
          "reason": "array did not match specified length",
          "consideredValue": [
            { "name": "Ana Blake" },
            { "name": "Felix Morin" },
            { "name": "Dilan Adams" },
            { "name": "Ana Blake" }
          ]
        },
        {
          "operatorName": "uniqueItems",
          "specifiedAs": { "uniqueItems": true },
          "reason": "found a duplicate item",
          "consideredValue": [
            { "name": "Ana Blake" },
            { "name": "Felix Morin" },
            { "name": "Dilan Adams" },
            { "name": "Ana Blake" }
          ],
          "duplicatedValue": { "name": "Ana Blake" }
        }
      ]
  ...
}
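For comparison, here is a rough client-side pre-check that mirrors the two array rules. It is a sketch under stated assumptions: the helper name checkAgents is hypothetical, and item equality is approximated with JSON.stringify, which is sensitive to key order and is only a stand-in for how the server compares values.

```javascript
// Hypothetical pre-check mirroring maxItems and uniqueItems.
function checkAgents(agents, maxItems = 3) {
  const problems = [];
  if (agents.length > maxItems) {
    problems.push({ operatorName: "maxItems", itemCount: agents.length });
  }
  const seen = new Set();
  for (const agent of agents) {
    const key = JSON.stringify(agent); // approximate equality, see note above
    if (seen.has(key)) {
      problems.push({ operatorName: "uniqueItems", duplicatedValue: agent });
      break; // report only the first duplicate
    }
    seen.add(key);
  }
  return problems;
}

const agentProblems = checkAgents([
  { name: "Ana Blake" },
  { name: "Felix Morin" },
  { name: "Dilan Adams" },
  { name: "Ana Blake" },
]);
console.log(agentProblems);
```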

Enforcing Required Fields

Now, we want to make sure that there's contact information available for the agents. We need each agent's name and at least one way to contact them: phone or email. We will use required and anyOf to create this rule.
Validator:
{
  "$jsonSchema": {
    "properties": {
      "agents": {
        "bsonType": "array",
        "uniqueItems": true,
        "maxItems": 3,
        "items": {
          "bsonType": "object",
          "required": [ "name" ],
          "anyOf": [ { "required": [ "phone" ] }, { "required": [ "email" ] } ]
        }
      }
    }
  }
}
The following document will fail validation:
{
  "PID": "TS10018A1",
  "agents": [
    { "name": "Ana Blake", "email": "anab@rcgk.com" },
    { "name": "Felix Morin", "phone": "+12019878749" },
    { "name": "Dilan Adams" }
  ]
}
Here the error indicates that the third element of the array ("itemIndex": 2) did not match the rule.
{
  ...
  "propertiesNotSatisfied": [
    {
      "propertyName": "agents",
      "details": [
        {
          "operatorName": "items",
          "reason": "At least one item did not match the sub-schema",
          "itemIndex": 2,
          "details": [
            {
              "operatorName": "anyOf",
              "schemasNotSatisfied": [
                {
                  "index": 0,
                  "details": [
                    {
                      "operatorName": "required",
                      "specifiedAs": { "required": [ "phone" ] },
                      "missingProperties": [ "phone" ]
                    }
                  ]
                },
                {
                  "index": 1,
                  "details": [
                    {
                      "operatorName": "required",
                      "specifiedAs": { "required": [ "email" ] },
                      "missingProperties": [ "email" ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  ]
  ...
}
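Nested errors like this one are easier to consume with a small recursive walk. The sketch below collects every missingProperties entry together with the index of the array item it belongs to. The function name collectMissingProperties is hypothetical, and the traversal covers only the nesting seen in this example (details and schemasNotSatisfied).

```javascript
// Hypothetical helper: walk a "details" array recursively and collect all
// missingProperties entries, remembering the enclosing itemIndex if any.
function collectMissingProperties(details, itemIndex = null, found = []) {
  for (const detail of details) {
    const index = detail.itemIndex ?? itemIndex;
    if (detail.missingProperties) {
      found.push({ itemIndex: index, missingProperties: detail.missingProperties });
    }
    // anyOf nests its branches under schemasNotSatisfied.
    for (const branch of detail.schemasNotSatisfied ?? []) {
      collectMissingProperties(branch.details, index, found);
    }
    if (Array.isArray(detail.details)) {
      collectMissingProperties(detail.details, index, found);
    }
  }
  return found;
}

// The "details" array of the agents property from the error above.
const agentsDetails = [
  {
    operatorName: "items",
    reason: "At least one item did not match the sub-schema",
    itemIndex: 2,
    details: [
      {
        operatorName: "anyOf",
        schemasNotSatisfied: [
          {
            index: 0,
            details: [
              { operatorName: "required", specifiedAs: { required: ["phone"] }, missingProperties: ["phone"] },
            ],
          },
          {
            index: 1,
            details: [
              { operatorName: "required", specifiedAs: { required: ["email"] }, missingProperties: ["email"] },
            ],
          },
        ],
      },
    ],
  },
];

const missing = collectMissingProperties(agentsDetails);
console.log(missing);
```

For the example error, both collected entries point at item index 2, one for phone and one for email, which matches the anyOf rule: the third agent has neither.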

Creating Dependencies

Let's create another rule to ensure that if the document contains the saleDate field, saleDetails is also present, and vice versa: If there is saleDetails, then saleDate also has to exist.
{
  "$jsonSchema": {
    "dependencies": {
      "saleDate": [ "saleDetails" ],
      "saleDetails": [ "saleDate" ]
    }
  }
}
Now, let's try to insert the document with saleDate but with no saleDetails:
{
  "PID": "TS10018A1",
  "saleDate": ISODate("2020-05-01T04:00:00.000Z")
}
The error now names the property that has a dependency (saleDate) and the property that the dependency requires but the document is missing (saleDetails).
{
  ...
  "details": {
    "operatorName": "$jsonSchema",
    "schemaRulesNotSatisfied": [
      {
        "operatorName": "dependencies",
        "failingDependencies": [
          {
            "conditionalProperty": "saleDate",
            "missingProperties": [ "saleDetails" ]
          }
        ]
      }
    ]
  }
  ...
}
Notice that in JSON Schema, the field dependencies is in the root object, and not inside of a specific property. Therefore, in the error message, the details object will have a different structure:
{ "operatorName": "dependencies", "failingDependencies": [...] }
In the previous examples, when the JSON Schema rule was inside of the "properties" object, like this:
"$jsonSchema": { "properties": { "price": { "minimum": 0 } } }
the details of the error message contained "operatorName": "properties" and a "propertyName":
{
  "operatorName": "properties",
  "propertiesNotSatisfied": [ { "propertyName": "...", "details": [] } ]
}
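Code that reads schemaRulesNotSatisfied therefore has to branch on operatorName. Here is a minimal sketch that covers just the two shapes discussed above. The function name describeRule and the message wording are assumptions, and other operators simply fall through to a generic message.

```javascript
// Hypothetical helper: describe one entry of schemaRulesNotSatisfied.
function describeRule(rule) {
  switch (rule.operatorName) {
    case "properties":
      return rule.propertiesNotSatisfied.map((property) => {
        const operators = property.details.map((d) => d.operatorName).join(", ");
        return `property "${property.propertyName}" failed: ${operators}`;
      });
    case "dependencies":
      return rule.failingDependencies.map((dependency) => {
        const required = dependency.missingProperties.map((m) => `"${m}"`).join(", ");
        return `"${dependency.conditionalProperty}" requires ${required}`;
      });
    default:
      return [`rule "${rule.operatorName}" was not satisfied`];
  }
}

const dependencyMessages = describeRule({
  operatorName: "dependencies",
  failingDependencies: [
    { conditionalProperty: "saleDate", missingProperties: ["saleDetails"] },
  ],
});
const propertyMessages = describeRule({
  operatorName: "properties",
  propertiesNotSatisfied: [
    { propertyName: "price", details: [{ operatorName: "minimum" }] },
  ],
});

console.log(dependencyMessages);
console.log(propertyMessages);
```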

Adding Business Logic to Your Validation Rules

You can use MongoDB Query Language (MQL) in your validator right next to JSON Schema to add richer business logic to your rules.
As one example, you can use $expr to check that discountedPrice is less than originalPrice:
{
  "$expr": {
    "$lt": [ "$discountedPrice", "$originalPrice" ]
  },
  "$jsonSchema": {...}
}
$expr resolves to true or false, and allows you to use aggregation expressions to create sophisticated business rules.
For a slightly more complex example, let's say we keep an array of bids in the document of each real estate property, and the boolean field isWinner indicates whether a particular bid is the winning one.
Sample document:
{
  "PID": "TS10018A1",
  "type": "Residential",
  "saleDetails": {
    "bids": [
      {
        "price": 500000,
        "isWinner": false,
        "bidder": {...}
      },
      {
        "price": 530000,
        "isWinner": true,
        "bidder": {...}
      }
    ]
  }
}
Let's make sure that only one of the bids array elements can be marked as the winner. The validator will have an expression where we apply a filter to the array of bids to keep only the elements with "isWinner": true, and check that the size of the resulting array is less than or equal to 1.
Validator:
{
  "$and": [
    {
      "$expr": {
        "$lte": [
          {
            "$size": {
              "$filter": {
                "input": "$saleDetails.bids.isWinner",
                "cond": "$$this"
              }
            }
          },
          1
        ]
      }
    },
    {
      "$expr": {...}
    },
    {
      "$jsonSchema": {...}
    }
  ]
}
Let's try to insert a document in which more than one bid has "isWinner": true.
{
  "PID": "TS10018A1",
  "type": "Residential",
  "originalPrice": 600000,
  "discountedPrice": 550000,
  "saleDetails": {
    "bids": [
      { "price": 500000, "isWinner": true },
      { "price": 530000, "isWinner": true }
    ]
  }
}
The produced error message will indicate which expression evaluated to false.
{
  ...
  "details": {
    "operatorName": "$expr",
    "specifiedAs": {
      "$expr": {
        "$lte": [
          {
            "$size": {
              "$filter": {
                "input": "$saleDetails.bids.isWinner",
                "cond": "$$this"
              }
            }
          },
          1
        ]
      }
    },
    "reason": "expression did not match",
    "expressionResult": false
  }
  ...
}
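Since the error for $expr reports only that the expression evaluated to false, it can help to reproduce the rule in application code when debugging. The sketch below is a plain JavaScript analogue of the filter-and-count logic. The function name hasAtMostOneWinner is hypothetical, and treating a document without a bids array as valid is an assumption of this sketch, not necessarily how the server evaluates the expression.

```javascript
// Plain JavaScript analogue of the rule: keep the isWinner flags that are
// true and check that at most one remains.
function hasAtMostOneWinner(doc) {
  const flags = (doc.saleDetails?.bids ?? []).map((bid) => bid.isWinner);
  return flags.filter((flag) => flag === true).length <= 1;
}

const twoWinners = {
  PID: "TS10018A1",
  saleDetails: {
    bids: [
      { price: 500000, isWinner: true },
      { price: 530000, isWinner: true },
    ],
  },
};
const oneWinner = {
  PID: "TS10018A1",
  saleDetails: {
    bids: [
      { price: 500000, isWinner: false },
      { price: 530000, isWinner: true },
    ],
  },
};

console.log(hasAtMostOneWinner(twoWinners)); // false
console.log(hasAtMostOneWinner(oneWinner));  // true
```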

Geospatial Validation

As the last example, let's see how we can use the geospatial features of MQL to ensure that all the real estate properties in the collection are located within the New York City boundaries. Our documents include a geoLocation field with coordinates. We can use $geoWithin to check that these coordinates are inside the geoJSON polygon (the polygon for New York City in this example is approximate).
Validator:
{
  "geoLocation": {
    "$geoWithin": {
      "$geometry": {
        "type": "Polygon",
        "coordinates": [
          [ [ -73.91326904296874, 40.91091803848203 ],
            [ -74.01626586914062, 40.75297891717686 ],
            [ -74.05677795410156, 40.65563874006115 ],
            [ -74.08561706542969, 40.65199222800328 ],
            [ -74.14329528808594, 40.64417760251725 ],
            [ -74.18724060058594, 40.643656594948524 ],
            [ -74.234619140625, 40.556591288249905 ],
            [ -74.26345825195312, 40.513277131087484 ],
            [ -74.2510986328125, 40.49500373230525 ],
            [ -73.94691467285156, 40.543026009954986 ],
            [ -73.740234375, 40.589449604232975 ],
            [ -73.71826171874999, 40.820045086716505 ],
            [ -73.78829956054686, 40.8870435151357 ],
            [ -73.91326904296874, 40.91091803848203 ] ]
        ]
      }
    }
  },
  "$jsonSchema": {...}
}
A document like this will be inserted successfully.
{
  "PID": "TS10018A1",
  "type": "Residential",
  "geoLocation": [ -73.9826509, 40.737499 ],
  "originalPrice": 600000,
  "discountedPrice": 550000,
  "saleDetails": {...}
}
The following document will fail.
{
  "PID": "TS10018A1",
  "type": "Residential",
  "geoLocation": [ -73.9826509, 80.737499 ],
  "originalPrice": 600000,
  "discountedPrice": 550000,
  "saleDetails": {...}
}
The error will indicate that validation failed the $geoWithin operator, and the reason is "none of the considered geometries were contained within the expression's geometry."
{
  ...
  "details": {
    "operatorName": "$geoWithin",
    "specifiedAs": {
      "geoLocation": {
        "$geoWithin": {...}
      }
    },
    "reason": "none of the considered geometries were contained within the expression's geometry",
    "consideredValues": [ -73.9826509, 80.737499 ]
  }
  ...
}
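To sanity-check coordinates against the polygon before writing, a planar ray-casting test gives a rough approximation of what $geoWithin decides. This is a swapped-in technique and only a sketch: it treats longitude and latitude as flat x/y coordinates, whereas the server works with GeoJSON geometry on a sphere, so results can differ near the polygon edges. The names pointInPolygon and nycRing are hypothetical.

```javascript
// Planar ray casting: count how many polygon edges a horizontal ray from
// the point crosses. An odd count means the point is inside.
function pointInPolygon([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Outer ring of the approximate New York City polygon from the validator.
const nycRing = [
  [-73.91326904296874, 40.91091803848203],
  [-74.01626586914062, 40.75297891717686],
  [-74.05677795410156, 40.65563874006115],
  [-74.08561706542969, 40.65199222800328],
  [-74.14329528808594, 40.64417760251725],
  [-74.18724060058594, 40.643656594948524],
  [-74.234619140625, 40.556591288249905],
  [-74.26345825195312, 40.513277131087484],
  [-74.2510986328125, 40.49500373230525],
  [-73.94691467285156, 40.543026009954986],
  [-73.740234375, 40.589449604232975],
  [-73.71826171874999, 40.820045086716505],
  [-73.78829956054686, 40.8870435151357],
  [-73.91326904296874, 40.91091803848203],
];

console.log(pointInPolygon([-73.9826509, 40.737499], nycRing)); // true
console.log(pointInPolygon([-73.9826509, 80.737499], nycRing)); // false
```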

Conclusion and Next Steps

Schema validation is a great tool to enforce governance over your data sets. You have the choice to express the validation rules using JSON Schema, MongoDB Query Language, or both. And now, with the detailed error messages, it gets even easier to use, and you can have the rules be as sophisticated as you need, without the risk of costly maintenance.
You can find the full validator code and sample documents from this post here.
Questions? Comments? We'd love to connect with you. Join the conversation on the MongoDB Community Forums.
Safe Harbor
The development, release, and timing of any features or functionality described for our products remains at our sole discretion. This information is merely intended to outline our general product direction and it should not be relied on in making a purchasing decision nor is this a commitment, promise or legal obligation to deliver any material, code, or functionality.
