I have a pool collection and am trying to find a match for the Name field of the pool collection. I also want my query to work with case-insensitive and diacritic case-insensitive data.
I have a pool with Name=‘Maøé’ in my pool collection.
I have used the below queries:
The below query did not return any matching result when I am using regex.
db["Pool"].find({TenantId: BinData(3, "fjTbsQVA40CM5UsUlkTkDQ=="), OwnerId: BinData(3,"cDcPV5whymFeZKMg+8MN7w=="), Name: { $regex: "maoe", $options: "i" }}).collation({locale:"en", strength: 1})
The below query returned the expected pool.
db["Pool"].find({TenantId: BinData(3, "fjTbsQVA40CM5UsUlkTkDQ=="), OwnerId: BinData(3,"cDcPV5whymFeZKMg+8MN7w=="), Name: { $regex: "Maøé", $options: "i" }}).collation({locale:"en", strength: 1})
The below query returned the expected pool when and is working with the diacritic-insensitive search but here I have to use the exact match i.e. full-text search.
db["Pool"].find({TenantId: BinData(3, "fjTbsQVA40CM5UsUlkTkDQ=="), OwnerId: BinData(3,"cDcPV5whymFeZKMg+8MN7w=="), Name: "Maoe"}).collation({locale:"en", strength: 1})
The output of the query is given below
{
_id: Binary.createFromBase64('FI2HAMcF/E6etfRWWcZ7Fg==', 3),
AggregateVersion: 2,
AuditableEvents: [],
LastModified: 2024-12-19T13:05:05.266Z,
DatabaseVersion: 638302904038095700,
TenantId: Binary.createFromBase64('fjTbsQVA40CM5UsUlkTkDQ==', 3),
CreatedAt: 2024-12-19T13:04:47.661Z,
EntityReferences: [
{
Name: 'OwnerId',
Key: '570f3770-219c-61ca-5e64-a320fbc30def'
},
{
Name: 'SharedOwnerships.InternalEmployeeId',
Key: '56bf3fcb-e7f4-8c92-e1e1-370c26f5bcc7'
},
{
Name: 'SharedOwnerships.InternalEmployeeId',
Key: '0d9c0a0b-f9ba-b92b-411f-a33312530aff'
}
],
Name: 'Maøé',
OwnerId: Binary.createFromBase64('cDcPV5whymFeZKMg+8MN7w==', 3),
SharedOwnerships: [
{
InternalEmployeeId: Binary.createFromBase64('yz+/VvTnkozh4TcMJvW8xw==', 3),
DateAdded: 2024-12-19T13:05:05.249Z
},
{
InternalEmployeeId: Binary.createFromBase64('CwqcDbr5K7lBH6MzElMK/w==', 3),
DateAdded: 2024-12-19T13:05:05.254Z
}
]
}
It appears that collation is not functioning as expected when using regex in my query. Specifically:
-
Diacritic-Insensitive and Case-Insensitive Search: These features do not seem to work properly when a regex-based query is used, even though collation is applied.
-
Why does collation fail to enable diacritic-insensitive and case-insensitive search in combination with regex?
Could this be a limitation of MongoDB’s current implementation of regex and collation? -
Is there an alternative approach to implement diacritic-insensitive and case-insensitive search effectively in MongoDB, especially for scenarios where regex is required?