Is it possible to use the $search aggregation pipeline stage to efficiently filter out documents whose field does not contain a given value (for example, a certain ObjectId), and then perform full-text search on other fields? By efficient, I mean that the documents with the desired ObjectId in the given field can be found without scanning all indexed data, similar to how a database index on that field would avoid a collection scan.
To better explain my question, I’ll present a simplified scenario.
Consider a stores collection and a products collection. The document schema for both collections is as follows:
// Stores schema:
{
  _id: ObjectId,
  name: String
}
// Products schema:
{
  _id: ObjectId,
  name: String,
  store: ObjectId
}
Every product has a name and belongs to a store.
Consider an application where the user can choose a store and then run a full-text search for products in that store by name.
To achieve this, I’d create the following search index:
{
  collectionName: 'products',
  name: 'productName_index',
  mappings: {
    dynamic: false,
    fields: {
      store: {
        type: "objectId",
      },
      name: [
        { type: "string" },
        { type: "autocomplete" }
      ]
    }
  }
}
And use the following aggregation pipeline to query:
// Known store _id
const storeId = new ObjectId()
const searchQuery = "someProductName"
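// An aggregation pipeline is an array of stages
const pipeline = [
  {
    $search: {
      index: "productName_index",
      compound: {
        // Only products of the chosen store; filter clauses do not affect the score
        filter: [
          { equals: {
            path: "store",
            value: storeId // equals takes `value`, not `query`
          }}
        ],
        should: [
          { text: {
            path: "name",
            query: searchQuery
          }},
          { autocomplete: {
            path: "name",
            query: searchQuery
          }}
        ],
        minimumShouldMatch: 1
      }
    }
  }
]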
I think that for this query, all indexed data for productName_index is scanned.
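To make the query easier to experiment with, here is a minimal sketch that wraps it in a function. The helper name buildProductSearchPipeline is my own and not part of any MongoDB API; it assumes the productName_index definition above, and it passes the store id to equals as `value`, which is the field name that operator documents.

```javascript
// Hypothetical helper (not a MongoDB API): builds the $search pipeline
// for one store. `storeId` is expected to be an ObjectId.
function buildProductSearchPipeline(storeId, searchQuery) {
  return [
    {
      $search: {
        index: "productName_index",
        compound: {
          // filter clauses restrict the results without contributing to the score
          filter: [{ equals: { path: "store", value: storeId } }],
          // at least one of the two name clauses has to match
          should: [
            { text: { path: "name", query: searchQuery } },
            { autocomplete: { path: "name", query: searchQuery } },
          ],
          minimumShouldMatch: 1,
        },
      },
    },
  ];
}

// In mongosh, assuming `db.products` is the collection described above:
// db.products.explain("executionStats").aggregate(
//   buildProductSearchPipeline(storeId, "someProductName")
// )
```

Running the pipeline through explain, as in the commented call, should show how the search query is executed, which might help confirm or refute my assumption about the scan.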
If I instead used a compound database index, { store: 1, name: 1 }, I could use an aggregation pipeline with a $match stage to filter out products that do not belong to the chosen store, without performing a collection scan. But then I would no longer be able to do full-text search.
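For comparison, this is roughly what I mean by the database-index approach. It is only a sketch: buildStoreMatchPipeline is a hypothetical helper of my own, and the optional exact-name match is just there to show that the second index key can be used for equality, not for full-text search.

```javascript
// Key pattern of the compound database index mentioned above.
const storeNameIndex = { store: 1, name: 1 };

// Hypothetical helper: pipeline that keeps only one store's products,
// optionally narrowed to an exact product name.
function buildStoreMatchPipeline(storeId, exactName) {
  const match = { store: storeId };
  if (exactName !== undefined) {
    match.name = exactName;
  }
  return [{ $match: match }];
}

// In mongosh, assuming `db.products` is the collection described above:
// db.products.createIndex(storeNameIndex)
// db.products.aggregate(buildStoreMatchPipeline(storeId))
```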
So how does this work with search indexes? Would the above query have to check every indexed store field? If so, I'm curious whether it would ever be possible to build a search index that supports this kind of query more efficiently.