I’m trying to use the full text search feature with my Atlas collection. That seems to be working great. I also want the results to be limited to documents where a certain field - that is an array - is empty/has a zero length. I’ve been looking over the docs for the aggregate search pipeline but nothing seems to handle this that I can find.
Previously, I was using the find() api like this:
collection.find({ tags: { $size: 0 } });
I feel like I should include this filter before the text search of other fields since it would greatly limit was needs… searched. What does that operation look like in an aggregate pipeline?
This is getting my a cursor of the result of docs where the title field does contain style, so that’s great! I was thinking this could be a compound operation since I also want to filter out the tags array field to empty ones but I might be mistaken there too.
All this means that your filtering of empty tags array must be done after the $search.
However, something that I have not tried, would be to set a partial index with your tags size as a condition and then specify the index to use in your $search. Since the documents indexed will only be the one with $size:{$gt:0}, you kind of implicitly filter them out.
Yes, exactly. It was this requirement that kind of baffled me. I thought match or equals might be the answer, but wasn’t sure how they were shaped.
Given that the text search has to be first, it does imply to me that all records will have their text fields inspected, and then whole swathes will be removed on the (in my mind) more simpler tags match. Is this not an inefficiency to mongo’s internals?
That is not exactly how search and indexes work. If you do not have an index, then yes each record has to be verified. With search and indexes, it is much more efficient. More or less the index is inspected, not the documents. Finding a value inside an index is very efficient compared to scanning each document. The course M201 presents a good overview of indexes.
For example, if a document does not contain the indexed field, it takes almost no space in the index and is not examined when looking for normal value.