I’m having trouble finding a query that would solve my case. I have a message collection with the fields room_id, user_id, content and created_at. I would like to run a Mongo text search, but apply it only to the last 5 elements of the collection.
Right now I have this but I don’t think it is the right solution:
last_five_messages = (
    self.message_collection.find({"user_id": user, "room_id": room})
    .sort([("created_at", -1)])
    .limit(5)
)
# Step 2: Extract message text from the last 5 messages
message_texts = [message["message"] for message in last_five_messages]
text_search_results = self.message_collection.find(
    {
        "$text": {"$search": escaped_message},
        "user_id": user,
        "room_id": room,
        "message": {"$in": message_texts},
    },
)
Can someone help me with this please?
I’m relatively new to Mongo, so sorry if this isn’t the right place to ask about this topic.
Hey @Pedro_Silva1, can you share more about what’s not working for you right now? A few tips I would recommend looking into:
The Aggregation Pipeline allows you to perform complex operations in a single query.
For better performance, we recommend using the $search aggregation stage instead of $text. To use $search, you will need to create a search index, see how to do that here. Once you have a search index, you can use the compound operator to:
filter your search on user_id and room_id using the equals operator
filter your search on message_texts using the in operator
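As a rough illustration of the tips above, here is a minimal sketch of what such a pipeline could look like when built from Python. The function name build_similarity_pipeline, the index name "default" and the field name "message" are assumptions for illustration. It also assumes the search index maps user_id and room_id to a type that equals supports, and maps message both as a string (for text) and as a token (for in), so adjust it to your actual index definition.

```python
def build_similarity_pipeline(user_id, room_id, recent_texts, query, index_name="default"):
    """Build an aggregation pipeline that text-searches only the given recent messages.

    Hypothetical helper: all field and index names are assumptions.
    """
    return [
        {
            "$search": {
                "index": index_name,
                "compound": {
                    # "must" clauses contribute to the relevance score.
                    "must": [{"text": {"query": query, "path": "message"}}],
                    # "filter" clauses restrict the results without affecting the score.
                    "filter": [
                        {"equals": {"path": "user_id", "value": user_id}},
                        {"equals": {"path": "room_id", "value": room_id}},
                        {"in": {"path": "message", "value": recent_texts}},
                    ],
                },
            }
        },
        # Expose the relevance score next to each matching message.
        {"$project": {"message": 1, "score": {"$meta": "searchScore"}}},
    ]


# With pymongo this would be run as: collection.aggregate(pipeline)
pipeline = build_similarity_pipeline("u1", "r1", ["hi", "hello"], "hello there")
```

Note that $search has to be the first stage of the pipeline, which is why the restriction to the recent messages is expressed as a filter inside compound rather than as a $sort and $limit before the search.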
Hello @amyjian, thank you for the response! To detail it a little further: I want to filter the collection for the messages of a user_id in a room_id, order them by date, and limit the query to the last 5 messages the user sent. After that, I want to apply the full text search to those 5 messages to check whether there is similar content. My objective is to check for spam messages. Is it possible to do this with the suggestions you gave me?
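To make the intended flow concrete, here is a minimal sketch of the same steps on plain Python dictionaries, without a database. It uses a simple word-overlap (Jaccard) measure as a stand-in for the full text search, so it is not what Mongo does internally. The helper names, the "message" field and the 0.5 threshold are assumptions for illustration only.

```python
from datetime import datetime


def last_five(messages, user_id, room_id):
    """Return the five newest messages sent by user_id in room_id, newest first."""
    mine = [m for m in messages if m["user_id"] == user_id and m["room_id"] == room_id]
    return sorted(mine, key=lambda m: m["created_at"], reverse=True)[:5]


def token_overlap(a, b):
    """Jaccard similarity of the lowercase word sets of a and b, from 0.0 to 1.0."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def looks_like_repeat(new_text, recent, threshold=0.5):
    """True if new_text overlaps enough with any of the recent messages."""
    return any(token_overlap(new_text, m["message"]) >= threshold for m in recent)


# Example data: seven messages from one user in one room.
messages = [
    {"user_id": "u1", "room_id": "r1", "message": f"msg {i}",
     "created_at": datetime(2024, 1, 1, 0, i)}
    for i in range(7)
]
recent = last_five(messages, "u1", "r1")
is_spam = looks_like_repeat("msg 6", recent)
```

Because the overlap value is always between 0 and 1, a threshold on it is easier to reason about than a raw relevance score.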
Hello again @amyjian, I was able to complete this query after subscribing to Mongo Atlas. However, I have some questions that I hope you can help me with:
I’m using the following query to detect whether a user has produced similar content in my chat channel, to avoid having spamming users:
However, I’m finding the results to be too “aggressive”: if I write content like “I love coffee” and then “My coffee is great”, it detects the content as similar. What do you think I should do in these cases? Only show matches with a score above a certain value? If so, which value is OK to choose? I can’t find a way to understand how the score is calculated or what range of values is possible.
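On the score question, one possible approach is sketched below. As far as I know, Atlas Search scores are relevance values rather than percentages on a fixed scale, so a cutoff usually has to be tuned against your own data. The sketch keeps only hits above a minimum score and requests a score breakdown through the scoreDetails option, which can help in understanding how each score was computed. The helper name with_score_threshold, the "message" field and the value 1.5 are assumptions, not recommended settings.

```python
def with_score_threshold(search_stage, min_score):
    """Return a pipeline that keeps only hits scoring at least min_score.

    Hypothetical helper: search_stage is a {"$search": {...}} stage.
    """
    # Copy the stage body so the caller's dict is not mutated, and ask
    # Atlas Search for a per-hit explanation of the score.
    search_body = dict(search_stage["$search"], scoreDetails=True)
    return [
        {"$search": search_body},
        {
            "$project": {
                "message": 1,
                "score": {"$meta": "searchScore"},
                "score_details": {"$meta": "searchScoreDetails"},
            }
        },
        # Drop weak matches; the cutoff is something to tune, not a known constant.
        {"$match": {"score": {"$gte": min_score}}},
    ]


stage = {"$search": {"index": "default", "text": {"query": "coffee", "path": "message"}}}
pipeline = with_score_threshold(stage, 1.5)
```

Running this against a handful of known spam and non-spam pairs and reading the score_details output is one way to pick a cutoff that is less aggressive for messages that share a single common word.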