Hello everyone,
I can’t get any further with my problem and hope to get help here.
I have a collection of documents that are structured as follows:
{
  "number": 0,
  "title": "The Great Adventure",
  "author": "Alex Johnson",
  "description": "A thrilling story of a journey through uncharted territories.",
  "location": "somewhere",
  "embeddedObj": {
    "bla1": "asdf",
    "bla2": "asdf"
  },
  "embeddedArr": [
    {
      "bla3": "asdf",
      "bla4": "asdf"
    },
    {
      "bla3": "asdf",
      "bla4": "asdf"
    }
  ]
}
For example, a search for “lodz” should find every document that contains any of the following values in any field, at any nesting level (top-level fields, embedded objects, or arrays of embedded objects):
“lodz”
“Lodz”
“Łódź”
“asdfŁódź”
“Łódźasdf”
“fdsaŁódźasdf”
“Łódź asdf”
“asdf Łódź”
“asdf Łódź asdf”
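To pin down the expected behaviour, the matching rule above can be sketched in plain Python. This is only an illustration of the semantics (case-insensitive, diacritic-insensitive, substring match over every string in the document), not MongoDB code; the names `fold`, `matches`, `document_matches` and `STROKE_MAP` are made up for this sketch. One detail worth noting: “Ł” has no Unicode decomposition, so plain NFKD normalization leaves it unchanged and it has to be mapped separately.

```python
import unicodedata

# Letters with a stroke have no Unicode decomposition, so NFKD alone does
# not turn them into their base letter. Illustrative, not exhaustive.
STROKE_MAP = str.maketrans({
    "ł": "l", "Ł": "L",
    "ø": "o", "Ø": "O",
    "đ": "d", "Đ": "D",
})


def fold(text: str) -> str:
    """Reduce text to a lowercase form without diacritics."""
    text = text.translate(STROKE_MAP)
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks (accents) left over after decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query: str, value: str) -> bool:
    """True if the folded query occurs anywhere inside the folded value."""
    return fold(query) in fold(value)


def document_matches(query: str, doc) -> bool:
    """Check every string value in nested dicts and lists."""
    if isinstance(doc, str):
        return matches(query, doc)
    if isinstance(doc, dict):
        return any(document_matches(query, v) for v in doc.values())
    if isinstance(doc, list):
        return any(document_matches(query, v) for v in doc)
    return False  # numbers, None, etc. are ignored
```

For instance, `document_matches("lodz", {"embeddedArr": [{"bla3": "fdsaŁódźasdf"}]})` should be true, while a document containing only “asdf” values should not match.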
I have no experience with MongoDB. I have been reading up on the subject for a week, asking “Ask MongoDB AI”, and trying out and combining many things, such as the icuNormalize character filter, the icuFolding token filter, and the nGram tokenizer. As a beginner, however, I find it difficult to pick the right pieces out of all the tutorials, AI suggestions, etc. and put them together into an index that really works.
My question is: what should the search index definition look like in JSON, and what should the matching search query look like in JSON?
It should also be noted that this is only a first step. The actual document structure will be more complex, and once everything is up and running in production, 200,000 new documents will be added per day, with about 10 times as many document updates. Old documents (older than one week) will no longer receive any changes. After 3 months, the documents will be deleted or tagged as deleted.
So performance is important in the long run. For my proof of concept, however, any working solution would be enough; performance can come later.
Many thanks in advance for your support.
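To put the numbers above into perspective, here is a rough back-of-the-envelope calculation in Python. It assumes that “3 months” means 90 days and that writes are spread evenly over the day, which will probably not hold exactly in practice; the variable names are only for illustration.

```python
NEW_DOCS_PER_DAY = 200_000
UPDATES_PER_NEW_DOC = 10   # "about 10 times as many document updates"
RETENTION_DAYS = 90        # assumption: "3 months" taken as 90 days
SECONDS_PER_DAY = 24 * 60 * 60

updates_per_day = NEW_DOCS_PER_DAY * UPDATES_PER_NEW_DOC
writes_per_day = NEW_DOCS_PER_DAY + updates_per_day

# Average rate only; real traffic will have peaks well above this.
avg_writes_per_second = writes_per_day / SECONDS_PER_DAY

# Documents present at steady state before deletion kicks in.
live_docs = NEW_DOCS_PER_DAY * RETENTION_DAYS

print(f"updates per day:       {updates_per_day:,}")
print(f"writes per day:        {writes_per_day:,}")
print(f"avg writes per second: {avg_writes_per_second:.1f}")
print(f"live documents:        {live_docs:,}")
```

Under these assumptions the search index would have to keep up with roughly 2.2 million writes per day and hold about 18 million documents at steady state.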