I have this aggregation, which assigns a random number to the ‘order’ field of every document in the ‘data’ collection. (The point was to shuffle the order in which data is retrieved every once in a while.)
db.data.aggregate(
[
{ $set: { "order": { $multiply: [ { $rand: {} }, 200000 ] } } },
{ $set: { "order": { $floor: "$order" } } },
{ $merge: "data"}
]
)
I need to upgrade this to do things a bit differently:
1: Filter by some of the document fields to only assign the random numbers to a portion of the collection, not the entire collection.
2: Assign every generated random number to 10 documents, not 1. It doesn’t matter which batch gets what number, but each document within a batch should get the same number.
Please help me to understand how to do it.
Thank you.
Hi @notapolita ,
It's not a super straightforward task for the MongoDB server, but the aggregation framework is rich enough that you can do the following:
db.data.aggregate([{
$match: {
<ANY_TYPE_CONDITION>
}
}, {
$setWindowFields: {
partitionBy: null,
sortBy: {
_id: 1
},
output: {
documentNumber: {
$documentNumber: {}
}
}
}
}, {
$group: {
_id: {
$floor: {
$divide: [
// $documentNumber starts at 1, so subtract 1 to get full batches of 10
{ $subtract: ['$documentNumber', 1] },
10
]
}
},
result: {
$push: '$$ROOT'
}
}
}, {
$set: {
order: {
$floor: {
$multiply: [
{
$rand: {}
},
200000
]
}
}
}
}, {$sort : {order : 1}}],{"allowDiskUse" : true})
This aggregation first uses a $match stage to filter on any expression that $match accepts, which covers your first requirement.
The next stage numbers each document using $setWindowFields (available in MongoDB 5.0+). The $group stage then integer-divides that number by 10, creating one document per batch with the batch's documents collected under “result”. Finally, we add a random number to each batch and sort by it.
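One detail worth checking is the batch sizes. Since $documentNumber starts at 1, whether the first batch holds 9 or 10 documents depends on whether 1 is subtracted before dividing. Here is a minimal plain JavaScript sketch of that arithmetic (not MongoDB code; `batchKey` and `countPerBatch` are hypothetical helper names, and 25 documents is just an example size):

```javascript
// Plain JavaScript sketch of the batching arithmetic; no MongoDB needed.
// batchKey and countPerBatch are hypothetical names used for illustration.
function batchKey(documentNumber, batchSize, subtractOne) {
  // documentNumber is 1-based, like the value produced by $documentNumber
  const n = subtractOne ? documentNumber - 1 : documentNumber;
  return Math.floor(n / batchSize);
}

function countPerBatch(totalDocs, batchSize, subtractOne) {
  const counts = {};
  for (let n = 1; n <= totalDocs; n++) {
    const key = batchKey(n, batchSize, subtractOne);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Without the subtraction, batch 0 gets only 9 documents (numbers 1..9).
console.log(countPerBatch(25, 10, false));
// With the subtraction, batch 0 gets a full 10 documents (numbers 1..10).
console.log(countPerBatch(25, 10, true));
```

In both cases the last batch simply holds whatever is left over when the number of matched documents is not a multiple of 10.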
There is no need for two $set stages, as each one makes a full pass over the documents; try to use as few stages as possible.
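If the goal is to store the shared number back on the original documents, as the $merge in the question did, the batches would need to be unwound and merged into the collection. Below is a rough sketch of one way to build such a pipeline. It is an assumption about what you want, not a tested solution: `buildBatchShufflePipeline` is a hypothetical helper, the `{ status: "active" }` filter is a placeholder, and it pushes only `_id` values instead of `$$ROOT` to keep the grouped documents small.

```javascript
// Hypothetical helper that builds the pipeline as a plain array.
// filter, batchSize and maxOrder are placeholders to adapt to your data.
function buildBatchShufflePipeline(filter, batchSize, maxOrder) {
  return [
    // 1. Restrict the shuffle to a portion of the collection.
    { $match: filter },
    // 2. Number the matched documents 1, 2, 3, ... (MongoDB 5.0+).
    {
      $setWindowFields: {
        sortBy: { _id: 1 },
        output: { documentNumber: { $documentNumber: {} } }
      }
    },
    // 3. Collect the _id values of each batch of batchSize documents.
    {
      $group: {
        _id: {
          $floor: {
            $divide: [{ $subtract: ["$documentNumber", 1] }, batchSize]
          }
        },
        ids: { $push: "$_id" }
      }
    },
    // 4. One random number per batch.
    { $set: { order: { $floor: { $multiply: [{ $rand: {} }, maxOrder] } } } },
    // 5. Back to one document per original _id, carrying the batch's number.
    { $unwind: "$ids" },
    { $project: { _id: "$ids", order: 1 } },
    // 6. Write only the "order" field onto the existing documents.
    {
      $merge: {
        into: "data",
        on: "_id",
        whenMatched: "merge",
        whenNotMatched: "discard"
      }
    }
  ];
}

const pipeline = buildBatchShufflePipeline({ status: "active" }, 10, 200000);
console.log(JSON.stringify(pipeline, null, 2));

// In mongosh this would be run as (assumed usage):
// db.data.aggregate(pipeline, { allowDiskUse: true })
```

Because two batches can still draw the same random number by chance, this gives "at least 10 documents per number" rather than exactly 10; with a range of 200000 that should be rare for small collections.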
Thanks
Pavel