Hello there!
My issue is quite complex and I’m new to mongodb. And I’m so close to the solution but one property still won’t display as intended. First i need to explain the setup. We have multiple collections:
Collections
- requests - Holiday requests from website visitors
- campaigns - A list of advert campaigns
- bookings - Holiday bookings
- flags - List of flags/tags which can be used for tagging documents of the other collections.
Examples document structures
Only the relevant props are shown.
// requests
{
_id: ObjectId("60be1a668cc56aca6c46bb24"),
dateRequest: '2021-06-07 13:08:54.675Z',
apikey: '7bcc49c49cb8afb4ba666a6ae2',
meta: {
campaign: '999999' // references to campaigns.id
}
}
// campaigns
{
_id: ObjectId("61af6485a1a6045e25baf1b1"),
id: '999999',
name: 'Brand International EU',
apikey: '7bcc49c49cb8afb4ba666a6ae2',
flag: ObjectId("616025c3e5bff7b3d6d89928") // reference to flags._id
}
// bookings
{
_id: ObjectId("61645257c03caa392af894e8"),
apikey: '7bcc49c49cb8afb4ba666a6ae2',
revenue: 742.34,
flags: [ // references to flags._id
ObjectId("61602615e5bff7b3d6d8992c"),
ObjectId("616025fee5bff7b3d6d8992b")
],
requests: [ // references to requests._id
ObjectId("60be1a668cc56aca6c46bb24"),
ObjectId("6160174c8fa5873c1e068f7a")
]
}
// flags
{
_id: ObjectId("61602615e5bff7b3d6d8992c"),
name: 'In House'
},
{
_id: ObjectId("616025fee5bff7b3d6d8992b"),
name: 'Dynamic'
}
Now the final output needs to be grouped by campaigns and needs to hold aggregated data for revenue, bookingscount, bookingflags, booking quota, etc. The goal is to have a table like this:
| campaign | requests | bookings | bookingquota | revenue | revenue/booking | revenue/request |
|------------------------|----------|----------|--------------|---------|-----------------|-----------------|
| null | 255 | 0 | 0 | 0 | 0 | 0 |
| Brand International EU | 350 | 50 | 0.14% | 12,650€ | 253€ | 0,03€ |
| 666666 (new) | 20 | 5 | 0.25% | 6,470€ | 1,294€ | 323.5% |
The following conditions need to be respected:
- Some requests may not include any campaign id, this requests should be grouped by
null
- Some requests may include a campaign id which does not exist in the campaign collection, therefore we group by the id instead of the object id (_id).
-
bookingFlags
can have duplicates because we would use this property to count how many times a booking flag is related to the campaign. -
earliestRequest
is the earliest dateRequest of requests inside the bookings. -
revenuetotal
is the sum of revenue values inside bookings. -
quota
is simply bookingcount divided by requestcount. -
revenueperbooking
andrevenueperrequest
are revenuetotal divided by each amount.
The desired array output should look something like this (I think):
[{
_id: null,
name: null,
requestcount: 255,
bookingcount: 0,
earliestrequest: '2021-04-13 10:16:54.675Z'
revenuetotal: 0,
quota: 0,
revenueperbooking: 0,
revenueperrequest: 0,
bookingFlags: []
},
{
_id: '666666',
name: null,
requestcount: 20,
bookingcount: 5,
earliestrequest: '2021-04-08 12:18:54.675Z'
revenuetotal: 6470,
quota: 0.25,
revenueperbooking: 1294,
revenueperrequest: 323.5,
bookingFlags: ['In House', 'Dynamic']
},
{
_id: '999999',
name: 'Brand International EU',
requestcount: 350,
bookingcount: 50,
earliestrequest: '2021-06-07 13:08:54.675Z'
revenuetotal: 12,650,
quota: 0.14,
revenueperbooking: 253,
revenueperrequest: 0.03,
bookingFlags: ['In House', 'Dynamic', 'In House', 'Wiederholer']
}]
And I’m trying to figure this out for 2 weeks now and I feel like I’m close but can’t seem to get the bookingFlags property to do what I want.
My current aggregation looks like this:
db.getCollection('requests').aggregate([
{
$lookup: {
from: 'bookings',
localField: '_id',
foreignField: 'requests',
as: 'bookings',
},
},
{
$lookup: {
from: 'campaigns',
localField: 'meta.campaign',
foreignField: 'id',
as: 'campaign',
},
},
{ $unwind: { path: '$campaign', preserveNullAndEmptyArrays: true } },
{ $unwind: { path: '$bookings', preserveNullAndEmptyArrays: true } },
{
$lookup: {
from: 'flags',
localField: 'bookings.flags',
foreignField: '_id',
as: 'bookings.flags',
},
},
{
$group: {
_id: '$_id',
bookings: {
$addToSet: {
$cond: {
if: { $not: '$bookings._id' },
then: '$$REMOVE',
else: '$bookings',
},
},
},
campaign: { $first: '$campaign' },
dateRequest: { $first: '$dateRequest' },
meta: { $first: '$meta' },
},
},
{
$group: {
_id: '$meta.campaign',
name: { $first: '$campaign.name' },
requestcount: { $sum: 1 },
earliestrequest: { $min: '$dateRequest' },
revenuetotal: { $sum: { $sum: '$bookings.revenue' } },
bookingcount: { $sum: { $size: '$bookings' } },
bookingFlags: { $push: '$bookings.flags' },
},
},
{
$project: {
_id: 1,
name: 1,
requestcount: 1,
earliestrequest: 1,
revenuetotal: 1,
bookingcount: 1,
quota: {
$cond: {
if: { $gte: ['$requestcount', 1] },
then: {
$divide: [{ $multiply: ['$bookingcount', 100] }, '$requestcount'],
},
else: 0,
},
},
revenueperbooking: {
$cond: {
if: { $gte: ['$bookingcount', 1] },
then: { $divide: ['$revenuetotal', '$bookingcount'] },
else: 0,
},
},
revenueperrequest: {
$cond: {
if: { $gte: ['$requestcount', 1] },
then: { $divide: ['$revenuetotal', '$requestcount'] },
else: 0,
},
},
bookingFlags: {
$reduce: {
input: '$bookingFlags',
initialValue: [],
in: { $concatArrays: ['$$value', '$$this'] },
},
},
},
},
]);
And the result looks like this:
[
{
"_id" : null,
"name" : null,
"requestcount" : 225.0,
"earliestrequest" : ISODate("2021-06-07T13:26:37.754Z"),
"revenuetotal" : 0,
"bookingcount" : 0,
"quota" : 0.0,
"revenueperbooking" : 0.0,
"revenueperrequest" : 0.0,
"bookingFlags" : []
},
{
_id: '99999999',
name: 'Brand International EU',
requestcount: 2870.0,
earliestrequest: ISODate('2021-01-11T23:00:00.000Z'),
revenuetotal: 16151.4,
bookingcount: 5,
quota: 0.174216027874564,
revenueperbooking: 3230.28,
revenueperrequest: 5.62766550522648,
bookingFlags: [
[
{
_id: ObjectId('616025fee5bff7b3d6d8992b'),
apikeys: [],
name: 'In House',
type: 'booking',
access: 'global',
__v: 0,
},
{
_id: ObjectId('61602615e5bff7b3d6d8992c'),
apikeys: [],
name: 'Wiederholer',
type: 'booking',
access: 'global',
__v: 0,
},
],
[
{
_id: ObjectId('616025f7e5bff7b3d6d8992a'),
apikeys: [],
name: 'Telefon',
type: 'booking',
access: 'global',
__v: 0,
},
{
_id: ObjectId('61602615e5bff7b3d6d8992c'),
apikeys: [],
name: 'Wiederholer',
type: 'booking',
access: 'global',
__v: 0,
},
],
[
{
_id: ObjectId('616025f7e5bff7b3d6d8992a'),
apikeys: [],
name: 'Telefon',
type: 'booking',
access: 'global',
__v: 0,
},
],
[
{
_id: ObjectId('616025fee5bff7b3d6d8992b'),
apikeys: [],
name: 'In House',
type: 'booking',
access: 'global',
__v: 0,
},
],
[
{
_id: ObjectId('616025fee5bff7b3d6d8992b'),
apikeys: [],
name: 'In House',
type: 'booking',
access: 'global',
__v: 0,
},
],
],
}
]
So the problem is, that bookingFlags is a nested array instead of a flat one with just the names of the flags. My aggregation pipeline is slow and surely not the best approach, but its the only thing i managed to come up with. I’m open for any suggestions, optimizations or other solutions. Honestly, if you’ve read until this part, you are a champ, and if you can help me solve this monster of an aggregation, I will buy you a beer.
I hope someone can help me before i flip my desk and burn down my office.
cheers
George