How to do a conditionnal sum based on a string in an array?

Hello,

I have these a database with entries looking like this:

{ "_id" : BinData(0,"gQAAACGewgAAAAAAvP+DHA=="), "statx" : { "atime" : { "sec" : NumberLong(1727861248), "nsec" : 519788713 }, "blksize" : 4096, "blocks" : NumberLong(0), "btime" : { "sec" : NumberLong(1727861248), "nsec" : 519788713 }, "ctime" : { "sec" : NumberLong(1727861267), "nsec" : 939788713 }, "dev" : { "major" : 252, "minor" : 1 }, "gid" : 0, "ino" : NumberLong(12754465), "mode" : 493, "mtime" : { "sec" : NumberLong(1727861267), "nsec" : 939788713 }, "nlink" : 2, "rdev" : { "major" : 0, "minor" : 0 }, "size" : NumberLong(33), "type" : 16384, "uid" : 0 }, "ns" : [ { "parent" : BinData(0,"gQAAAIEAQAAAAAAAVIeqAw=="), "name" : "bidule", "xattrs" : { "path" : "/bidule" } } ] }
{ "_id" : BinData(0,"gQAAAH2VHAEAAAAABleStw=="), "statx" : { "atime" : { "sec" : NumberLong(1727861260), "nsec" : 410788713 }, "blksize" : 4096, "blocks" : NumberLong(0), "btime" : { "sec" : NumberLong(1727861260), "nsec" : 410788713 }, "ctime" : { "sec" : NumberLong(1727861260), "nsec" : 410788713 }, "dev" : { "major" : 252, "minor" : 1 }, "gid" : 0, "ino" : NumberLong(18650493), "mode" : 420, "mtime" : { "sec" : NumberLong(1727861260), "nsec" : 410788713 }, "nlink" : 1, "rdev" : { "major" : 0, "minor" : 0 }, "size" : NumberLong(100), "type" : 32768, "uid" : 0 }, "ns" : [ { "parent" : BinData(0,"gQAAACGewgAAAAAAvP+DHA=="), "name" : "1", "xattrs" : { "path" : "/bidule/1" } } ] }
`

And I would like to get the sum of all `statx.size` for all entries with a `ns.xattrs.path` like `/bidule`. I managed to do it easily with a condition on `statx.ino` for instance, but I can't figure out a proper way to do it for a string in an array. More specifically,

1. I'm not exactly sure what's the proper way to checking equality between strings, as I've seen the use of `$eq`, `$strcasecmp` or simply `<key>: <value>`.
2. I don't know how to query an item in an array for aggregate requests.

I've tried these commands, but they are either malformed and I don't know the reason, or the result is invalid:
`
db.entries.aggregate([{$group: {_id: '', test: { $sum: {$filter: {input: "$ns", as: "link", cond: [{$eq: [ "$$link.name", "1"]}, '$statx.size', 0] }}}}}, { $project: { _id:0, somme: '$test'}}])
db.entries.aggregate({$group: {_id: '', test: { $sum: {$cond: [{$strcasecmp: ["$ns.name", "1"]}, '$statx.size', 0]} }}}, { $project: { _id:0, somme: '$test'}})
db.entries.aggregate({$group: {_id: '', test: { $sum: {$cond: [{'ns.name': '1'}, '$statx.size', 0]} }}}, { $project: { _id:0, somme: '$test'}}) })
`

Does anyone know how to do this ?

When you mentioned

and

it is not clear if you to match both /bidule and /bidule/1 because to me /bidule/1 is like /bidule but it is not equal.

So using the 2 documents you shared and the input string /bidule, would the answer be 33 because only /bidule is equal to /bidule, or would the answer be 133 because both /bidule and /bidule/1 are both like /bidule.

What if the input string is /bidule/1?

What if the input string is /bidul?

If would be nice if you could create a playground with your sample documents. The single line format is not the easiest to work with.

Here is the playground, sorry for the single line in the first post.

To be clear: I want to know the size of all entries in a directory, and I want to be able to do it either with exact string matching or with a “like” operator (/bi*/).
So for instance, if I want to get this information for all entries is /bidule/1, it should give me 100.
If I match it for /bidule*, /bid* or just /*, it should give me 133.

I want to know how to do this so I can apply it to other string criteria, like the name or other attributes I have.

Couple of questions/clarifications first:

1. If ns can have more than one entry but only one ns.xattrs.path matches /bidule (like /bidule and /hello), do you want to add the statx.size?

2. Wrt

Is that exact match? Prefix match? Anywhere-in-string match? And case-sensitive or insensitive?

Since it’s called “path” and everything looks file related, I’m going to presume Prefix-match and case Sensitive.

Here, first match on any element in ns array starting with /bidule, then $sum all the statx.size items, and project only the sum in the result.

db.entries.aggregate([
  {
    $match: { "ns.xattrs.path": { $regex: "^/bidule" } }
  },
  {
    $group: {
      _id: null,
      somme: { $sum: "$statx.size" }
    }
  },
  { $project: { _id: 0 } }
])

Demo on Mongo Playground

1 Like

For question 1: ns can have multiple entries matching a certain path. Since they represent a directory, the directory /bidule and all entries in it (which then have a path in /bidule/something) should match, so all their statx.size should be added.
For question 2: you’re right in your assumption, It’s prefix-match and case sensitive, but similar cases when case sensitiveness is not required may occur.

Your solution works perfectly, I was missing the match in my attempts, thanks for the help!

2 Likes

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.