Is the performance of $lookup still 130 times worse than Postgres?

Ben_Gibbons · October 5, 2022, 4:43pm

Hello, I have a scenario where $lookup would be preferential to embedding and came across this article in researching its efficacy. https://www.enterprisedb.com/blog/comparison-joins-mongodb-vs-postgresql It states that because postgres is build with iterative substitution. as well as merge join and hash join that it is 130 times faster at joining with damning timespans for queries to run. My question is simple, is this still the case given it being 2 years old? I am aware of some recent improvements with $lookup including the ability to have either collection sharded now so I suppose I am hoping the performance of the stage itself has been improved.

EDIT: Set Slot-Based Query Execution seems to promise some performance improvement for $lookup but does it make a dent in the 130 times?

MBee · September 29, 2023, 3:20am

Landed here with the exact same question.
Guess that the answer is, “Yes, still” then, from the fact of the lack of replies.

John_Sewell · September 29, 2023, 10:12am

From the mongo .local London event last week there has been a lot of improvement on the $lookup operator, like orders of magnitude performance boost.
I guess at the end of the day, try it with your workload, Mongo is not an relational database…if you need a relational database than choose one of them, if this type of operation is a key app path then you may need to model the documents differently.

I believe you may need to be running V7 to take advantage of all the slot processing changes, only some operations made use of it in V6

neil-lobo · April 26, 2024, 12:25am

The problem is that Mongo chooses not to implement the same algorithms that make JOINs fast as they are in other SQL databases for seemingly no good reason. If JOINs were so against their ideology then why provide $lookup in the first place? I don’t see a reason why Mongo couldn’t implement hash joins in their storage engine, it has nothing to do with the unstructured nature of document stores.

At the end of the data, relations are abstract ideas that can be modeled in pretty much every database paradigm. It’s then up to the DBMS to implement features that leverage these ideas, and with document stores being so close to conventional row-based storage engines, it’s a reasonable question to ask why they haven’t done so.

John_Sewell · April 27, 2024, 11:08pm

Fair point but different database engine have different priorities. I’m not a DB engine engineer or know the codebase of the server well enough to comment on the technical difficulties with implement various features.
I’m sure that in preference anyone would want their software to work the best in any scenario but that’s not always an option given constraints.

It’s better than it was and sure it’ll improve but going the other way around, other database engines don’t all allow query of JSON data or indexing of it in a way that Mongo does, so you could ask why they don’t support everything that Mongo does.

I use different technologies for different projects, and you just have to see what functionality the back end delivers and what you need from it and make a choice, be it Mongo, Oracle, MSSQL, Maria, Postgress or Lotus Notes…(just kidding, don’t use Notes…)

neil-lobo · May 1, 2024, 5:48am

Could you give an example of a query that Mongo (WiredTiger) can do that a DBMS that has native JSON type support can do? From what I understand, other DBMS’s can perform the same queries that Mongo can.

In the case of $lookup, my only gripe is that it feels half-baked. If they decided to not support relations altogether by not creating $lookup in the first place then I’d respect their decision to uphold their philosophy (I’m not sure if they actually align with/have aligned with the “no relations” philosophy or if it’s just a misconception of the public). Or, if they’ve stated (semi-)publically that $lookup is supposed to be an initial implementation then I’d also be ok with that.

neil-lobo · June 6, 2024, 2:33am

Since this thread has gone stale I’ll just leave a last remake.

The answer to my question is, no!

There is nothing intrinsically different in a document store compared to an RDBMS in terms of their primitive operations and architecture. Mongo doesn’t implement a typical joining algorithm because they simply don’t want to, for reasons only known to them, which is completely ok.

RDBMS’ with native JSON support can query data like Mongo and they can index JSON fields, there is no conflict in architecture.

I’m not sure where or how these misconceptions about document stores came about but they’re here.

Raz_Buchnik · December 4, 2024, 7:28pm

We have been in Mongo local Tel Aviv, they promised to call back to us with their sales and engineering team and they never called. That’s why you won’t probably never have robust and professional lookups/joins or what ever you name it in MongoDB.
Currently it’s slow, it’s lack of features and it seems that they are lazy enough to fix.