I’ve got an M0 instance performing a query that is sorted by an index and returns about 40 KB of data. I’ve checked the query plan: the index is used and the execution time is ~30 ms. Transferring 40 KB should also take ~30 ms on my internet connection, so I can’t understand why the time from making the call to receiving the data is 624 ms (average of 10 runs).
I then tried running the same query against a local MongoDB instance and the time improved to ~40 ms.
I migrated the collection to PostgreSQL and ran the same query locally on the same data with the same index; the time was ~2 ms. I then migrated to a minimum-spec PostgreSQL AWS RDS instance, where the execution time was 95 ms.
Does anyone know why MongoDB should be so much slower than PostgreSQL? Is it likely that I’m doing something wrong, or is this just the way it is?
The documents in the collection are of moderate size and contain large embeddings, but I’m using projections so that not all of the data is returned, and the same data is returned by both Postgres and Mongo.
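For reference, the shape of the query is roughly this (a pymongo sketch; the collection and field names here are placeholders, not my real schema):

```python
# Pymongo sketch of the projected, index-sorted query (placeholder names).
from pymongo import MongoClient

coll = MongoClient("mongodb://localhost:27017")["mydb"]["items"]

# Exclude the large embedding field so it is not sent in the response.
cursor = coll.find(
    {"category": "books"},      # filter supported by the index
    {"embedding": 0},           # projection: drop the big field
).sort("created_at", 1)         # sort satisfied by the index

docs = list(cursor)
```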
When using a projection to return selected fields, does MongoDB parse the whole document to find the selected fields, or does it have a map of the fields so it only has to parse those that have been selected?
I can’t really understand what else could be happening here.
A projection only affects what is sent in the response.
Documents are not parsed per se, since they are stored in BSON. A document is stored as a whole, so when it is read the whole document is read, and when it is updated the whole document is written back, because it is stored compressed.
You may avoid reading a document if it is already in cache or if the index covers the query. Having an index used by a query does not necessarily mean that documents are not fetched.
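You can verify this with explain(). Here is a sketch (pymongo, placeholder names, assuming a compound index on category and created_at): if totalDocsExamined is 0, the index covers the query and no documents are fetched at all.

```python
# Sketch: verify whether documents are fetched (placeholder names;
# assumes an index on {"category": 1, "created_at": 1}).
from pymongo import MongoClient

coll = MongoClient("mongodb://localhost:27017")["mydb"]["items"]

# Project only indexed fields (and exclude _id) so the index can
# cover the query.
plan = (
    coll.find(
        {"category": "books"},
        {"_id": 0, "category": 1, "created_at": 1},
    )
    .sort("created_at", 1)
    .explain()
)

stats = plan["executionStats"]
print("totalDocsExamined:", stats["totalDocsExamined"])  # 0 => covered query
print("totalKeysExamined:", stats["totalKeysExamined"])
print("executionTimeMillis:", stats["executionTimeMillis"])
```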
Note that M0 is a free, shared tier, so please do not use it to compare performance with anything else.
Some systems are better than others for some use-cases. SQL may or may not be the best for your use-case, and the same is true for MongoDB. One thing is sure: if you blindly implement the same table schema in MongoDB as you do in SQL, you will not realize the benefits of MongoDB.
Thanks for the comment. I understand that M0 performance is difficult to compare, which is why I also ran both locally, where PostgreSQL was 20x faster.
One of the main reasons I was looking at MongoDB was that I believed read times would be better than Postgres. Admittedly, I did just dump the data into Postgres un-normalised, and if I were joining tables that would slow things down, but I’m trying to understand the performance difference with an apples-to-apples comparison.
I don’t really want to migrate everything to Postgres at this point, so if I could get to the bottom of this before we start scaling up, that would be great. But I can’t see where the issue lies: it appears to be somewhere between execution, which is adequately fast, and my receiving the data, and I can’t see a way of profiling that in detail.
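For what it’s worth, the rough measurement I’m doing is comparing the driver’s wall-clock time with the server-reported execution time, something like this (pymongo sketch, placeholder names, same query as above):

```python
# Sketch: compare total round-trip time with server execution time
# (placeholder names; same query as above).
import time
from pymongo import MongoClient

coll = MongoClient("mongodb://localhost:27017")["mydb"]["items"]
flt, proj = {"category": "books"}, {"embedding": 0}

t0 = time.perf_counter()
docs = list(coll.find(flt, proj).sort("created_at", 1))
total_ms = (time.perf_counter() - t0) * 1000

plan = coll.find(flt, proj).sort("created_at", 1).explain()
server_ms = plan["executionStats"]["executionTimeMillis"]

# Anything beyond server_ms is network transfer plus driver
# (de)serialization, which the query plan cannot explain.
print(f"total {total_ms:.0f} ms, server {server_ms} ms")
```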
When I want the best performance, I do not do apples-to-apples comparisons. A system is a series of trade-offs, so one might be good for apples and the other for oranges. When you are really concerned about choosing the better system for your use-cases, the first step is to clearly specify those use-cases. Then see how best to implement them on the different platforms you are considering. Comparing the most simplistic implementations on different platforms is doing less than half of the real work. Modelling is probably more important in Mongo, so if you do not spend the effort to model correctly, you might discard Mongo for the wrong reasons. See the schema design antipatterns to help understand why the monolithic document is not the best approach.
I would be lying if I wrote that I do all of the above, because I do not care about SQL anymore. I did my share of it. I have more fun with Mongo, and my code is simpler.