A friend and I were discussing building a low-latency global service using Cloudflare Workers and MongoDB.
Let’s say we have an M30 cluster with global low-latency reads configured.
If we now follow the guidelines described in this blog post on setting up Realm, what will the architecture look like?
If a user in Argentina makes a GET request, the nearest Cloudflare data center will respond and call the Realm API. But where is the Realm API physically located - is it also deployed globally?
I’m the author of that blog post.
This doc is important for my answer.
MongoDB Realm apps are deployed either globally or locally. If they are deployed globally, your app is available in Ireland, Oregon, Sydney and Virginia.
So if a client is in New York, they will likely be routed to the app in Virginia. Then, depending on your configuration, the query will reach the primary node for a write, or the closest node if you are reading with readPreference=nearest (assuming you have a replica set; a sharded cluster would cover even more ground). And all of this without Cloudflare - only Realm auth + Realm Functions (which are equivalent to Cloudflare Workers).
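For reference, a minimal sketch of the readPreference part with the Node.js driver (the URI and credentials are placeholders):

const { MongoClient } = require("mongodb");

// readPreference=nearest sends reads to the lowest-latency replica set member;
// writes always go to the primary. The URI below is a placeholder.
const client = new MongoClient(
  "mongodb+srv://user:pass@cluster0.example.mongodb.net/?readPreference=nearest"
);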
Now the problem with my Cloudflare blog post is that we didn’t cover caching the MongoDB connections. That blog post was more a proof of concept than a production-ready code sample.
In MongoDB Realm, connections to the underlying cluster are cached. Each time a Realm function needs to connect to the MongoDB cluster, the same client connection is re-used. This avoids the handshake and the cost of creating and maintaining a connection to the replica set EACH TIME we make a query. When you run a query, you just want to run the query and access MongoDB, not initialise everything from scratch every time.
That’s a bit like what’s happening with my Cloudflare code. Maybe there is a way to cache the connection with Cloudflare, but I don’t know enough about Cloudflare to do so.
It’s the same thing for AWS Lambda: you have to cache the connection in a global variable so it is reused across invocations.
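A minimal sketch of that caching pattern, assuming the Node.js driver and a MONGODB_URI environment variable (the handler and all names are illustrative):

const { MongoClient } = require("mongodb");

let client = null; // module-scoped: survives across warm invocations

async function getClient() {
  // Only connect on the first (cold) invocation; reuse the pool afterwards.
  if (!client) {
    client = new MongoClient(process.env.MONGODB_URI);
    await client.connect();
  }
  return client;
}

exports.handler = async () => {
  const db = (await getClient()).db("test");
  const doc = await db.collection("movies").findOne({});
  return { statusCode: 200, body: JSON.stringify(doc) };
};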
Cloudflare is an extra layer between your client and your MongoDB cluster in Atlas that isn’t really necessary. It’s an extra hop in the network as well.
The best scenario would be a locally deployed Realm app in Dublin, Ireland and an Atlas cluster also deployed in Dublin. When you execute your Realm Function, it can access the cluster next door very fast, without routing the query around the world twice.
I don’t think that’s correct? From my perspective, the business rules, authentication, and authorization would live here. But that’s probably a separate discussion.
The way I see it, Cloudflare Workers are the future of serverless. They are more global and feature 0ms cold starts → a significant improvement over traditional serverless functions. Now the question is how to combine this with an equally distributed database. I believe this is exactly why Cloudflare announced their D1 SQL database in September 2022.
So my question could essentially be rephrased like this:
Is MongoDB+Realm a good fit for Cloudflare workers, or would it make more sense to use Cloudflare D1?
On a side note, could you share some insights as to why it’s not possible to use the Node.js MongoDB driver? Would it ever be possible, or is the V8 environment never going to be compatible?
I totally agree with that. What I meant to say is that MongoDB Realm (soon to be renamed Atlas App Services) is already capable of handling this serverless workload.
You can achieve the same result without Cloudflare entirely by replacing the Cloudflare Workers with Realm Functions (soon Atlas Functions). The difference is that Realm Functions have a built-in cache mechanism that handles the connection to the Atlas cluster for you.
With MongoDB Realm you can handle the Rules, Schemas, App Users & Auth, GraphQL API, Functions, Triggers, HTTPS Endpoints (=webhooks for REST API) and front-end hosting.
Or you could also use the Atlas Data API (== REST API), which can be activated in one click.
With serverless functions (from any provider) you JUST want to execute the actual code and strip out anything that would waste time: initialising a framework, initialising a connection to another server (like MongoDB…), starting a JVM, etc.
With the implementation we did in the Cloudflare blog post, it works. OK. But each call to this lambda/worker creates a brand-new connection to the MongoDB cluster (at least that’s my understanding), with its entire connection pool, etc. This is like a cold start, and from the cluster’s perspective it’s also like a DDoS attack, given that you are of course not executing this only 3 times a day. A MongoDB cluster can only sustain a certain number of connections (for an M30 it’s 3000), and opening and closing them costs the cluster memory, not counting the network TCP handshakes.
Realm Functions access the MongoDB cluster like this:
// Get the cached Atlas service client from the Realm function context.
const mongodb = context.services.get("mongodb-atlas");
// Grab a collection handle; no new connection handshake happens here.
const movies = mongodb.db("stitch").collection("movies");
This built-in context acts as a cache that keeps a pool of MongoDB connections available for the serverless functions to use when they need it. No need to repeat the handshake & auth each time I want to talk to MongoDB.
About Cloudflare D1, you just made me aware of its existence, so I have absolutely no idea what it’s worth. I just know it won’t scale the way MongoDB does (because it’s SQL).
I think Cloudflare Workers don’t fully support third-party libraries (NPM) (see Node.js compatibility · Cloudflare Workers docs), and I think the MongoDB Node.js driver isn’t supported. I would have used that for the proof of concept / blog post. But I had to use this weird workaround with the Realm Web SDK (not really proud), which is supposed to be used in a front-end (not a back-end function)… It was the only solution I had to get a connection to MongoDB.
@MaBeuLux88_xxx thanks for the excellent answer - possibly the best forum answer I’ve received in a long time. My key takeaways are:
MongoDB is not a great fit for Cloudflare Workers until the connection cache issue is solved.
Realm Functions are better suited as an alternative to Cloudflare Workers because of their built-in connection cache.
Now that you mention Realm Functions as an alternative to Cloudflare Workers, how do they compare?
1. What runtime do Realm Functions use - V8 or Node?
2. What about cold-start issues?
3. Is it possible to configure API keys programmatically, using a service like Doppler?
4. Is it possible to forward the logs to something like Logtail/Papertrail?
5. Can we use Realm Functions to answer with HTML responses?
6. Do you know of any efforts to make Realm Functions work with SvelteKit (or similar frameworks)?
Thaaaanks - I know it’s a lot of questions, but reading through the documentation did not give me a clear indication of whether Realm Functions are a direct alternative to Cloudflare/Vercel/Lambda.
I recently implemented a quick POC using Cloudflare Workers as the backend of a web app and had to connect it to a MongoDB Atlas cluster. Cloudflare Workers currently support HTTP and WebSockets but not plain TCP sockets. For this reason, as @MaBeuLux88_xxx points out, the MongoDB Node driver is not supported. That being said, Cloudflare seems to be working on this limitation. Some workarounds to connect to a MongoDB cluster from Cloudflare Workers include:
Using the Realm client SDK (as explained in the blog post; see the sketch below this list).
Using a database proxy (like Prisma).
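To illustrate the first workaround, here is a rough sketch using the realm-web package in a module Worker; the App ID, API key, and database/collection names are placeholders:

import * as Realm from "realm-web";

let app; // cached across requests while this Worker instance stays warm

export default {
  async fetch(request, env) {
    app = app || new Realm.App(env.REALM_APP_ID); // placeholder App ID
    const credentials = Realm.Credentials.apiKey(env.REALM_API_KEY);
    const user = await app.logIn(credentials);

    // Query Atlas through the Realm Web SDK (HTTPS under the hood).
    const movies = user.mongoClient("mongodb-atlas").db("stitch").collection("movies");
    const doc = await movies.findOne({});

    return new Response(JSON.stringify(doc), {
      headers: { "content-type": "application/json" },
    });
  },
};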
When using Realm, it seems that the blog implementation does not create a new connection with each request. This post suggests that Realm manages connections to Atlas automatically, depending on the requests made by client endpoints.
Hi folks – Tackling a few of the latest questions on this thread. Note, we have recently renamed MongoDB Realm (APIs, Triggers, and Sync) to ‘Atlas App Services’ to be clearer/more differentiated from the Realm SDKs.
1. Functions use a custom JS runtime that most closely matches Node. It supports some features that Cloudflare Workers don’t (e.g. TCP connections) but not all modern JS syntax.
2. Generally speaking, Functions do not have cold-start costs.
3. I’m not familiar with Doppler, but new API keys can be configured with the Admin API.
4. Yes, see our Log Forwarding documentation.
5/6. I believe you’re basically getting at “Are Functions able to fully support SSR applications” – Functions are not a good fit for this, but it’s something we’re considering investing more in.
Finally @Sergio_50904 – on your connection management question – App Services essentially open connection pools between our hosts and your cluster and dynamically create/release connections based on demand. Connections can also be shared across multiple requests, so you tend to open a more efficient number of connections at scale and pay the cost of opening a new connection less frequently. This is true for all App Services (Sync/SDKs, Data API, GraphQL, Triggers).
Thanks for your excellent answers - especially the honest answers on 5 and 6. For now, I think we will stay with more proven technology. Also - it seems like Atlas App Services is currently missing the possibility to run locally - something I think most developers would identify as a critical feature for development.
Thanks for the feedback Alex – We have designed our CLI to be interactive and make it easy to work alongside the cloud while developing locally or in CI/CD pipelines, but I do understand that some folks prefer a local/emulated environment for development. It is certainly another area that we’re considering!
Yeah, especially with MongoDB it makes sense to be able to run locally - since MongoDB is one of the few databases that will literally run everywhere.
In our code architecture, we love to run integration tests against MongoDB running locally in-memory. It’s fast, makes for reliable tests, costs nothing, and almost emulates the production environment (looking at full-text search here).
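As an illustration, a minimal sketch of such a setup, assuming the community mongodb-memory-server package (all names are illustrative):

const { MongoMemoryServer } = require("mongodb-memory-server");
const { MongoClient } = require("mongodb");

async function run() {
  const mongod = await MongoMemoryServer.create(); // spins up a throwaway local mongod
  const client = new MongoClient(mongod.getUri());
  await client.connect();

  const movies = client.db("test").collection("movies");
  await movies.insertOne({ title: "Local Test" });
  console.log(await movies.findOne({ title: "Local Test" }));

  await client.close();
  await mongod.stop();
}

run();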
PlanetScale today announced support for edge environments, using a Fetch API-compatible database driver.
This is exciting - because of the obvious question: could MongoDB do the same thing? Then we could finally have a solution for using MongoDB in combination with Cloudflare Workers/Vercel Edge/Netlify Functions.
Now that the Atlas Data API is released, you can use it in Cloudflare Workers to communicate with MongoDB. This is the best solution, as it’s a way to reach MongoDB over HTTPS without any dependency (driver) instead of direct TCP, and it solves the auth problem that we had with the SDK.
I think it’s also simpler to handle in the code, so that’s another point for the Data API.
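A minimal sketch of calling the Data API findOne action from a Worker (the App ID, API key, and cluster/database names are placeholders):

export default {
  async fetch(request, env) {
    // One HTTPS request per query - no driver, no TCP connection to manage.
    const resp = await fetch(
      `https://data.mongodb-api.com/app/${env.DATA_API_APP_ID}/endpoint/data/v1/action/findOne`,
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "api-key": env.DATA_API_KEY, // placeholder Data API key
        },
        body: JSON.stringify({
          dataSource: "Cluster0", // placeholder cluster name
          database: "stitch",
          collection: "movies",
          filter: {},
        }),
      }
    );
    const { document } = await resp.json();
    return new Response(JSON.stringify(document), {
      headers: { "content-type": "application/json" },
    });
  },
};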
I hope you enjoyed the leave - I have 2 girls myself, and it’s a lovely but busy time in the first years.
Do you have some numbers regarding the performance of the Atlas Data API compared to direct TCP? Because our application is pretty data-intensive and we server-side render things, it’s essential for data fetching to be very fast.
Hi Alex - exact numbers will always vary based on a multitude of things, including region, query, the amount of data you have, and cluster size.
Basic CRUD usage of a well-optimized configuration is likely to be in the very low 100ms range (100-200ms), while drivers will perform in the 5-15ms range. Note that this is not a guarantee, but a rough estimate.
This follows the architectural model that the Data API is solving for → it is meant to replace your API and microservice layer by behaving like that managed middleware, not to help you build a backend/middleware API service.
For AWS Lambda it’s a bit different, because you directly have a Node.js runtime with NPM support, so you can leverage the MongoDB driver directly - but you have to cache the connection so you don’t reconnect each time you run a Lambda.
So if your cluster is located in a single region and you can host your Lambda in that same AWS region, there are good chances that you will be faster than Cloudflare & the Atlas Data API.
No, MongoDB uses connection pools. So I think it’s more like a hundred connections per driver rather than one, but it probably depends on the driver we are talking about as well, as each has a different implementation (but they follow the same specs).
Yes, but it’s more like 100 connections per driver instance or so, and there are also internal connections for the replica set, etc.
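For reference, that pool size is tunable on the driver side; a small sketch with the Node.js driver options (the values shown are the defaults):

const { MongoClient } = require("mongodb");

const client = new MongoClient(process.env.MONGODB_URI, {
  maxPoolSize: 100, // cap on concurrent connections per client (default 100)
  minPoolSize: 0,   // idle connections may be closed and reopened on demand
});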
The main issue will be the latency between the server (back-end / Lambda) and the primary you are trying to reach, I guess. It also depends on the Internet speed, network quality, etc. The connection itself is pretty fast; the problem is the path, and in the worst-case scenario, only the very first Lambda invocation would be slower. Once it’s initialised, it’s gonna run forever.