Wait Queue is Full - Understand Connections

We are having an issue in production (replica set, version 4.2.1). We track email clicks and opens. When a large email blast is sent, our website slows down and users are unable to log in.

We get the following error:
The wait queue for acquiring a connection to server x is full.

Here are some server stats when this happens:

db.serverStatus().connections
{
        "current" : 218,
        "available" : 50982,
        "totalCreated" : 208189,
        "active" : 186
}
db.runCommand( { "connPoolStats" : 1 } )
{
        "numClientConnections" : 0,
        "numAScopedConnections" : 0,
        "totalInUse" : 0,
        "totalAvailable" : 1,
        "totalCreated" : 59272458,
        "totalRefreshing" : 0,
      ...

There is nothing that we have configured for connections in the config file. We must have the default settings. We use the C# driver and we can see the MaxConnectionPoolSize is 100 and the WaitQueueSize is 500.
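To picture how those two settings interact, here is a toy model in JavaScript (the language of the shell snippets above). It is only a sketch of the general idea of a bounded pool with a bounded wait queue, not the C# driver's actual implementation; the class name `ToyPool` and its methods are made up for illustration. In this model, with a pool size of 100 and a wait queue size of 500, the 601st concurrent request to one server would fail with the "wait queue is full" error.

```javascript
// Toy model of a driver-side connection pool with a bounded wait queue.
// Hypothetical names; this is NOT the C# driver's real code.
class ToyPool {
  constructor(maxPoolSize, waitQueueSize) {
    this.maxPoolSize = maxPoolSize;
    this.waitQueueSize = waitQueueSize;
    this.inUse = 0;    // connections currently checked out
    this.waiters = []; // resolve callbacks of callers waiting for a connection
  }

  // Returns a promise that resolves once a connection is checked out.
  // Throws synchronously when the wait queue is already full.
  acquire() {
    if (this.inUse < this.maxPoolSize) {
      this.inUse += 1;
      return Promise.resolve();
    }
    if (this.waiters.length >= this.waitQueueSize) {
      // This corresponds to the "wait queue ... is full" error.
      throw new Error("wait queue is full");
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  // Hands the connection to the oldest waiter, or returns it to the pool.
  release() {
    const next = this.waiters.shift();
    if (next) {
      next(); // connection passes straight to the waiter; inUse is unchanged
    } else {
      this.inUse -= 1;
    }
  }
}
```

The point of the model is that the error is raised on the client side, by the pool, once both the pool and its queue are saturated. Slow operations that hold connections longer make this more likely than the raw request rate alone would suggest.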

Here are my questions:

  • Is there any way to see the queue size?
  • I have 100 max connections, but the server is showing 218 current connections (186 of them active). How is this possible?
  • It seems the solution would be to simply increase the pool size, but to what number? I was thinking of setting it to 1000.
  • Can there really be over 50K connections available by default?

Thanks

This is a recurring problem that we have not yet found an answer to.
Any help would be appreciated.

Thanks

This is relentlessly happening to one of our apps too. There’s also seemingly no pattern to it: suddenly nothing can connect, the wait queue builds up, and then it starts throwing wait queue exceptions and can’t seem to recover by itself.

I am having the same issue on my app too. Did anyone find a workaround for it?

Similar problem here, did you have any success with troubleshooting it?

I am having the same issue on my app too. Did you have any success with troubleshooting it?

One thing we noticed in our logs was that the C# driver defaults to SCRAM-SHA-256 authentication, which was slowing down the connection process. Appending &authMechanism=SCRAM-SHA-1 to our connection string helped.
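As a rough illustration of that change, here is a small JavaScript sketch that appends the option to an existing connection string. The helper name `withAuthMechanism`, the hosts, and the credentials are placeholders, and it assumes the URI already has the "/" that must come before any options.

```javascript
// Hypothetical helper: adds authMechanism to a MongoDB connection string.
// Assumes the URI already contains the "/" that precedes the options part.
function withAuthMechanism(uri, mechanism) {
  // Use "?" when the URI has no options yet, "&" otherwise.
  const separator = uri.includes("?") ? "&" : "?";
  return uri + separator + "authMechanism=" + encodeURIComponent(mechanism);
}

// Example with placeholder hosts and credentials.
const uri = withAuthMechanism(
  "mongodb://user:pass@host1:27017,host2:27017/mydb?replicaSet=rs0",
  "SCRAM-SHA-1"
);
console.log(uri);
// mongodb://user:pass@host1:27017,host2:27017/mydb?replicaSet=rs0&authMechanism=SCRAM-SHA-1
```

Whether this helps will depend on whether authentication is really the slow step when new connections are opened, so it is worth measuring before and after.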

You can check whether you are really using only 100 connections by inspecting

db.currentOp(true).inprog

Each entry has a client field that contains the IP address (and port) of the connecting client.
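A minimal sketch of how the entries could be grouped by client IP, written as plain JavaScript. The function name `countByClientIp` and the sample data are made up; the only thing assumed about the real output is the `client` field of the form "ip:port" mentioned above.

```javascript
// Counts entries per client IP from currentOp-style data.
// Each entry is assumed to have an optional `client` field like "ip:port".
function countByClientIp(inprog) {
  const counts = {};
  for (const op of inprog) {
    if (!op.client) continue; // some internal operations have no client field
    const i = op.client.lastIndexOf(":");
    const ip = i === -1 ? op.client : op.client.slice(0, i); // strip the port
    counts[ip] = (counts[ip] || 0) + 1;
  }
  return counts;
}

// Made-up sample data; against a real server you would pass
// db.currentOp(true).inprog instead.
const sample = [
  { client: "10.0.0.5:51234" },
  { client: "10.0.0.5:51240" },
  { client: "10.0.0.9:40022" },
  { desc: "internal task without a client field" },
];
console.log(countByClientIp(sample));
```

If several application servers, background workers, or other replica set members show up in the counts, that would explain a server-side total above 100, since the pool limit applies to each client process separately.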