I have a Go program with a thousand goroutines that each run once per hour, reading and writing a small amount of data in MongoDB.
I noticed the goroutine count was increasing by about 150 per day, and the MongoDB connection count was also steadily increasing over time.
Using the pprof tools I got a summary of the goroutines:
http://<host>/debug/pprof/goroutine?debug=1
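For anyone reproducing this: the goroutine summary endpoint comes from net/http/pprof; a rough sketch of how it gets exposed (the listen address here is just a placeholder):

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func startPprofServer() {
    go func() {
        // Serve the default mux so the pprof handlers are reachable over HTTP.
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()
}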
There are six groups of goroutines that keep increasing in number, all related to the MongoDB driver:
36 @ 0x43b1d6 0x4685dd 0x4685bd 0x473a4c 0xb2814b 0xb27ddb 0x46c4a1
# 0x4685bc sync.runtime_notifyListWait+0x11c /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/runtime/sema.go:513
# 0x473a4b sync.(*Cond).Wait+0x8b /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/sync/cond.go:56
# 0xb2814a go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).createConnections.func2+0xea /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:616
# 0xb27dda go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).createConnections+0x13a /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:643
18 @ 0x43b1d6 0x433b93 0x466c09 0x4d0772 0x4d1ada 0x4d1ac8 0x52b2a9 0x53c085 0x61833d 0x4f78d8 0x618525 0x615b92 0x61b42f 0x61b430 0x49819a 0xb1cfe5 0xb1cfb0 0xb1c9a7 0xb1d9a8 0xa9357e 0xa99ce5 0xac2785 0xb2ec51 0xb2d6e5 0x46c4a1
# 0x466c08 internal/poll.runtime_pollWait+0x88 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/runtime/netpoll.go:234
# 0x4d0771 internal/poll.(*pollDesc).wait+0x31 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/internal/poll/fd_poll_runtime.go:84
# 0x4d1ad9 internal/poll.(*pollDesc).waitRead+0x259 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/internal/poll/fd_poll_runtime.go:89
# 0x4d1ac7 internal/poll.(*FD).Read+0x247 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/internal/poll/fd_unix.go:167
# 0x52b2a8 net.(*netFD).Read+0x28 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/net/fd_posix.go:56
# 0x53c084 net.(*conn).Read+0x44 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/net/net.go:183
# 0x61833c crypto/tls.(*atLeastReader).Read+0x3c /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/crypto/tls/conn.go:777
# 0x4f78d7 bytes.(*Buffer).ReadFrom+0x97 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/bytes/buffer.go:204
# 0x618524 crypto/tls.(*Conn).readFromUntil+0xe4 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/crypto/tls/conn.go:799
# 0x615b91 crypto/tls.(*Conn).readRecordOrCCS+0x111 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/crypto/tls/conn.go:606
# 0x61b42e crypto/tls.(*Conn).readRecord+0x16e /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/crypto/tls/conn.go:574
# 0x61b42f crypto/tls.(*Conn).Read+0x16f /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/crypto/tls/conn.go:1277
# 0x498199 io.ReadAtLeast+0x99 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/io/io.go:328
# 0xb1cfe4 io.ReadFull+0x224 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/io/io.go:347
# 0xb1cfaf go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*connection).read+0x1ef /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/connection.go:456
# 0xb1c9a6 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*connection).readWireMessage+0x206 /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/connection.go:417
# 0xb1d9a7 go.mongodb.org/mongo-driver/x/mongo/driver/topology.initConnection.ReadWireMessage+0x27 /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/connection.go:584
# 0xa9357d go.mongodb.org/mongo-driver/x/mongo/driver.Operation.readWireMessage+0x7d /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/operation.go:693
# 0xa99ce4 go.mongodb.org/mongo-driver/x/mongo/driver.Operation.ExecuteExhaust+0xe4 /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/operation_exhaust.go:22
# 0xac2784 go.mongodb.org/mongo-driver/x/mongo/driver/operation.(*Hello).StreamResponse+0x184 /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/operation/hello.go:200
# 0xb2ec50 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).check+0x4d0 /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/server.go:709
# 0xb2d6e4 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).update+0x344 /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/server.go:522
18 @ 0x43b1d6 0x44af72 0xa372fe 0x46c4a1
# 0xa372fd go.mongodb.org/mongo-driver/internal.(*CancellationListener).Listen+0x7d /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/internal/cancellation_listener.go:30
18 @ 0x43b1d6 0x44af72 0xb288a6 0x46c4a1
# 0xb288a5 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).maintain+0x285 /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:750
18 @ 0x43b1d6 0x44af72 0xb2a5f5 0x46c4a1
# 0xb2a5f4 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*rttMonitor).start+0x174 /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/rtt_monitor.go:95
6 @ 0x43b1d6 0x40808c 0x407ab8 0xb34950 0x46c4a1
# 0xb3494f go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Topology).pollSRVRecords+0x60f /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/topology.go:555
I tried the mongo-driver versions 1.8.3 and 1.8.4 and they both have this behavior.
Given that all these stacktraces are for the driver packages, I’m not sure how to debug it.
Could this indicate a bug in the connection pooling?
Jack_Woehr:
Maybe it’s your code, @Jim_Slattery?
I’m not an expert in the Go driver (though I have used it), but I’m looking at the connection example and I see that it won’t disconnect until main() exits. Do your main() routines exit and leave hung connections, or are the connections that were opened actually hanging around programmatically?
Matt_Dale (Matt Dale), April 4, 2022, 4:51pm
Most of the code in the stack traces you posted is expected to run one or more goroutines per connected server, so having many of those running is expected (depending on how many servers are in your MongoDB deployment). However, having 6 goroutines for pollSRVRecords seems unusual:
6 @ 0x43b1d6 0x40808c 0x407ab8 0xb34950 0x46c4a1
# 0xb3494f go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Topology).pollSRVRecords+0x60f /actions-runner-1467fb14/_work/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/topology.go:555
Is it possible that your code is creating or connecting new clients per run and/or not disconnecting old clients?
Jim_Slattery:
OK, our MongoDB deployment only has 3 servers, so those goroutine counts do seem high.
Our code only connects to MongoDB once during startup:
client, err := mongo.Connect(ctx, clientOpts)
...
db := client.Database(db)
And then all of the goroutines reuse that db to make their requests, like:

db.Collection(c).FindOne(ctx, criteria, ...)
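For completeness, here is a minimal, self-contained sketch of that connect-once pattern (the URI, database, and collection names are placeholders, not our real configuration):

package main

import (
    "context"
    "log"
    "time"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    // Connect exactly once at startup; the returned client owns the connection
    // pool and the background monitoring goroutines.
    client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb+srv://example.invalid/"))
    if err != nil {
        log.Fatal(err)
    }
    // Disconnecting on shutdown releases the pool and stops those goroutines.
    defer client.Disconnect(context.Background())

    db := client.Database("exampledb")

    // Every worker goroutine reuses the same db (and therefore the same client).
    var result bson.M
    if err := db.Collection("examplecoll").FindOne(ctx, bson.M{"_id": 1}).Decode(&result); err != nil {
        log.Println(err)
    }
}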
So I’m not sure whether there’s something we’re doing wrong, or whether it could be identified with any debugging tools.
BTW, we weren’t explicitly using pool size settings, but I did notice that if I set them in this conflicting way, &minPoolSize=100&maxPoolSize=1, the goroutine count exploded (the equivalent client-option calls are sketched just after this dump):
594 @ 0x43b1d6 0x40808c 0x407ab8 0xb28af0 0x46c4a1
# 0xb28aef go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).maintain.func4+0x2f /actions-runner-2de7bb32/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:741
12 @ 0x43b1d6 0x4685dd 0x4685bd 0x473a4c 0xb2814b 0xb27ddb 0x46c4a1
# 0x4685bc sync.runtime_notifyListWait+0x11c /actions-runner-2de7bb32/_work/_tool/go/1.17.8/x64/src/runtime/sema.go:513
# 0x473a4b sync.(*Cond).Wait+0x8b /actions-runner-2de7bb32/_work/_tool/go/1.17.8/x64/src/sync/cond.go:56
# 0xb2814a go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).createConnections.func2+0xea /actions-runner-2de7bb32/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:616
# 0xb27dda go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).createConnections+0x13a /actions-runner-2de7bb32/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:643
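For reference, the same limits can also be set in code rather than in the URI; a quick sketch (not our actual configuration), using the driver’s options package:

import (
    "context"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func connectWithPoolLimits(ctx context.Context) (*mongo.Client, error) {
    clientOpts := options.Client().
        ApplyURI("mongodb+srv://example.invalid/"). // placeholder URI
        SetMinPoolSize(100).                        // conflicts with the max below,
        SetMaxPoolSize(1)                           // mirroring the URI example above
    return mongo.Connect(ctx, clientOpts)
}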
Matt_Dale (Matt Dale), April 4, 2022, 11:55pm
Agreed. It seems like each of the 6 topology instances is connecting to all 3 servers, resulting in 18 goroutines each for rtt_monitor.go, pool.go, and cancellation_listener.go. For 3 servers, there should only be 3 goroutines for each of those functions, not 18.
I don’t see any immediately obvious problems in those code snippets. You’re only connecting a Client once at startup, which seems correct. Using db.Collection(c).FindOne(...) also seems correct because all Database and Collection instances should use the same Client resources.
I noticed the directories in your stack traces frequently start with /actions-runner; are you running these tasks with GitHub Actions or some other system with “variable persistence” (i.e. one that may create a new runtime or reuse an existing one), like AWS Lambda? I’m not immediately familiar with GitHub Actions or other “variable persistence” environments, but I wonder if it’s possible that some Go driver resources are unexpectedly re-initialized sometimes.
If you dump a goroutine summary using a pre-v1.8.0 version of the Go driver, do you also notice multiple goroutines for topology.(*Topology).pollSRVRecords, or only one?
Concerning this from Jim_Slattery:
“BTW, we weren’t explicitly using pool size settings, but I did notice if I set them in this conflicting way, &minPoolSize=100&maxPoolSize=1, the goroutine count exploded.”
That actually sounds like a bug. I’ve filed GODRIVER-2366 to investigate and fix it.
Jim_Slattery:
Thanks Matt, I’ll have to try out version 1.7.5 of the driver for comparison.
Regarding /actions-runner in the stack traces, you are correct that GitHub Actions was involved, but that was only to compile the binary. That binary is later deployed on a standard, long-running server.
BTW, I had tried setting maxPoolSize=5, but it didn’t seem to help; the goroutine count has still been increasing by about 150 per day.
Mongo goroutine summary:
84 @ 0x43b1d6 0x4685dd 0x4685bd 0x473a4c 0xb2814b 0xb27ddb 0x46c4a1
# 0x4685bc sync.runtime_notifyListWait+0x11c /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/runtime/sema.go:513
# 0x473a4b sync.(*Cond).Wait+0x8b /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/sync/cond.go:56
# 0xb2814a go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).createConnections.func2+0xea /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:616
# 0xb27dda go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).createConnections+0x13a /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:643
42 @ 0x43b1d6 0x433b93 0x466c09 0x4d0772 0x4d1ada 0x4d1ac8 0x52b2a9 0x53c085 0x61833d 0x4f78d8 0x618525 0x615b92 0x61b42f 0x61b430 0x49819a 0xb1cfe5 0xb1cfb0 0xb1c9a7 0xb1d9a8 0xa9357e 0xa99ce5 0xac2785 0xb2ec51 0xb2d6e5 0x46c4a1
# 0x466c08 internal/poll.runtime_pollWait+0x88 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/runtime/netpoll.go:234
# ...
# 0xb1cfe4 io.ReadFull+0x224 /actions-runner-1467fb14/_work/_tool/go/1.17.8/x64/src/io/io.go:347
# 0xb1cfaf go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*connection).read+0x1ef /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/connection.go:456
# 0xb1c9a6 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*connection).readWireMessage+0x206 /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/connection.go:417
# 0xb1d9a7 go.mongodb.org/mongo-driver/x/mongo/driver/topology.initConnection.ReadWireMessage+0x27 /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/connection.go:584
# 0xa9357d go.mongodb.org/mongo-driver/x/mongo/driver.Operation.readWireMessage+0x7d /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/operation.go:693
# 0xa99ce4 go.mongodb.org/mongo-driver/x/mongo/driver.Operation.ExecuteExhaust+0xe4 /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/operation_exhaust.go:22
# 0xac2784 go.mongodb.org/mongo-driver/x/mongo/driver/operation.(*Hello).StreamResponse+0x184 /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/operation/hello.go:200
# 0xb2ec50 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).check+0x4d0 /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/server.go:709
# 0xb2d6e4 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).update+0x344 /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/server.go:522
42 @ 0x43b1d6 0x44af72 0xa372fe 0x46c4a1
# 0xa372fd go.mongodb.org/mongo-driver/internal.(*CancellationListener).Listen+0x7d /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/internal/cancellation_listener.go:30
42 @ 0x43b1d6 0x44af72 0xb288a6 0x46c4a1
# 0xb288a5 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).maintain+0x285 /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/pool.go:750
42 @ 0x43b1d6 0x44af72 0xb2a5f5 0x46c4a1
# 0xb2a5f4 go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*rttMonitor).start+0x174 /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/rtt_monitor.go:95
14 @ 0x43b1d6 0x40808c 0x407ab8 0xb34950 0x46c4a1
# 0xb3494f go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Topology).pollSRVRecords+0x60f /actions-runner-1467fb14/_work/optima-recommendations/optima-recommendations/go/pkg/mod/go.mongodb.org/mongo-driver@v1.8.4/x/mongo/driver/topology/topology.go:555
I’m embarrassed to say I found another component in our codebase that was creating Mongo connections.
It was doing a health check of Mongo by periodically calling mongo.Connect, doing a ping, and then returning. Maybe it should have called client.Disconnect before returning, although the ctx it had used for the connection did get cancelled.
I just rewrote that health check to only call mongo.Connect once at startup and reuse that client for the life of the process.
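The rewritten check looks roughly like this (a sketch; healthClient and the timeout are stand-ins for our actual wiring):

import (
    "context"
    "time"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/readpref"
)

// healthClient is connected once at startup via mongo.Connect and shared for
// the life of the process.
var healthClient *mongo.Client

// checkMongoHealth pings the deployment with the long-lived client instead of
// creating a new client (and its background goroutines) on every run.
func checkMongoHealth(ctx context.Context) error {
    ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
    defer cancel()
    return healthClient.Ping(ctx, readpref.Primary())
}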
So far, it’s been 14 hours and I haven’t seen any increase in goroutines!
Thanks @Matt_Dale and @Jack_Woehr for your help!
Maybe there is some Go debugging tool that could have traced these goroutines back to the code that spawned them; that would have directly identified the source of this problem.
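(One idea I ran across afterwards, though we didn’t try it: runtime/pprof goroutine labels are inherited by goroutines spawned from labeled code and show up in the goroutine profile, so wrapping each component’s Mongo setup in a labeled region might have pointed straight at the health check. A sketch, with a made-up helper name:)

import (
    "context"
    "runtime/pprof"
)

// connectWithLabel runs setup with a pprof label attached; goroutines the
// driver starts inside setup (monitors, pool maintainers, ...) inherit the
// label, so the goroutine dump shows which component spawned them.
func connectWithLabel(ctx context.Context, component string, setup func(ctx context.Context)) {
    pprof.Do(ctx, pprof.Labels("component", component), setup)
}

For example, the health check’s setup could have been wrapped as connectWithLabel(ctx, "healthcheck", func(ctx context.Context) { /* mongo.Connect here */ }).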
Matt_Dale (Matt Dale), April 7, 2022, 4:18pm
@Jim_Slattery glad to hear you found the source of the extra goroutines! You’re correct that you have to call client.Disconnect to close all of the resources created by mongo.Connect (including the background goroutines).
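To illustrate the pattern for a genuinely short-lived use (a minimal sketch, not code from this thread; uri is a placeholder):

import (
    "context"
    "time"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
    "go.mongodb.org/mongo-driver/mongo/readpref"
)

func pingOnce(uri string) error {
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    client, err := mongo.Connect(ctx, options.Client().ApplyURI(uri))
    if err != nil {
        return err
    }
    // Disconnect tears down the connection pool and stops the background
    // monitoring goroutines; cancelling the Connect context alone does not.
    defer client.Disconnect(context.Background())

    return client.Ping(ctx, readpref.Primary())
}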