Issue with GridFS inserts when one of the shards is read-only

We have two shards, rs1 and rs2, at two locations, NYC and SFO. We have a GridFS collection that is sharded by location, so all data from NYC goes to rs1 and all data from SFO goes to rs2. When the rs2 shard becomes read-only because a majority of its servers are down, the GridFS put call fails with the following error message, even though the connection string has readPreference=secondary:

Encountered non-retryable error during query :: caused by :: Could not find host matching read preference { mode: "primary" } for set rs2, full error: {'ok': 0.0, 'errmsg': 'Encountered non-retryable error during query :: caused by :: Could not find host matching read preference { mode: "primary" } for set rs2', 'code': 133

The data being inserted is going to rs1, the shard whose primary node is up and running. (I checked the chunk distribution for both the fs.files and fs.chunks collections, and I also inserted the document after bringing the rs2 shard back up and checked where the document went.) But I don't know why it is complaining.

We are using MongoDB version 4.0. I tried it with both the C# driver and pymongo, so it does not appear to be a driver-specific issue. It seems like the put implementation has readPreference set to primary.

I tested other collections, and they do allow writes when the document goes to rs1 while rs2 is read-only.

I am also planning to move to a higher version to see whether this has been fixed in later versions. If anyone has already used GridFS with sharding and inserts work when one shard is read-only, please let me know.

This is a limitation in the current GridFS spec outlined here.

Instead of using the primary read preference, the find() that checks whether the files collection is empty should use the collection's read preference (secondary in your case). The spec currently says:

Before write operations

Immediately before the first write operation on an instance of a GridFSBucket class is attempted (and not earlier), drivers MUST:

  • determine if the files collection is empty using the primary read preference mode.
  • and if so, create the indexes described above if they do not already exist

To determine whether the files collection is empty drivers SHOULD execute the equivalent of the following shell command:

db.fs.files.findOne({}, { _id : 1 })

If no document is returned the files collection is empty.
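To make the quoted step concrete, here is a minimal Python sketch of the pre-write check. It is not the actual driver code: `FakeCollection` and `ensure_indexes_before_first_write` are hypothetical names, and the in-memory stand-in only imitates a read that requires a primary so the example runs without a MongoDB deployment. The index keys are the ones the GridFS spec defines for the files and chunks collections. Presumably the emptiness check has no shard key in its filter, so it can reach rs2 even when the document being written would land on rs1.

```python
# Sketch of the GridFS "before write operations" check, written against
# duck-typed collection objects so it runs without a MongoDB deployment.


class FakeCollection:
    """Hypothetical in-memory stand-in for a driver collection (illustration only)."""

    def __init__(self, docs=None, primary_available=True):
        self.docs = list(docs or [])
        self.primary_available = primary_available
        self.indexes = []

    def find_one(self, filter=None, projection=None):
        # Imitates a read routed with the primary read preference: it fails
        # when no primary is reachable, as in the error reported above.
        if not self.primary_available:
            raise RuntimeError(
                'Could not find host matching read preference { mode: "primary" }'
            )
        return self.docs[0] if self.docs else None

    def create_index(self, keys, unique=False):
        # Record the requested index; a real server would build it.
        self.indexes.append((tuple(keys), unique))


def ensure_indexes_before_first_write(files, chunks):
    """Check whether the files collection is empty and, if so, create the indexes."""
    # Equivalent of: db.fs.files.findOne({}, { _id : 1 })
    if files.find_one({}, {"_id": 1}) is None:
        files.create_index([("filename", 1), ("uploadDate", 1)])
        chunks.create_index([("files_id", 1), ("n", 1)], unique=True)
```

Calling `ensure_indexes_before_first_write(FakeCollection(primary_available=False), FakeCollection())` raises before any write is attempted, which mirrors how the put call in this thread fails on the read rather than on the insert itself.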

I opened this ticket to track a fix for this problem: https://jira.mongodb.org/browse/DRIVERS-3003

Thank you for reporting it.

Thank you for the quick response.

Do you know what the general timeline is for these kinds of tickets?