I have two sharded collections in the same database, say collection_old and collection_new. Both collections use the same shard key, and each contains ~20 million documents. I want to migrate all the documents from collection_old to collection_new and, once the migration has succeeded, drop collection_old.
Since the collections are quite large, I am unsure whether the command below will cause performance issues. Also, if the insertion fails for some documents, how can I get the ids of those documents so that I can fix the errors and retry later?
Could you please answer the questions below so that I can better understand this migration?
Which MongoDB version?
Is collection_old still receiving inserts & updates?
Could you share the output of db.collection.stats().avgObjSize and db.collection.stats().size?
What sort of performance issues or other general issues are you concerned about?
How similar are collection_old & collection_new? Do they have the same shard key, indexes, document structure, etc.?
I think it would be better to check for _id collisions between the old and new collections and fix them beforehand, rather than trying to fix them after the fact. If the collections are very similar and colliding _id values can be avoided, mongodump & mongorestore is perhaps the fastest way to achieve this, since mongorestore lets you specify the number of insertion workers (the --numInsertionWorkersPerCollection option).
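To make the collision check and the "which ids failed" part concrete, here is a minimal sketch in Python. It assumes the pymongo driver and two collection handles you have already opened; the function names (batched, find_id_collisions, failed_ids, copy_collection) and the batch size are made up for illustration, not taken from your setup. The idea is to look up the _id values of collection_old in collection_new in batches, and, if you copy with unordered bulk inserts, to map each reported write error back to the document that caused it.

```python
from typing import Any, Iterable, Iterator, List


def batched(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield lists of at most `size` items from `items`."""
    batch: List[Any] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def find_id_collisions(old_coll, new_coll, batch_size: int = 1000) -> List[Any]:
    """Return the _id values that exist in both collections."""
    collisions: List[Any] = []
    # Project only _id so the scan of the old collection stays light.
    old_ids = (doc["_id"] for doc in old_coll.find({}, {"_id": 1}))
    for batch in batched(old_ids, batch_size):
        hits = new_coll.find({"_id": {"$in": batch}}, {"_id": 1})
        collisions.extend(doc["_id"] for doc in hits)
    return collisions


def failed_ids(batch: List[dict], details: dict) -> List[Any]:
    """Map the writeErrors of a failed bulk insert back to document _ids.

    Each write error carries an "index" into the list that was passed
    to insert_many, which identifies the offending document.
    """
    return [batch[err["index"]]["_id"] for err in details.get("writeErrors", [])]


def copy_collection(old_coll, new_coll, batch_size: int = 1000) -> List[Any]:
    """Copy documents in batches and return the _ids that failed to insert."""
    # Imported here so the helpers above can be used without the driver.
    from pymongo.errors import BulkWriteError

    failures: List[Any] = []
    for batch in batched(old_coll.find({}), batch_size):
        try:
            # ordered=False lets the server attempt every document in the
            # batch instead of stopping at the first error.
            new_coll.insert_many(batch, ordered=False)
        except BulkWriteError as exc:
            failures.extend(failed_ids(batch, exc.details))
    return failures
```

With real handles this would be called roughly as find_id_collisions(db.collection_old, db.collection_new) before the copy, and copy_collection(db.collection_old, db.collection_new) for the copy itself. Duplicate-key failures show up in the write errors with code 11000. For 20 million documents a single-threaded loop like this will likely be slower than mongodump & mongorestore, so treat it as a way to see which documents fail rather than as the fastest migration path.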