This has been bothering me for weeks now.
I have a Python script that periodically queries time-series data from a crypto API and inserts it into my MongoDB collection.
Every now and then, seemingly at random, the collection.insert_many call fails with the error message shown below. I formatted the JSON so it's easier to read.
The _id field is not generated by me; it's supposed to be created and managed by MongoDB. So how can I prevent a collision on a field I don't even provide?
This error says that the unique index constraint on the _id field has been violated. Every document in a MongoDB collection has this mandatory field, which acts as the "primary key". By default it is generated automatically and is of type ObjectId, but you can optionally supply the value yourself.
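For context, the ObjectId is normally generated client-side by the driver, not by the server. Below is a rough, stdlib-only sketch of how one is constructed per the MongoDB spec (a 4-byte big-endian timestamp, 5 random bytes, and a 3-byte counter); real drivers like pymongo use a per-process random value and an incrementing counter rather than fresh random bytes each time:

```python
import os
import struct
import time

def new_object_id() -> str:
    """Build a 12-byte ObjectId-style value per the MongoDB spec:
    4-byte big-endian Unix timestamp + 5 random bytes + 3-byte counter."""
    timestamp = struct.pack(">I", int(time.time()))
    random_part = os.urandom(5)
    counter = os.urandom(3)  # real drivers increment this per process
    return (timestamp + random_part + counter).hex()

print(new_object_id())  # 24 hex characters, e.g. "65fd3c1a..."
```

The leading timestamp is why ObjectIds sort roughly by creation time; the random and counter bytes are what make collisions extremely unlikely in normal operation.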
When doing a bulk insert with insert_many, the operation is aborted as soon as any error occurs - which is what happened in your case. However, there is an option to continue the bulk insert even when an error occurs. insert_many takes an optional ordered parameter, which defaults to True, meaning the process aborts at the first error. When set to False, the insert continues past errors, and the failures are reported together at the end. This is useful when inserting many documents and you don't need to stop the whole batch because of a single error like "E11000". This kind of situation is quite common in MongoDB bulk operations.
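With real pymongo this looks like `collection.insert_many(docs, ordered=False)` wrapped in a try/except for `pymongo.errors.BulkWriteError` (whose `details` attribute lists the failed writes). To illustrate the ordered-vs-unordered semantics without needing a running server, here is a self-contained sketch using a minimal in-memory stand-in for a collection - note that `FakeCollection` and `DuplicateKeyError` are stand-ins for this example, not the pymongo API:

```python
class DuplicateKeyError(Exception):
    """Stand-in for the server's E11000 duplicate key error."""

class FakeCollection:
    """Minimal in-memory stand-in mimicking insert_many's ordered semantics."""
    def __init__(self):
        self.docs = {}  # _id -> document

    def insert_many(self, docs, ordered=True):
        failed = []
        for doc in docs:
            if doc["_id"] in self.docs:
                if ordered:
                    # ordered=True (the default): abort at the first error
                    raise DuplicateKeyError(f"E11000 duplicate key: {doc['_id']}")
                failed.append(doc["_id"])  # ordered=False: record and continue
            else:
                self.docs[doc["_id"]] = doc
        if failed:
            # like pymongo, the errors are still reported once the batch finishes
            raise DuplicateKeyError(f"E11000 duplicate keys: {failed}")

coll = FakeCollection()
coll.insert_many([{"_id": 1}, {"_id": 2}])
try:
    # _id 2 collides, but with ordered=False the new _id 3 still lands
    coll.insert_many([{"_id": 2}, {"_id": 3}], ordered=False)
except DuplicateKeyError:
    pass
print(sorted(coll.docs))  # [1, 2, 3]
```

The trade-off is that unordered inserts give up the guarantee that documents are written in the order you passed them, which usually doesn't matter for time-series points keyed by timestamp.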