
Mongodb generating duplicate _id value!

 
Greenhorn
Posts: 14
Hi,
This has been bothering me for weeks now.
I have a Python script that periodically queries time-series data from a crypto API and inserts it into my MongoDB collection.

Every now and then, seemingly at random, the collection.insert_many function will fail with the error message shown below. I formatted the json so it's easier to read.
The _id field is not generated by me; it is supposed to be created and managed by Mongo. So how can I prevent a collision on a field I don't provide?

 
Ranch Hand
Posts: 202
Just a guess: two of the 'op' entries have the same epoch timestamp (seconds since 1970-01-01 00:00 UTC), and that causes the error because timestamp is defined with "unique: true".

Please post the schema of your 'op' records! That would shed more light on the issue and reduce the guesswork.
 
Karanveer Singh
Greenhorn
Posts: 14
I do have a unique timestamp index, but before I insert, I use the following Python logic to keep only those timestamps that are not already present in the collection.



This is a sample document.



Is there some other way for me to filter timestamps to those not present in the db?
 
Roland Mueller
Ranch Hand
Posts: 202

Isn't it possible that the data being inserted (df.to_dict("records")) already contains duplicate "timestamp" values?
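
To make that concrete, here is a minimal sketch of filtering a batch so that each "timestamp" appears only once and timestamps already stored are skipped. The function name dedupe_by_timestamp, the existing_timestamps argument and the sample values are made up for illustration; only the "timestamp" field name comes from this thread.

```python
def dedupe_by_timestamp(records, existing_timestamps=()):
    """Keep the first record per timestamp, skipping timestamps already stored."""
    seen = set(existing_timestamps)
    unique = []
    for record in records:
        ts = record["timestamp"]
        if ts in seen:
            continue  # duplicate inside the batch, or already in the collection
        seen.add(ts)
        unique.append(record)
    return unique


# Illustrative values only, not real market data.
batch = [
    {"timestamp": 1649790000, "close": 40100.0},
    {"timestamp": 1649790060, "close": 40120.5},
    {"timestamp": 1649790000, "close": 40100.0},  # repeated within the batch
]

# Pretend 1649790060 is already present in the collection.
fresh = dedupe_by_timestamp(batch, existing_timestamps={1649790060})
```

Something like dedupe_by_timestamp(df.to_dict("records"), existing) could then be passed to insert_many, where existing is whatever set of timestamps the pre-insert query returned.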
 
Karanveer Singh
Greenhorn
Posts: 14
Oh. I hadn't considered that.

I've now updated to

 
Rancher
Posts: 516
Hello. Just some thoughts on the error and its handling.

"errmsg":"E11000 duplicate key error collection: crypto.btc_ohlc_minute index: _id_ dup key: { _id: ObjectId('6255cce8bdfc280a8c8d6ee8') }"...



This error says that the unique index constraint on the _id field was violated. Every document in a MongoDB collection has this mandatory field, which functions as a "primary key". If you do not supply a value, one of type ObjectId is generated by default; supplying your own value is optional.

When a bulk insert (insert_many in PyMongo, insertMany in the shell) hits an error, the operation aborts at that point, as in this case. There is an option to continue past errors, and you can try it. The method has an optional parameter, ordered, which defaults to True: the process stops at the first error. With ordered=False, the remaining documents are still attempted after one fails, and PyMongo raises a BulkWriteError at the end that lists the failures. This is useful when inserting many documents where an individual error such as "E11000" need not stop the process. This kind of situation is quite common in MongoDB bulk operations.
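
As a rough sketch of that idea in PyMongo: the two helper names below are hypothetical, and the code assumes that collection is a pymongo Collection and that only duplicate-key errors should be tolerated. insert_many(..., ordered=False), BulkWriteError and its details dictionary are PyMongo's own.

```python
DUPLICATE_KEY = 11000  # server error code behind the "E11000" message


def split_write_errors(details):
    """Partition the 'writeErrors' list of BulkWriteError.details."""
    duplicates, others = [], []
    for err in details.get("writeErrors", []):
        if err.get("code") == DUPLICATE_KEY:
            duplicates.append(err)
        else:
            others.append(err)
    return duplicates, others


def insert_skipping_duplicates(collection, docs):
    """Unordered insert that tolerates duplicate-key errors only.

    Returns (number inserted, list of duplicate-key write errors).
    """
    # Imported here so split_write_errors works without PyMongo installed.
    from pymongo.errors import BulkWriteError

    try:
        result = collection.insert_many(docs, ordered=False)
        return len(result.inserted_ids), []
    except BulkWriteError as exc:
        duplicates, others = split_write_errors(exc.details)
        if others:
            raise  # something other than a duplicate key went wrong
        return exc.details.get("nInserted", 0), duplicates
```

Each entry in the returned duplicates list carries an "index" into docs, which helps when tracking down where the repeated key came from rather than silently dropping it.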
 