I have 20m documents in total, and about 17m of them have an Addr key.
This key is a kind of address, and about 1m distinct values exist.
I need to get all documents where Addr exists, then classify them and sum a value (each document has Price and Type keys).
Sample code from my script looks like this:
for docu in collection.find():
    # TODO - classify by Type and decide whether to add Price
    pass
In this case, which way is faster and more appropriate?
- Addr: {$exists: True}
- scan whole data and continue if Addr not in docu
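To make the two options concrete, here is a minimal sketch of both, assuming pymongo and assuming the TODO amounts to summing Price per Type (the real classification rule is not shown above). The function names are hypothetical, and the sketch assumes every document with Addr also has Type and Price.

```python
from collections import defaultdict


def sum_price_by_type(docs):
    """Sum Price per Type over an iterable of documents (assumed rule)."""
    totals = defaultdict(int)
    for docu in docs:
        totals[docu["Type"]] += docu["Price"]
    return dict(totals)


def totals_with_exists_filter(collection):
    # Option 1: let the server filter on Addr and return only the
    # fields the loop needs.
    cursor = collection.find(
        {"Addr": {"$exists": True}},
        {"_id": 0, "Type": 1, "Price": 1},
    )
    return sum_price_by_type(cursor)


def totals_with_full_scan(collection):
    # Option 2: fetch every document and skip those without Addr
    # on the client side.
    docs = (docu for docu in collection.find() if "Addr" in docu)
    return sum_price_by_type(docs)
```

Both functions should return the same totals. The difference is where the filtering happens: in the first, documents without Addr never leave the server, while in the second all 20m documents are sent to the client before about 3m of them are skipped.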
I’m confused about which one is the right way.