I have a few 7GB CSV files that I want to import into MongoDB using mongoimport. There are close to 100M records, and the format roughly looks like this:
tel_no, name, station_id
'92564518', 'Timmy', 3
I used --fields to specify the type of each column, but I get the following error: Failed: type coercion failure in document #0 for column 'station_id', could not parse token 'station_id' to type int32. In other words, the header line is being parsed as a data row. I can't simply use --headerline either, because the automatically inferred types wouldn't be what I want.
So ideally, I want to either skip the first line of the CSV (the header line) or change it into the tel_no.string() format. However, because of space constraints on the server, I don't want to create an extra CSV that differs from the original only in its header line. How can I do this, given the size of the data and the limited memory?
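
To make the goal concrete, here is a minimal Python sketch of the "skip the first line without writing a second file" idea. It relies on mongoimport reading from standard input when no --file is given. The helper names (stream_without_header, import_csv), the 1 MiB chunk size, and the string/string/int32 type choices are assumptions for illustration, not something confirmed for this data set. Memory use stays bounded by the chunk size regardless of file size.

```python
import shutil
import subprocess


def stream_without_header(src, dst, chunk_size=1 << 20):
    """Copy the binary stream `src` to `dst`, dropping the first line.

    Only one chunk (1 MiB by default, an arbitrary choice) is held in
    memory at a time, so the size of the file does not matter.
    """
    src.readline()  # consume and discard the header line
    shutil.copyfileobj(src, dst, chunk_size)


def import_csv(path, db, collection):
    """Pipe `path` minus its header into mongoimport's stdin (hypothetical helper).

    The field types below are assumed from the sample row; adjust as needed.
    """
    cmd = [
        "mongoimport",
        "--db", db,
        "--collection", collection,
        "--type", "csv",
        "--columnsHaveTypes",
        "--fields", "tel_no.string(),name.string(),station_id.int32()",
    ]
    with open(path, "rb") as src:
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
        try:
            stream_without_header(src, proc.stdin)
        finally:
            proc.stdin.close()  # signal end of input to mongoimport
        return proc.wait()  # mongoimport's exit code
```

Calling import_csv("data.csv", "mydb", "phones") (file, database, and collection names are placeholders) would then stream the data rows straight into mongoimport, with no intermediate copy on disk.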