Skipping header when using mongoimport

Timmy_Hsu · March 10, 2022, 8:37am

I have a few 7GB CSV files, with close to 100M records, that I want to import into MongoDB using mongoimport. The format looks roughly like this:

tel_no, name, station_id
‘92564518’, ‘Timmy’, 3

I used --fields to specify the type of each column, but I get the following error: Failed: type coercion failure in document #0 for column ‘station_id’, could not parse token ‘station_id’ to type int32. I couldn't use --headerline directly either, because the automatically inferred types wouldn't be what I want.

Ideally, I want to either skip the first line of the CSV (the header line) or change it to the tel_no.string() format. However, because of the limited disk space on the server, I don't want to create an extra CSV that differs from the original only in its header line. How can I do this given the size of the data and the memory constraints?

Ramachandra_Tummala · March 11, 2022, 2:18am

Check these links. They may help.

https://jira.mongodb.org/browse/TOOLS-1384
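
One way to do this without a second copy of the file is to stream the CSV into mongoimport's standard input and drop the first line on the way. Below is a minimal Python sketch, assuming mongoimport is on the PATH and reads from stdin when --file is omitted. The helper name stream_without_header and the names mydb and subscribers are placeholders, not from the original post. The column types follow the question, so adjust them to your data.

```python
import shutil
import subprocess

# Hypothetical invocation: database and collection names are placeholders.
# No --file is given, so mongoimport is expected to read the CSV from stdin.
MONGOIMPORT_CMD = [
    "mongoimport",
    "--db", "mydb",
    "--collection", "subscribers",
    "--type", "csv",
    "--columnsHaveTypes",
    "--fields", "tel_no.string(),name.string(),station_id.int32()",
]


def stream_without_header(csv_path, command):
    """Pipe csv_path, minus its first line, into command's stdin.

    The file is copied in 1 MiB chunks, so memory use stays small and
    no temporary file is written. Returns the command's exit code.
    """
    with open(csv_path, "rb") as src:
        src.readline()  # read and discard the header line
        proc = subprocess.Popen(command, stdin=subprocess.PIPE)
        try:
            shutil.copyfileobj(src, proc.stdin, 1024 * 1024)
        finally:
            proc.stdin.close()  # signal end of input to the child process
        return proc.wait()
```

Calling stream_without_header("data.csv", MONGOIMPORT_CMD) would run the import for one file (data.csv is a placeholder name). On a Unix shell, the same idea is usually written as tail -n +2 data.csv piped into the same mongoimport command.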