Storing a large JSON file containing textual information with multiple keys in MongoDB

Hello Team,
I have a very large JSON file which I have to store in MongoDB. But when I tried to push that JSON file, it threw a “Document Too Large” error.

After some research I found that GridFS helps in storing large JSON documents of more than 16 MB in binary format. But nowhere is it explained how to keep the data in chunks with its multiple fields intact. Everywhere, the data is stored in binary format without the multiple fields that are in the JSON file.

Please help me understand how to store the data structure shown below in chunks using GridFS.

Tools used: pymongo, MongoDB version 6

Structure of data:
{ title : "Sample Title",
  data : ["text1", "text2", "text3", …],
  link : "Sample href link" }

Thank You.

It is not clear

  1. if your large file is a single JSON document
    or
  2. if it is multiple documents that are not clearly delimited, with all documents bunched into a single array
    or
  3. if you are trying to push it into MongoDB the wrong way.
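A quick way to tell the first two cases apart is to look at the top-level JSON value. The following is only a minimal sketch: the helper name `describe_json_file` is hypothetical, and it assumes the file fits in memory and is valid JSON.

```python
import json


def describe_json_file(path):
    """Report whether the top-level JSON value is one document or an array."""
    with open(path, encoding="utf-8") as f:
        top = json.load(f)  # reads the whole file into memory

    if isinstance(top, dict):
        # Case 1: the file is a single JSON document.
        return {"kind": "single document", "count": 1, "keys": sorted(top)}

    if isinstance(top, list):
        # Case 2: the file is an array; collect the field names of its dictionaries.
        keys = sorted({k for item in top if isinstance(item, dict) for k in item})
        return {"kind": "array of documents", "count": len(top), "keys": keys}

    # Anything else (string, number, ...) is not a document at all.
    return {"kind": type(top).__name__, "count": 1, "keys": []}
```

If the result is an array of documents, each element can be stored as its own MongoDB document, and the 16 MB limit applies to each element rather than to the whole file.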

So share the command you are using. The text version of the command is nice to have so that we can cut-n-paste, and a screenshot of the environment where you run the command is also pertinent, as the context might reveal a few issues.

You shared the structure of the data, but it is not clear how many items have the field title. Is there only one title in the whole file? Sharing the document by providing a link to it might help us help you.

If you have JSON, using GridFS will definitely be the last and worst solution. GridFS stores the file as opaque binary chunks, so you won’t be able to use normal queries and you won’t be able to use aggregation.
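One alternative to GridFS is to keep every dictionary as a regular document and, when a single dictionary is too big, split its data array over several documents that repeat title and link. The following is a minimal sketch under stated assumptions, not a definitive implementation: the names `split_document`, `insert_all` and the `part` field are hypothetical, every entry of data is assumed to be a string, and the size is a rough upper estimate of the BSON size rather than an exact measurement.

```python
MAX_BSON_BYTES = 16 * 1024 * 1024  # MongoDB's maximum BSON document size


def split_document(doc, max_bytes=12 * 1024 * 1024):
    """Split one {title, data, link} dictionary into smaller dictionaries.

    The default budget stays well below MAX_BSON_BYTES to leave room for
    title, link and the document overhead (an assumption, tune as needed).
    """
    parts, current, size = [], [], 0
    for text in doc["data"]:
        # UTF-8 bytes plus a generous allowance for BSON per-element overhead.
        item_size = len(text.encode("utf-8")) + 16
        if current and size + item_size > max_bytes:
            parts.append(current)
            current, size = [], 0
        current.append(text)
        size += item_size
    parts.append(current)

    # Every part keeps the original fields, so it stays queryable by title or link.
    return [
        {"title": doc["title"], "link": doc["link"], "part": n, "data": chunk}
        for n, chunk in enumerate(parts)
    ]


def insert_all(collection, documents):
    """Insert every part; `collection` is assumed to be a pymongo Collection."""
    for doc in documents:
        collection.insert_many(split_document(doc))
```

Stored this way, something like `collection.find({"title": "Sample Title"}).sort("part", 1)` returns the pieces of one logical document in order, and the aggregation framework remains available. A single string larger than the budget would still end up in a part of its own and would need separate handling.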

Hello Steeve,
I have multiple JSON documents, each with multiple fields. The JSON file contains many dictionaries, each with these fields. Currently I’m pushing the data using the bulk_write() command.

from pymongo import InsertOne

requesting = []
requesting.append(InsertOne(data[i]))  # data[i] is a dictionary; bulk_write() expects operation objects such as InsertOne
result = mycollection.bulk_write(requesting)

The structure of each dictionary is as given above. The JSON file contains many such dictionaries.