How to create a database using PDF files

I am using Compass and tried to upload JSON-converted PDF files, but that fails with a ‘matrix not supported’ error.

I also tried to import the PDF files directly into the collection, but that also fails. My files are approx. 1 MB each.

How shall I move forward?

What is a JSON-converted PDF file? Would you be able to share an example?

I started with a PDF file and the JSON file based on it. The conversion was done with an online program. Unfortunately I am not allowed to upload files.

I think you are on your own.

We have no clue how a PDF (a portable file format for representing text and other printable content) and JSON (a structured way to represent data) can be related, unless of course your PDF files represent data in some sort of matrix. But we do not know. I see a few ways for you to get out of this mess.

  1. You share the PDF or JSON, but it seems that you won’t.
  2. Store your PDF with GridFS and call it a day.
  3. Change your PDF to JSON converter to something that does not produce the unsupported matrix. There is no such thing as a matrix in JSON, so the ‘matrix not supported’ error is well founded.
  4. Hire a contractor and have them sign an NDA so that you can share your files. Be prepared to pay.
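
For option 3, one way to check whether the converter output contains anything matrix-like is to look for arrays nested directly inside other arrays. This is only a guess at what the error refers to; nothing in this thread confirms it. The sketch below is plain Python (not something used in this thread), and the function name and sample data are made up for illustration.

```python
import json
from typing import Any, Iterator


def find_nested_arrays(node: Any, path: str = "$") -> Iterator[str]:
    """Yield the location of every array that directly contains another array."""
    if isinstance(node, list):
        # An array holding at least one array is treated as "matrix-like" here
        # (an assumption about what the error message means).
        if any(isinstance(item, list) for item in node):
            yield path
        for index, item in enumerate(node):
            yield from find_nested_arrays(item, f"{path}[{index}]")
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from find_nested_arrays(value, f"{path}.{key}")


if __name__ == "__main__":
    # Made-up sample; replace with json.load() on the real converter output.
    sample = json.loads('{"title": "report", "pages": [{"cells": [[1, 2], [3, 4]]}]}')
    for location in find_nested_arrays(sample):
        print(location)
```

If the script prints nothing for your file, nested arrays are probably not the cause and the problem lies elsewhere.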

There is nothing confidential in my files; they are press reports. I could not upload them because this system does not allow me to do so.

Sorry, I misinterpreted “I am not allowed to upload files”.

Could you post a link to the files?

Here are the links to the source PDF file and the JSON translation made with an online pdf2json program.

The json file looks good at first sight so I did some experimentation.

I can load the file fine with Firefox and no error is generated.

If I cut-n-paste the whole thing and use ADD DATA > Insert Document in Compass, it works.

But if I save the same cut-n-paste into a file and try ADD DATA > Import File, it does not work. The error message is different from yours, though. What I get is

Operation passed in cannot be an Array

So I tried what would be completely counter-intuitive. I made the whole document an array by inserting, before the first character, an opening square bracket [ and appending, after the last character, a closing square bracket ]. I was able to Import File after that change.
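
For anyone who would rather script that change than edit the file by hand, here is a minimal sketch in plain Python (my choice of language, not something used in this thread). It assumes the converter output is a single JSON object; the function name and file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def wrap_in_array(src: Path, dst: Path) -> int:
    """Write src to dst with a top-level JSON array; return the number of documents."""
    data = json.loads(src.read_text(encoding="utf-8"))
    # A single top-level object becomes a one-element array;
    # a file that is already an array is written back as-is.
    docs = data if isinstance(data, list) else [data]
    dst.write_text(json.dumps(docs, ensure_ascii=False, indent=2), encoding="utf-8")
    return len(docs)


if __name__ == "__main__":
    # Self-contained demo with a made-up document; point src at the real file instead.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "report.json"
        dst = Path(tmp) / "report.array.json"
        src.write_text('{"pdfDoc": {"title": "sample"}}', encoding="utf-8")
        count = wrap_in_array(src, dst)
        print(f"wrote {count} document(s)")
        print(dst.read_text(encoding="utf-8"))
```

Parsing and re-serializing, rather than just adding the two brackets as text, has the side benefit of failing early if the file is not valid JSON in the first place.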

Thank you so much. Apologies for asking this type of question:

db.coll0.find({"pdfDoc":"Sam"})

returns: Symbol ‘db’ is undefined

I have no clue.

I have never had this issue. Post a screenshot of where you run this command. Maybe you are running it in the wrong place.

Note that the query

will not return any document.

Here is a status report with the issues I spoke about.

As I suspected.

In Compass, the query on the FILTER line is simply {"pdfDoc":"Sam"}. The db.coll0.find() syntax is for mongosh and Node.js.

If you are going to give a presentation or write some kind of paper about MongoDB, I strongly suggest that you take a few courses at university.mongodb.com, so that what you communicate is what you know rather than the answers you got here.