Map Document with Text String?

Hi everyone,

I’m a MongoDB newbie who has put together his first MongoDB database. (Ubuntu platform, Mongo v5.0.8. I’m actually using the Docker container version of Mongo.)

I’m wondering if there’s a way to map a string to a database document and/or vice versa?

To explain in more detail: In my job, I have a piece of software that takes in a text string as input, processes the string, then generates output in the form of a JSON file. These JSON files can be quite diverse; no two are really alike. To analyze the output, I’ve put about a thousand of these JSON files into my MongoDB instance.

Only now, I’m realizing that looking at the JSON output is only half of the picture. For each document, I need the original text string associated with the JSON. (And sadly, that string is not included within the JSON itself.)

To be explicit: If I’m in Compass and I’m searching on a given input string, I need a way to pull up the corresponding JSON output document. Or, given a JSON document, I need to be able to look up the original string. There is an exact 1:1 relationship between string and JSON; no two strings will be the same, and no two JSON documents will be the same, either. Every string will map to exactly one JSON, and vice versa.

When I uploaded my JSON docs into Mongo, I used these mongoimport commands from the Ubuntu command line:

mongoimport --db "db01" --collection "table01" --file "output01.json"
mongoimport --db "db01" --collection "table01" --file "output02.json"
mongoimport --db "db01" --collection "table01" --file "output03.json"
...etc...

Very easy. But now, I can’t manually assign each input string to its corresponding JSON output document. I’m willing to delete the current database and re-enter everything again, perhaps with something like this:

mongoimport --db "db01" --collection "table01" --file "output01.json"  --map "Input String 01"
mongoimport --db "db01" --collection "table01" --file "output02.json"  --map "Input String 02"
mongoimport --db "db01" --collection "table01" --file "output03.json"  --map "Input String 03"
...etc...

Of course, I don’t see anything like that in the mongoimport documentation. Does anyone have any suggestions? I don’t mind rebuilding the database to include a mapping function. Thank you.

Hi @redapplesonly and welcome to the MongoDB Community :muscle: !

Why not include the input string in the doc you are inserting into MongoDB along with the fields generated by that string?

{
  'input_string': 'abcde',
  'field1': 10,
  'field2': 'Hello There!',
  'field3': 42,
  'field4': ISODate(xxx)
}

With an index on {input_string: 1}, you could retrieve these documents easily. Also, you could use MongoDB as a cache to avoid reprocessing incoming input strings that are already in your MongoDB collection.
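To make that a bit more concrete, here is a minimal sketch. The helper name withInputString is made up for illustration, and I’m assuming the collection is the table01 from your mongoimport commands. Since your strings are 1:1 with the documents, a unique index looks like a reasonable choice, but that part is optional.

```javascript
// Hypothetical helper: put the input string next to the fields it generated.
function withInputString(inputString, generatedFields) {
  return { input_string: inputString, ...generatedFields };
}

const doc = withInputString('abcde', {
  field1: 10,
  field2: 'Hello There!',
  field3: 42,
});

// In mongosh (collection name table01 is assumed from the thread):
//   db.table01.insertOne(doc)
//   db.table01.createIndex({ input_string: 1 }, { unique: true })
//   db.table01.findOne({ input_string: 'abcde' })
```

The findOne at the end is also the "cache" check: if it returns a document, the string has already been processed.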

Cheers,
Maxime.

Thanks for the thought, Maxime! Unfortunately, I don’t have control over what goes into the JSON documents. They are static, and I have to import them as is. No editing allowed. It’s a big headache, to be honest.

Ultimately, if I am sorting through my documents within MongoDB and a find() pulls up any specific document, then I need a way to MAP(document) ==> string. Conversely, if I know the string and I want to see the JSON that was its output, I need MAP(string) ==> document. You see my dilemma.

Then why not create another collection that you do control, and have it reference the other doc that you can’t touch for some obscure reason?

Your collection (:musical_note: can’t touch this :musical_note: ):

{
  _id: ObjectId('628e5ee995973139032f704c'),
  input_string: 'abcde',
  related_doc: ObjectId('628e5ee995973139032f704d')
}

JSON collection:

{
  '_id': ObjectId('628e5ee995973139032f704d'),
  'field1': 10,
  'field2': 'Hello There!',
  'field3': 42,
  'field4': ISODate(xxx)
}
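
One possible way to fill both collections is to do the import from a small script instead of mongoimport, so you get the _id back at insert time. This is only a sketch: the function name importWithMapping and the collection name mapping are hypothetical, and table01 is taken from your commands. The JSON itself is inserted untouched.

```javascript
// Sketch only: "table01" comes from the mongoimport commands in this thread;
// "mapping" and importWithMapping are hypothetical names.
function importWithMapping(db, inputString, jsonDoc) {
  // Insert the JSON output as is; MongoDB only adds an _id if none is present.
  const { insertedId } = db.getCollection('table01').insertOne(jsonDoc);
  // Record which input string produced that document.
  db.getCollection('mapping').insertOne({
    input_string: inputString,
    related_doc: insertedId,
  });
  return insertedId;
}

// In mongosh, for each (string, file) pair, something along these lines
// (assuming the file holds a single JSON document):
//   const fs = require('fs');
//   const json = EJSON.parse(fs.readFileSync('output01.json', 'utf8'));
//   importWithMapping(db, 'Input String 01', json);
```

Indexes on input_string and related_doc in the mapping collection would keep lookups fast in both directions.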

You’ll have to use a $lookup now but if it’s the only way…
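
A rough sketch of that $lookup, assuming the collection you control is called mapping (a hypothetical name) and the untouched JSON docs live in table01. The helper stringToDocPipeline is only there for illustration.

```javascript
// Builds an aggregation pipeline for MAP(string) ==> document.
// Collection names "mapping" and "table01" are assumptions.
function stringToDocPipeline(inputString) {
  return [
    // Find the mapping entry for this input string.
    { $match: { input_string: inputString } },
    // Pull in the JSON doc it points to.
    {
      $lookup: {
        from: 'table01',
        localField: 'related_doc',
        foreignField: '_id',
        as: 'json_doc',
      },
    },
    // The relationship is 1:1, so flatten the one-element array.
    { $unwind: '$json_doc' },
  ];
}

// In mongosh:
//   db.mapping.aggregate(stringToDocPipeline('abcde'))
// The other direction, MAP(document) ==> string, needs no $lookup:
//   db.mapping.findOne({ related_doc: someDoc._id })
```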

Cheers,
Maxime.

Interesting, thank you! I will implement and report back…!
