Hi,
I’m using MongoDB 4.x and the PHP 8.1 library/extension.
I’m building a small customer feedback SaaS where the entries are stored in MongoDB.
The problem that I am facing is that I have to query pretty much every key at some point making it very hard deciding on structure and indexes.
This is wat I have so far:
{
"_id": {
"$oid": "62af099d90dcb82b1f750b12"
},
"survey_id": 205380,
"account_id": 1005,
"ws": 15,
"date_reg": {
"$date": "2022-06-19T11:33:49.000Z"
},
"time_reg": "11:33:49",
"read": 0,
"ip": "x",
"ent": {
"person": "Steve",
"location": "California",
"date": "2022-06-01"
},
"topics": ["terrible", "quality", "bug"],
"tags": [10091, 15335, 235235, 23235],
"sentiment": 1,
"country": "",
"star": 0,
"ref": "",
"page": "x",
"meta": {
"os": 1,
"dev": 0
},
"answers": [{
"answer": "sample answer 1",
"original_answer": "demo antwoord 1",
"type": 0,
"qid": 5768,
"datatype": 0,
"so": 0,
"q": "Let's get started! What is your first name?"
}, {
"answer": 2,
"original_answer": "",
"type": 4,
"qid": 5770,
"datatype": 16,
"so": 1,
"q": "Thanks Let's get started..., how likely are you to..."
},
etc]
}
account_id + survey_id will be in pretty much every query.
All the other keys are keys that I have to query specifcally or multiple/all at once (when customer is selecting filters).
the answers array contain all the answers, also here is pretty much every key important except the original_answer and so.
Breakdown of what each key does:
-
answer: is the answer that is given.
It’s function: view the answer that was given and search queries -
original_answer: is the answer in the native language of the respondent.
my customer can choose to auto translate the answer to their language.
It’s function: search queries -
type: is the question type. E.g. textfield is 0, dropdown=1, textarea=2, checkbox=3 etc.
function: for use in summaries/reporting, to quickly select by question type.
Come to think of it, it could be less important because I can also query by question_id -
qid: the question id
function: for use in summaries/reporting, to quickly select entries for that specific question. -
datatype: a integer for storing what kind of data it is. eg. multiple choice, sensitive data etc.
function: used to quickly determine which entries have sensitive data. -
so: sortorder, only used in viewing feedback questions in the right order.
-
q: the question that the respondent was answering.
function: If my customer changes the question in the (MySQL) surveys table, I can tell them the results might be skewed because they changed the question when there was already live feedback.
On the other hand, if the original question was deleted, I can still show a small snippet so even if it’s removed they can still see what the question was about.
As for indexes. I have a index on almost every field, but this is of course a no go.
I would make account_id + survey_id a compound index, but then when you select multiple keys it will not use that index or at least not effectively.
Please help.