Hi, I have a users collection and am trying to create a proper model in MongoDB / Mongoose.
I wonder how I should store selectable data so that it is more performant for indexing and querying.
For example, every user has one of the following study situations →
highschool, undergrad, grad, doctorate, working
I can define this field as a String or as a Number, for example 1 for highschool, 2 for undergrad, 3 for grad, 4 for doctorate, 5 for working.
So we would end up with one of two situations →
User:{
"studycode":"undergrad" or "studycode":"2"
…
}
Which would be more performant when I create an index or compound index on this field?
I run into similar dilemmas with bigger fields like hobbies. I keep hobbies in a separate collection and use their _id values in user documents. Is _id a good choice for indexing? Wouldn't it be faster if I assigned numbers to hobbies (0, 1, 2, 3, 4, 5, ...) and stored those instead of _id? I want indexed queries on those fields to be fast.
Thanks
Number comparisons should be faster than string comparisons. Note that the number 2 inside quotes is a value of type string, not a value of type number. I wrote "should" because an indexed string will be faster than a non-indexed number. In this case, your key set is small and the values differ at the first letter, so you should not see much of a difference.
Numbers take less space than strings, so more data fits in cache. This can make a favorable difference for numbers.
Descriptive strings are more readable.
It is a tradeoff.
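To put rough numbers on the space difference mentioned above, here is a minimal sketch that assumes the value sizes given in the BSON specification (a string is a 4-byte length prefix, the UTF-8 bytes, and a trailing null byte; an int32 is 4 bytes; a double is 8 bytes). The helper name `bsonStringValueSize` is made up for illustration, and the figures cover the stored value only, not the field name. Index entries use a different internal encoding, so treat this as an indication rather than a measurement.

```javascript
// Assumed sizes, taken from the BSON spec (value only, field name excluded).
const INT32_SIZE = 4;
const DOUBLE_SIZE = 8;

// Hypothetical helper: size of a BSON string value =
// 4-byte length prefix + UTF-8 bytes + 1 trailing null byte.
function bsonStringValueSize(s) {
  return 4 + Buffer.byteLength(s, 'utf8') + 1;
}

// The five study situations from the question.
const labels = ['highschool', 'undergrad', 'grad', 'doctorate', 'working'];

for (const label of labels) {
  console.log(
    `${label}: ${bsonStringValueSize(label)} bytes as string, ` +
    `${INT32_SIZE} as int32, ${DOUBLE_SIZE} as double`
  );
}
```

Depending on the driver or shell, a plain number may be stored as a double rather than an int32, which is why both sizes are listed.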
You could use a StudyCode collection to map your number code to a readable string. I have done it, and it helps me with localisation of the UI because StudyCode will have language-specific values like
{ _id:2 , en:"high school",fr:"étude secondaire"}
In a case like that you could generate your own _id as a small number rather than using an ObjectId. I have done it.
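As a rough sketch of how the application side of that lookup could work, assuming the StudyCode documents have already been loaded into memory: the second entry, the `studyLabel` helper, and the sample user are hypothetical and only there for illustration.

```javascript
// Documents as they might come back from the StudyCode collection.
const studyCodes = [
  { _id: 2, en: 'high school', fr: 'étude secondaire' }, // the example document above
  { _id: 3, en: 'undergrad' },                            // hypothetical entry with no French label
];

// Index the documents by their small numeric _id for constant-time lookup.
const studyCodeById = new Map(studyCodes.map((doc) => [doc._id, doc]));

// Hypothetical helper: resolve a numeric code to a label in the requested
// language, falling back to English when that language is missing.
function studyLabel(code, lang) {
  const doc = studyCodeById.get(code);
  if (!doc) return undefined; // unknown code
  return doc[lang] ?? doc.en;
}

// Hypothetical user document that stores only the numeric code.
const user = { name: 'example', studycode: 2 };

console.log(studyLabel(user.studycode, 'fr'));
console.log(studyLabel(3, 'fr'));
```

The user documents and their indexes only carry the small number, and the readable, localised text lives in one place.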
Thanks for your answer. I will also have a Hobbies collection where I keep hobbies. There are more than 100 hobbies, and each user carries a few of them in an array. So it is much better to give them Number _id values for querying, right? Users will have them in an array like hobbies:[ hobbie1, hobbie2], so is it better to store them as numbers rather than as ObjectId _id values?