I’m designing indexes for a social media / dating app running on MongoDB.
Most queries include a gender filter (e.g., "gender": "male" or "gender": "female"), but gender only has two possible values.
From what I understand:
- Low-cardinality fields (few distinct values) don’t provide much selectivity.
- Compound indexes work best when the leading fields reduce the search space significantly.
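To make "selectivity" concrete, here is a rough sketch of how I think about it: the fraction of documents each predicate keeps, where a lower fraction means a more selective field. The `sample` array and the `selectivity` helper are hypothetical and only illustrate the calculation; they say nothing about my real data distribution.

```javascript
// Hypothetical sample documents, made up purely to illustrate the calculation.
const sample = [
  { gender: "female", city: "Chicago", intention: "travel", age: 28 },
  { gender: "male", city: "Boston", intention: "travel", age: 41 },
  { gender: "female", city: "Denver", intention: "dating", age: 33 },
  { gender: "male", city: "NY", intention: "friends", age: 22 },
];

// Fraction of documents a predicate keeps (lower means more selective).
function selectivity(docs, predicate) {
  return docs.filter(predicate).length / docs.length;
}

// One entry per predicate in the query I care about.
const fractions = {
  gender: selectivity(sample, (d) => d.gender === "female"),
  city: selectivity(sample, (d) => ["Chicago", "LA", "NY"].includes(d.city)),
  intention: selectivity(sample, (d) => d.intention === "travel"),
  age: selectivity(sample, (d) => d.age >= 25 && d.age <= 35),
};

console.log(fractions);
```

Against the real collection, I assume the same ratios could be estimated with `db.users.countDocuments(filter)` divided by the total document count.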
Example query:
db.users.find({
  gender: "female",
  city: { $in: ["Chicago", "LA", "NY"] },
  intention: "travel",
  age: { $gte: 25, $lte: 35 }
})
Candidate index examples:
// Option A
db.users.createIndex({ gender: 1, city: 1, intention: 1, age: 1 })
// Option B
db.users.createIndex({ city: 1, intention: 1, age: 1, gender: 1 })
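One way I could compare the two options empirically is to force each index with `hint()` and look at the `executionStats` output of `explain()`. The sketch below assumes it runs in mongosh with both indexes already created; `explainWithIndex` is a hypothetical helper name, and `coll` is assumed to be a mongosh collection handle such as `db.users`.

```javascript
// Summarize how much work a query did when forced onto a given index.
// `coll` is assumed to be a mongosh collection handle (e.g. db.users).
function explainWithIndex(coll, query, indexSpec) {
  const stats = coll
    .find(query)
    .hint(indexSpec)
    .explain("executionStats").executionStats;
  return {
    keysExamined: stats.totalKeysExamined, // index entries scanned
    docsExamined: stats.totalDocsExamined, // documents fetched
    returned: stats.nReturned,             // documents actually returned
  };
}

// The query and the two candidate key patterns from this question.
const query = {
  gender: "female",
  city: { $in: ["Chicago", "LA", "NY"] },
  intention: "travel",
  age: { $gte: 25, $lte: 35 },
};
const optionA = { gender: 1, city: 1, intention: 1, age: 1 };
const optionB = { city: 1, intention: 1, age: 1, gender: 1 };

// In mongosh, with both indexes in place:
// printjson(explainWithIndex(db.users, query, optionA));
// printjson(explainWithIndex(db.users, query, optionB));
```

My understanding is that the closer `keysExamined` is to `returned`, the less wasted index scanning there is, but I'd like to know the reasoning rather than rely only on measurements from my current data.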
My questions are:
- Is it worth including gender in a compound index at all, given it only cuts the search space in half?
- If so, should gender be first in the index, or last (after more selective fields like city or age)?
- Are there cases where MongoDB can still efficiently use the index if gender is always in the query but not indexed?
I’d appreciate an explanation of best practices for low-cardinality fields in compound indexes, especially in the context of queries that always include the field.