Hello @Kim_Kern ,
Welcome to The MongoDB Community Forums! 
The design of your schema depends on your requirements and how comfortable are you to handle a complex system. As you have discussed some scenarios, I would recommend you to check the performance of your specific use-cases as per the requirements. You are right on the separate collection for both as well, I have seen many use-cases where people try to avoid duplication in their data. These are mostly for applications where you have a document that will never change, for example if we have an address collection and another collection of people, multiple people can stay at the same address so instead of writing the address every time some new documents of people are populated, they generally try to refer to the id of the address from another collection. Though it might increase the overhead as you need to use $lookup in your query but if you think that this will help you in some way in your use-case, you can design your application schema like this. Overall it totally depends on your requirements and your understanding of your application.
Below are some links that I think will be helpful for you with regards to your schema design. Please take a look at these.
One way to check the performance of your system is to check the output of your query with explain in executionStats
mode (e.g. `db.collection.explain(‘executionStats’).aggregate(…)). Let’s check this with an example
This is the document and index I added as first example and the execution stats of query db.collection1.find({ _id: ObjectId("643ce9212c3e1d50dd8c493c"), "owners.ownerId": 123 }).explain("executionStats")
--> Document
{
"_id": {
"$oid": "643ce9212c3e1d50dd8c493c"
},
"breed": "Pitbull",
"owners": [
{
"ownerId": 123,
"walkCount": 5
},
{
"ownerId": 456,
"walkCount": 2
}
]
}
-->Index
[
{ v: 2, key: { _id: 1 }, name: '_id_' },
{
v: 2,
key: { _id: 1, 'owners.ownerId': 1 },
name: '_id_1_owners.ownerId_1',
unique: true
}
]
--> Output
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 1,
Below are the documents and index I added as second example and the execution stats of query db.collection2.find({ "dogId": ("abc"), "ownerId": (123) }).explain("executionStats")
--> Documents
{
"_id": {
"$oid": "643cebfd2c3e1d50dd8c493f"
},
"dogId": "abc",
"breed": "Pitbull",
"ownerId": 123,
"walkCount": 5
},
{
"_id": {
"$oid": "643cec642c3e1d50dd8c4940"
},
"dogId": "abc",
"breed": "Pitbull",
"ownerId": 456,
"walkCount": 2
}
-->Index
[
{ v: 2, key: { _id: 1 }, name: '_id_' },
{
v: 2,
key: { ownerId: 1, dogId: 1 },
name: 'ownerId_1_dogId_1',
unique: true
}
]
--> Output
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 1,
As you can see in both the cases the execution time is pretty quick as our query was using the respective index and hence it was able to provide results without any overhead. We can also deduce that on small databases, we can have any type of schema and it will hardly matter, it will matter when your database will grow in size, whether your application will be write intensive/read intensive, resources available(RAM, Memory etc…), Indexes used (Efficient indexes improves the performance significantly) and others.
Note: This is just an example and users should test and decide their application schemas and performance based on their own requirements/testing. For professional help, you can Get in touch with our team and they could help you with any related questions.
Additionally, in case you put ObjectId as ObjectId(“abc”) or ObjectId(“123”), you will receive below error
BSONError: Argument passed in must be a string of 12 bytes or a string of 24 hex characters or an integer.
For more information regarding this, please refer ObjectId documentation
To learn more about performance, please check below resource
Regards,
Tarun