Many to many relationship and linked table/collection

Hi there,

I’m new to NoSQL databases and MongoDB and have some troubles understanding some core schema-design principles, despite the very well structured MongoDB’s docs and similar answers all around the i-net.

In SQL world, there are Linked tables, mostly used as many-to-many references between other tables, in MongoDB, I didn’t find any doc entry or tutorial equivalent to them. The only many-to-many solution I found is to link each connected tables to each other via DBRefs:
So, for example if I have Students and Courses, I will do(using mongoose):

const StudentSchema = new Schema({
  name: String,
  courses: [{ type: mongoose.Schema.Types.ObjectId, ref: 'Course' }]
});

const CoursesSchema = new Schema({
  name: String,
  studends: [{ type: mongoose.Schema.Types.ObjectId, ref: 'Student' }]
});

And for me it’s a very bad schema-design, because it’s not scalable for the reason the array of students and courses can grow very large(student-courses was just an example). So from scalability perspective, it’s bad.

Are there any alternatives to that?
What if I’ll use a linked tables (collection) to connect between the two, how should I implement it on the application level: should I first make a call to the linked collection, await it and only then perform another query? Can I do it in one go(make one request to the DB server), because awaiting for the first response on the application level can take time?

Hopefully, you could assist with the above concerns of mine.
Thanks

Hello @John_D1, welcome to the MongoDB Community forum!

I have some information for you and hope it is useful. I will assume there are a few thousand students and few hundred courses, and I could model something like this:

student:

{
  _id: <number>,
 name: : <string>,
 otherDetails: { ... },
 courses: [
      { courseId:  <number>, courseName: <string> },
      { courseId:  <number>, courseName: <string> },
      ...
  ]
} 

course:

{
  _id: <number>,
  name: <string>,
  description: <string>,
  otherDetails: { ... }
}

Now, your application has some queries. To start with some queries I can think about is, get all students in a particular course and get all courses for a particular student. These are simple queries.

To get all courses for a specific student, the query would be:

db.students.find( { name: "John Doe" }, { courses: 1, name: 1 } )

To get all students enrolled for a specific course, your query can be like this:

db.students.find( { "courses.courseName": "Database Design" } )

Note that the data related to the course is duplicated - the course name and id are present in both collections. You can also note that these details will not change often (or may not change at all). This is feasible, though you have the same data in both collections. And, this also makes querying simpler.

Some important aspects that influence database design are the amount of data in each entity, the size of each document, the relationships between entities, the application functionality (note that all this is part of an app, and the app has various functions, like displaying certain data in browser window, for example), and the other queries you write for the application (the CRUD operations).

The “linked table” is not needed in the MongoDB data design - which allows denormalized data through embedding or referencing. Based upon the kind of queries you write you can have the course referenced within the student or vice-versa.

In the above student entity, the courses field is an array of { courseId: <number>, courseName: <string> } embedded documents. You can also include more course details in this embedded document, or just the courseId field. If you have just the courseId in the courses array field, then you will have to use an Aggregation Query with $lookup to get more details about the courses.

Thanks @Prasad_Saya for such a detailed and quick reply.

The thing is, Courses and Students was just an example and the point was to properly address the issue of many-to-many relationship between two(or more) collections in large, scalable systems.

Just imagine the amount of courses per student, but also students per courses can be very, very high. Holding such a huge data in each document will lead inevitably to serious performance issues.

In SQL world, this is pretty common scenario and you can make a linked table and query whatever you need in one go.

MongoDB made for scale, as far as I read, I honestly can’t believe the very common scenario isn’t address properly yet. Or perhaps I’m mistaken? Please prove me wrong.

1 Like

Hello @John_D1, you need to provide some realistic data and application to discuss what and how this can be designed. Real data and real applications have many attributes and relationships and requirements - that is where the data modeling as a process can help.

@Prasad_Saya, say you have Products and Customers. There are millions of Products and there’s millions of Customers.

@John_D1, it is a very broad and vague description of a problem. What is it you are doing (or want to do) with those products and customers? What is the nature of their attributes?

Designing, again, requires a lot of information; thorough investigation of various aspects. I think you can start with information I had already posted, and build on that. All kinds of “large amounts” have some solution - but they are not on their own - lot of “other factors” influence the design in such cases.

1 Like

A post was split to a new topic: Is there any method for modifications like “OnCascade” in SQL?

@John_D1 Have you found any solution for this issue?

The document increased dramatically. Any other way to create pivit table and access the data in a single query?

Let me give an example.
Let’s say there are social media posts and let there be users.
Now each user can like tons of posts and each post can be liked by thousands of users.
So how am I supposed to model this many to many relationship in mongodb ?