How do I model the data for a survey app?

I am working on a web app that will be used to survey around 2 million students.
The survey will ask range questions (a rating from 1 to 5 for how much the student agrees with a statement) and will also gather data on categories such as gender, age, school, state, city, time to finish, etc. The survey will also be repeated periodically to track changes over time.

The web app has to have a viewer where the data can be visualized with different charts and filtered by category to compare ages, schools, etc.
I think it would be wise to fetch a random sample of the data because of the scale of the project. However, all outliers must be fetched, because the point of the app is to find struggling students.

How would you organize the data? I just can't figure it out.

Hi,
Nice idea to work on.
Finding a schema that scales well, performs well, and is easy to manage takes trial and error, and that comes with experience.
At a basic level, we can break the problem down like this:

  • Generation of survey forms
    Here we can have a Category collection and a Form collection (which stores the questions and the category).
  • Submission of the survey by the user and collection of analytics
    Here we can have a User collection and a Submitted Form collection (which stores the user's answers and
    the analytics data).
  • Showing analytics
    Once we have the data, we can use charts to display it.
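
To make the breakdown above more concrete, here is a minimal sketch of those collections as Python dataclasses, plus one helper that groups ratings by a category for the viewer. Every class name, field name, and the `mean_rating_by` helper are assumptions made for illustration; they are not tied to any particular database or to the asker's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Question:
    question_id: str
    text: str  # statement the student rates from 1 to 5


@dataclass
class Form:
    form_id: str
    category: str  # refers to an entry in the Category collection
    questions: list[Question]


@dataclass
class User:
    user_id: str
    # Attributes used for filtering: gender, age, school, state, city, ...
    attributes: dict[str, str]


@dataclass
class Submission:
    user_id: str
    form_id: str
    wave: int  # which repetition of the survey (assumed field)
    answers: dict[str, int]  # question_id -> rating from 1 to 5
    seconds_to_finish: int


def mean_rating_by(submissions, users, question_id, attribute):
    """Average rating for one question, grouped by a user attribute."""
    users_by_id = {u.user_id: u for u in users}
    groups = defaultdict(list)
    for s in submissions:
        if question_id in s.answers:
            key = users_by_id[s.user_id].attributes.get(attribute, "unknown")
            groups[key].append(s.answers[question_id])
    return {key: mean(ratings) for key, ratings in groups.items()}
```

Calling `mean_rating_by(submissions, users, "q1", "school")` would return one average per school, which is the kind of result a chart in the viewer could plot. Keeping each submission as its own record with a `wave` number is one way to compare results across repeated surveys.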

Hope it helps