Hi!
I’m working on data modeling for my web application. I have several abstract data layers:
- ip address
Each ip address contains several fields itself, like “asnName” and “orgName”, and may have more fields in the future.
- port(s)
Each ip address may contain one or several ports. Each port consists of several fields like: “protocol”, “state”, “banner”, “service”.
- check(s)
Each port may contain one or several checks. Each check consists of fields like: “result”, “description”, “name”. Result may contain up to 1MB of data. (I’m thinking about storing it as a file.)
In general, a collection contains several IPs, each IP has several ports, and each port has several checks. All objects and fields are readable and writable.
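To make the hierarchy concrete, here is a minimal sketch of the three layers as Python dataclasses. The field names `address` and `number`, and all example values, are assumptions made for illustration; only “asnName”, “orgName”, “protocol”, “state”, “banner”, “service”, “result”, “description” and “name” come from the description above.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Check:
    name: str
    description: str
    result: str  # may be up to ~1MB; could instead hold a path to a file


@dataclass
class Port:
    number: int  # assumed field name for the port number
    protocol: str
    state: str
    banner: str
    service: str
    checks: List[Check] = field(default_factory=list)


@dataclass
class IpAddress:
    address: str  # assumed field name for the ip itself
    asnName: str
    orgName: str
    ports: List[Port] = field(default_factory=list)


# Example values are invented for illustration only.
ip = IpAddress(
    address="192.0.2.10",
    asnName="EXAMPLE-ASN",
    orgName="Example Org",
    ports=[
        Port(number=22, protocol="tcp", state="open", banner="SSH-2.0", service="ssh",
             checks=[Check(name="weak-ciphers",
                           description="Looks for weak ciphers",
                           result="none found")]),
        Port(number=53, protocol="udp", state="open", banner="", service="dns"),
    ],
)

# asdict() recurses into nested dataclasses and lists, giving one nested dict.
ip_document = asdict(ip)
```

The resulting `ip_document` is a plain nested dict, which is roughly what a single stored document would look like if everything were kept together.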
Models I’ve tried:
- Deeply nested structure: all in one collection.
One document contains one ip with all of its data inside: ip[ports[checks]].
pros - it is convenient to retrieve information about one or all IPs; for the shallower levels (ports) I can use simple queries.
cons - it is hard to perform any kind of manipulation on checks, as they are deeply nested. The uniqueness of ip-port_number-port_protocol has to be maintained by the application.
- Semi-denormalized structure.
The collection consists of documents, and each document contains: IP_address, port_number, protocol_number, and the checks for that port.
pros - it is easy to work with ports and checks. Working with checks is more comfortable than in the deeply nested structure (the first model).
cons - if I want to update an IP and its related data, I have to delete all ip-port_number documents and insert new ones.
- Separate collections for ips-ports and checks.
ips and ports are in one collection, and checks are in another. The ips collection has the same structure as the deeply nested structure (the first model), except there is no check list for each port. Each check is its own document inside checks_collection and references the appropriate port in the ips collection by duplicating data: ip_address, port_number, port_protocol.
pros - the most convenient way to perform any kind of operation on checks.
cons - the least convenient way to perform any kind of operation on anything except checks.
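For comparison, here is a rough sketch of what the three document shapes could look like, using plain Python dicts instead of a real database driver. The helper names `to_semi_denormalized` and `to_separate_collections` and all field values are hypothetical; the sketch also assumes the ip-level fields are simply copied into every ip-port document in the semi-denormalized layout.

```python
from copy import deepcopy

# Deeply nested: one document per ip (values invented for illustration).
nested_ip = {
    "ip_address": "192.0.2.10",
    "asnName": "EXAMPLE-ASN",
    "orgName": "Example Org",
    "ports": [
        {"port_number": 22, "port_protocol": "tcp", "state": "open",
         "checks": [{"name": "weak-ciphers", "result": "none found"},
                    {"name": "root-login", "result": "disabled"}]},
        {"port_number": 53, "port_protocol": "udp", "state": "open", "checks": []},
    ],
}


def to_semi_denormalized(ip_doc):
    """Semi-denormalized: one document per ip-port pair, checks embedded."""
    ip_fields = {k: v for k, v in ip_doc.items() if k != "ports"}
    return [{**ip_fields, **deepcopy(port)} for port in ip_doc["ports"]]


def to_separate_collections(ip_doc):
    """Separate collections: ip document without checks, plus check documents."""
    ip_only = deepcopy(ip_doc)
    check_docs = []
    for port in ip_only["ports"]:
        # Remove the embedded checks and turn each into a standalone document
        # that points back to its port by duplicating the identifying fields.
        for check in port.pop("checks"):
            check_docs.append({
                "ip_address": ip_only["ip_address"],
                "port_number": port["port_number"],
                "port_protocol": port["port_protocol"],
                **check,
            })
    return ip_only, check_docs


port_docs = to_semi_denormalized(nested_ip)
ip_only, check_docs = to_separate_collections(nested_ip)
```

The trade-off shows up directly in the shapes: in the nested form a check is two levels deep, in the semi-denormalized form the ip fields are repeated per port, and in the split form every check carries the ip_address, port_number, port_protocol triple as its reference.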
One of the most important features I want to add to my app is sorting ips and ports by the result of checks, i.e. whether they have been checked or not.
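As a sketch of that sorting requirement, the snippet below orders ports and ips so that checked ones come first. It works on in-memory lists in the "separate collections" shape and assumes that "checked" means "has at least one check document"; the helper `port_key` and all data values are made up for the example.

```python
# Hypothetical data; values are invented for illustration.
ports = [
    {"ip_address": "192.0.2.10", "port_number": 22, "port_protocol": "tcp"},
    {"ip_address": "192.0.2.10", "port_number": 53, "port_protocol": "udp"},
    {"ip_address": "192.0.2.20", "port_number": 80, "port_protocol": "tcp"},
]
checks = [
    {"ip_address": "192.0.2.20", "port_number": 80, "port_protocol": "tcp",
     "name": "http-headers", "result": "ok"},
]


def port_key(doc):
    """Identify a port by the ip-port_number-port_protocol triple."""
    return (doc["ip_address"], doc["port_number"], doc["port_protocol"])


# Assumption: a port or ip counts as "checked" if any check refers to it.
checked_ports = {port_key(c) for c in checks}
checked_ips = {c["ip_address"] for c in checks}

# False sorts before True, so "not in checked" puts checked entries first.
ports_sorted = sorted(
    ports, key=lambda p: (port_key(p) not in checked_ports, port_key(p))
)
ips_sorted = sorted(
    {p["ip_address"] for p in ports},
    key=lambda ip: (ip not in checked_ips, ip),
)
```

With a real database the same idea would probably need either a stored "checked" flag on the port and ip documents or a join-like lookup at query time, which is one more thing that depends on the chosen model.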
But first, I want to decide which model structure suits me best. Possibly I’m missing something. Any advice would be much appreciated.