My application is a data distributor.
For each product, we maintain data attributes (price, inventory, setting1, setting2, etc.).
The values of these attributes vary across dates and date ranges.
They can also vary by channel and by vendor.
I need to keep the full history of data entries, so this table is insert-only. No updates are allowed.
My schema looks something like this:
_id - ObjectID
productID - ObjectID (200,000 products)
channelID - ObjectID (5 channels)
vendorID - ObjectID (50 vendors)
attribute - Enum (30 attributes)
startDate - Date
endDate - Date
value - Integer
createdAt - Date
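To make the schema concrete, here is a rough sketch of a single entry as a Python dict. All values are made up for illustration, the `fake_id` helper is hypothetical and only produces 24-character hex strings that stand in for real ObjectIDs, and I am assuming dates are stored as plain datetimes.

```python
from datetime import datetime


def fake_id(n):
    """Hypothetical stand-in for an ObjectID: a 24-character hex string."""
    return format(n, "024x")


# One history entry; every value here is an invented example.
sample_entry = {
    "_id": fake_id(1),
    "productID": fake_id(1001),
    "channelID": fake_id(2),
    "vendorID": fake_id(17),
    "attribute": "PRICE",                      # one of the 30 enum values
    "startDate": datetime(2024, 1, 1),         # first day the value applies
    "endDate": datetime(2024, 1, 31),          # last day the value applies
    "value": 1999,
    "createdAt": datetime(2023, 12, 20, 9, 30),
}
```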
There are going to be millions of entries in this table within a year.
I want help understanding which attributes to index and how to form the composite key…
Should I index every field from productID through startDate as one composite key?
Or should I index only a few (productID through attribute) and leave the others to be scanned?
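Written out, the two options I am weighing look roughly like this. It is only a sketch of the key specifications as (field, direction) pairs. The ascending direction and the field order are my assumptions, not something I have settled on.

```python
ASCENDING = 1  # same numeric value MongoDB uses for an ascending key

# Option A: every field from productID through startDate in one composite key.
index_all = [
    ("productID", ASCENDING),
    ("channelID", ASCENDING),
    ("vendorID", ASCENDING),
    ("attribute", ASCENDING),
    ("startDate", ASCENDING),
]

# Option B: only productID through attribute; date filtering is left to a scan
# of whatever entries the shorter key narrows down to.
index_partial = index_all[:4]
```

With pymongo, either list could be passed to `collection.create_index(...)`.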
Example query requirements:
- Get the latest value of the PRICE attribute for product X, channel X, vendor X on 1st and 2nd January. (This requires two queries, one for each day.)
- Get all changes to the PRICE attribute for product X, channel X, vendor X on 1st January. (One query that fetches every entry that fits the given date.)
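As a sketch, both requirements can share one filter. I am assuming here that an entry "fits" a day when startDate <= day <= endDate, that "latest" means the most recent createdAt, and the IDs and the year 2024 are placeholders. The helper name `entries_for_day` is made up for this example.

```python
from datetime import datetime

DESCENDING = -1  # same numeric value MongoDB uses for a descending sort


def entries_for_day(product_id, channel_id, vendor_id, attribute, day):
    """Build a MongoDB-style filter for entries whose date range covers `day`."""
    return {
        "productID": product_id,
        "channelID": channel_id,
        "vendorID": vendor_id,
        "attribute": attribute,
        "startDate": {"$lte": day},
        "endDate": {"$gte": day},
    }


# Sort specification that puts the newest entry first.
latest_first = [("createdAt", DESCENDING)]

# Placeholder IDs; real ones would be ObjectIDs.
jan1 = entries_for_day("productX", "channelX", "vendorX", "PRICE", datetime(2024, 1, 1))
jan2 = entries_for_day("productX", "channelX", "vendorX", "PRICE", datetime(2024, 1, 2))
```

With pymongo, the first requirement would be something like `collection.find(jan1).sort(latest_first).limit(1)`, repeated with `jan2`. The second would be `collection.find(jan1).sort(latest_first)` without the limit.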