Time-series modelling: storage limitations


Currently, we store weather sensor data in two collections:

  1. sensor_data collection, where each document has the fields:
     • sensor_id: ObjectID
     • timestamp: datetime
     • temperature: float
     • wind_speed: float
  2. sensor_features collection, where information about the sensors is stored:
     • sensor_code: string
     • latitude: float
     • longitude: float

I think this data might fit better in a time-series collection. However, because our storage resources are limited, I have the following concerns:

  1. If I store the sensor characteristics (sensor_features) in the time-series metadata field, will I lose storage space to duplication? Or does MongoDB keep an internal reference to this metadata instead of repeating it for each data point?

  2. How hard is it to update this metadata or add new fields to it, compared to a normal collection? We may need to do this about once a month.

  3. Is TTL (automatic removal) the only way to remove old data?
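
For reference, here is a minimal sketch of the time-series layout I am considering, in case it helps frame the questions. The collection name sensor_ts, the metaField name sensor, the granularity, and the sample values are all assumptions for illustration, not something we have deployed. The sketch only builds the options and one merged document in plain Python; the actual server call is left as a comment.

```python
from datetime import datetime, timezone

# Options that would be passed to pymongo as
#   db.create_collection("sensor_ts", timeseries=TIMESERIES_OPTIONS)
# "sensor_ts" and the metaField name "sensor" are hypothetical names.
TIMESERIES_OPTIONS = {
    "timeField": "timestamp",   # existing field from sensor_data
    "metaField": "sensor",      # would hold the sensor_features values
    "granularity": "minutes",   # assumption about our sampling rate
}


def to_timeseries_doc(reading, features):
    """Merge one sensor_data document with its sensor_features document."""
    return {
        "timestamp": reading["timestamp"],
        "sensor": {
            "sensor_id": reading["sensor_id"],
            "sensor_code": features["sensor_code"],
            "latitude": features["latitude"],
            "longitude": features["longitude"],
        },
        "temperature": reading["temperature"],
        "wind_speed": reading["wind_speed"],
    }


# Made-up sample values; sensor_id is an ObjectID in our real data,
# a plain string is used here to keep the sketch dependency-free.
reading = {
    "sensor_id": "placeholder-object-id",
    "timestamp": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    "temperature": 21.5,
    "wind_speed": 3.2,
}
features = {"sensor_code": "WS-001", "latitude": 48.85, "longitude": 2.35}

doc = to_timeseries_doc(reading, features)
```

With this layout, every measurement would carry the same sensor sub-document as the other measurements from that sensor, which is exactly the duplication that question 1 asks about.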