Schema question from a MongoDB newbie

Don_Lair · February 8, 2020, 9:40pm

Hello!

I’m working on a Node hobby app as a way to learn and experiment with the current state of web dev. Background: I used to do a fair amount of web dev back in the LAMP stack days, but have spent the last 10+ years in business functions.

I’ve been through a couple tutorials and have a rudimentary understanding of how MongoDB works. Now it’s time to start building my own app to stretch my legs.

I’m going to build a simple stock market web app and so I’ll be gathering, recording, and displaying a lot of timeseries stock price data, as well as quarterly financial statement data.

I’ve read several threads about this and know many people recommend using an RDBMS for this, but I want to learn MongoDB so I’m using it anyhoo.

I’ve read some helpful posts on the MongoDB blog, including this one, which suggests the following schema:

const Ticker = {
  symbol: String,
  timestamp: Date,
  trading: {
    week52High: Number,
    week52Low: Number,
    percent_shorted: Number,
    shares_outstanding: Number,
    enterprise_value: Number,
    dividend: Number,
    market_cap: Number,
    price: Number
  },
  name: String,
  about: String,
  timeseries: {
    oneDay: Object,
    oneMonth: Object,
    threeMonths: Object,
    oneYear: Object,
    fiveYears: Object
  },
  quote: {
    fundamental: Object,
    price: Object,
    asset_type: String,
    time: String,
    exchange: String
  },
  financials: {
    fcf_over_share: Number,
    fcf: Number,
    eps_margin: Number,
    eps: Number,
    ebitda_margin: Number,
    ebitda: Number,
    revenue_growth_3_yr: Number,
    revenue: Number
  }
};

I’m planning to run with that as a starting point and refactor along the way if needed.

Before I do, I was wondering if anybody with more experience can anticipate issues or suggest improvements on the above before I jump in.

Thanks!
Don

Eric_Reid · February 10, 2020, 5:19pm

That’s a good start, but so much of schema design in MongoDB comes down to how the collections will be queried. Yes, you want to denormalize as much as possible. Yes, you may well make use of the Bucket Pattern for time series data. Until you craft the actual queries, however, you really don’t know whether you’ll have contention/locking issues, I/O throughput issues, etc.
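
For anyone unfamiliar with the Bucket Pattern mentioned above, here is a minimal sketch of the idea, not a definitive implementation. Instead of one document per price tick, ticks are grouped into one document per symbol per day, capped at a fixed number of samples. The helper name buildBucketUpsert, the field names (day, samples, count, low, high), and the cap of 1000 are all assumptions made up for illustration; none of them come from the schema in the original post.

```javascript
// Hypothetical helper (illustrative names): builds the arguments for an
// upsert that appends one price tick to a per-symbol, per-day bucket.
function buildBucketUpsert(symbol, timestamp, price, maxSamples = 1000) {
  // Truncate the timestamp to midnight UTC so all ticks from the same
  // day share one bucket key.
  const day = new Date(Date.UTC(
    timestamp.getUTCFullYear(),
    timestamp.getUTCMonth(),
    timestamp.getUTCDate()
  ));

  return {
    // Match a bucket for this symbol and day that still has room.
    // If none matches, the upsert creates a fresh bucket document.
    filter: { symbol, day, count: { $lt: maxSamples } },
    update: {
      $push: { samples: { t: timestamp, price } }, // append the tick
      $inc: { count: 1 },                          // track bucket size
      $min: { low: price },                        // running daily low
      $max: { high: price }                        // running daily high
    },
    options: { upsert: true }
  };
}

// Usage sketch, assuming `prices` is a collection handle from the
// official Node.js driver:
//   const { filter, update, options } =
//     buildBucketUpsert("AAPL", new Date(), 320.5);
//   await prices.updateOne(filter, update, options);
```

Whether one bucket per day is the right granularity depends on the queries, which is the point of the paragraph above: a chart that reads whole days at a time suits this layout, while a query that needs single ticks across many symbols may not.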

So yeah - start with what you do know, try it, monitor performance, and refactor as needed.

Oh, and MongoDB is good for just about every use case - the days of “RDBMS is better for this” are just about past…

Don_Lair · February 10, 2020, 5:39pm

Thanks Eric, I’ll take a look at the bucket pattern.