How to store 7.3 billion rows of market data (optimized to be read)?

故里飘歌 2020-12-12 09:53

I have a dataset of 1-minute data for 1000 stocks since 1998, which totals around (2012-1998)*(365*24*60)*1000 = 7.3 billion rows.

Most (99.9%) of the time

13 answers
  •  庸人自扰
    2020-12-12 10:24

    First, there aren't 365 trading days in a year. After holidays and 52 weekends (104 days), say 250 trading days, multiplied by the hours the market is actually open each day, as someone else said. So your estimate of 7.3 billion rows is vastly overcalculated: it's only about 1.7 million rows per symbol for 14 years. Also, using the symbol as the primary key is not a good idea, since symbols change. Use a numeric k_equity_id alongside a character symbol column, since symbols can look like A or GAC-DB-B.TO. Then the price data tables are keyed as follows. For the by-the-minute table:

    k_equity_id k_date k_minute

    and for the EOD table (which will be viewed 1000x more often than the other data):

    k_equity_id k_date
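The row-count correction in this answer can be checked with a few lines of arithmetic. This is only a rough sketch: the 250 trading days come from the answer, but the session length is an assumption (8 hours roughly reproduces the 1.7 million figure, and a 6.5-hour session gives fewer rows). The function names are made up for illustration.

```python
YEARS = 2012 - 1998      # 14 years, as in the question
SYMBOLS = 1000           # number of stocks, as in the question


def naive_rows(years, symbols):
    """The question's estimate: one row for every minute of every calendar day."""
    return years * 365 * 24 * 60 * symbols


def trading_rows_per_symbol(years, trading_days=250, session_hours=8.0):
    """Rows per symbol if only minutes of an open market are stored.

    trading_days=250 follows the answer; session_hours is an assumption.
    """
    return int(years * trading_days * session_hours * 60)


naive_total = naive_rows(YEARS, SYMBOLS)
per_symbol = trading_rows_per_symbol(YEARS)
revised_total = per_symbol * SYMBOLS

print(f"naive estimate:       {naive_total:,} rows")
print(f"per symbol (8h days): {per_symbol:,} rows")
print(f"revised total:        {revised_total:,} rows")
```

Under these assumptions the total drops from billions of calendar minutes to a much smaller number of trading minutes, which is the point the answer is making.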

    Second, don't store your by-the-minute OHLC data in the same DB table as the EOD (end of day) data, since anyone wanting to look at a PnF (point and figure) chart or a line chart over a one-year period has zero interest in the by-the-minute information.
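A minimal sketch of the layout this answer describes, using Python's built-in sqlite3 module only so that it runs anywhere; it is not a recommendation of SQLite for this data volume. The key columns (k_equity_id, k_date, k_minute) and the symbol column come from the answer. The table names, the price column names, and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Table names and *_px columns are hypothetical; the k_* keys follow the answer.
SCHEMA = """
CREATE TABLE equity (
    k_equity_id INTEGER PRIMARY KEY,   -- numeric surrogate key
    symbol      TEXT NOT NULL          -- e.g. 'A' or 'GAC-DB-B.TO'; may change
);
CREATE TABLE price_minute (
    k_equity_id INTEGER NOT NULL REFERENCES equity(k_equity_id),
    k_date      TEXT    NOT NULL,      -- ISO date, e.g. '2012-01-03'
    k_minute    INTEGER NOT NULL,      -- minute of day, e.g. 570 for 09:30
    open_px REAL, high_px REAL, low_px REAL, close_px REAL,
    PRIMARY KEY (k_equity_id, k_date, k_minute)
);
CREATE TABLE price_eod (
    k_equity_id INTEGER NOT NULL REFERENCES equity(k_equity_id),
    k_date      TEXT    NOT NULL,
    open_px REAL, high_px REAL, low_px REAL, close_px REAL,
    PRIMARY KEY (k_equity_id, k_date)
);
"""


def build_db():
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def eod_closes(conn, symbol):
    """Daily closes for one symbol, read from the small EOD table only."""
    return conn.execute(
        "SELECT p.k_date, p.close_px FROM price_eod AS p "
        "JOIN equity AS e ON e.k_equity_id = p.k_equity_id "
        "WHERE e.symbol = ? ORDER BY p.k_date",
        (symbol,),
    ).fetchall()


conn = build_db()
conn.execute("INSERT INTO equity VALUES (1, 'GAC-DB-B.TO')")
# Made-up sample prices, for illustration only.
conn.executemany(
    "INSERT INTO price_eod VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "2012-01-03", 10.0, 10.5, 9.8, 10.2),
     (1, "2012-01-04", 10.2, 10.9, 10.1, 10.7)],
)
conn.execute(
    "INSERT INTO price_minute VALUES (1, '2012-01-03', 570, 10.0, 10.1, 9.9, 10.05)"
)
print(eod_closes(conn, "GAC-DB-B.TO"))
```

Because the price tables reference k_equity_id rather than the symbol, renaming a symbol is a single-row update in the equity table, and a yearly chart query never touches the minute table.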
