KDB query performance improvement

大城市里の小女人 提交于 2019-12-24 07:47:16

问题


I have a simple table containing prices that I'm using for stock algo back testing.

price_hist:([pxkey:`$()]price:`float$())
update `g#pxkey from `price_hist

pxkey is a concatenated string in the format 'MSFT_5M_201710060945', so stock=MSFT, price bar intervals=5 mins and datetime=201710060945. I used the concatenated string instead of individual columns because it's simple and I'm a KDB novice and I wanted to get something running quickly.

I have about 5 million rows in there and the performance is only marginally faster than MySql using the exact same data. Any ideas on how to improve this (either thru table structure, attributes, query, anything..)? FYI I'm using C# with qSharp library and to query i'm using this format which returns a dictionary:-

price_hist`MSFT_5M_201710060945

回答1:


Creating millions of generated symbols is never a good idea in kdb+. I would recommend using a keyed table instead of a dictionary:

bar5m:([sym:`$();time:`timestamp$()]price:`float$())

Once you populate it, you should be able to query it as follows

bar5m[(`MSFT;2017.10.06D09:45);`price]

To improve the performance, make sure the table is sorted by sym,time and put the p attribute on sym.



来源:https://stackoverflow.com/questions/47447662/kdb-query-performance-improvement

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!