问题
In KDB I have a table with 4 columns, of which one is a returns column. I would like to create a 5th column that calculates the variance (var) of the past x elements of a specified column. I have managed this with two columns creating the returns of a price column, but am stuck when I need to reference more than one previous element.
Example:
t:([] td:2001.01.01 2001.01.02 2001.01.03 2001.01.04 2001.01.05 2001.01.06; px:121 125 127 126 129 130)
t:update retLogPcnt:100*log px%prev px from t
/t:update retClassic:((px-prev px)%prev px) from t
t:update mvAvgRet:2 mavg retLogPcnt from t
t
Output of last line:
td px retLogPcnt mvAvgRet
-----------------------------------
2001.01.01 121
2001.01.02 125 3.252319 3.252319
2001.01.03 127 1.587335 2.419827
2001.01.04 126 -0.790518 0.3984085
2001.01.05 129 2.35305 0.7812659
2001.01.06 130 0.7722046 1.562627
Desired output if calculating, as an illustration only, the var of the three last retLogPcnt numbers:
td px retLogPcnt mvAvgRet varRetns
---------------------------------------------
2001.01.01 121
2001.01.02 125 3.252319 3.252319
2001.01.03 127 1.587335 2.419827
2001.01.04 126 -0.790518 0.3984085 2.752321
2001.01.05 129 2.35305 0.7812659 1.791392
2001.01.06 130 0.7722046 1.562627 1.647022
So the first var number has been calculated by:
q)var 3.252319 1.587335 -0.790518
How can I add this extra column? I made some attempts similar to my moving average example above (t:update mvAvgRet:2 mavg retLogPcnt from t
) however I was unsuccessful.
Apologies if this has been answered previously; I did search the forums but could not find anything that helped me. This may well be due to getting to grips with kdb terminology.
回答1:
You may want to check out this page on the wiki that discusses sliding windows. Where it describes the sliding window function swin
with this example:
q)swin:{[f;w;s] f each { 1_x,y }\[w#0;s]}
q)swin[avg; 3; til 10]
0 0.33333333 1 2 3 4 5 6 7 8
This can be modified for your purpose by using a list of floats to start with:
swin:{[f;w;s] f each { 1_x,y }\[w#0f;s]}
Such that:
q)update varRetns:swin[var;3;retLogPcnt] from t
td px retLogPcnt mvAvgRet varRetns
-------------------------------------------
2001.01.01 121 0
2001.01.02 125 3.25232 3.25232 2.64439
2001.01.03 127 1.58733 2.41983 0.693043
2001.01.04 126 -0.790518 0.398408 2.75232
2001.01.05 129 2.35305 0.781266 1.79139
2001.01.06 130 0.772205 1.56263 1.64702
Which is close to your example output except for the initial 3 values not being null. If you wish to exclude these values a further change is required to the var
function, which I have defined as newVar
:
newVar:{$[any null x;0Nf;var x]}
q)update varRetns:swin[newVar;3;retLogPcnt] from t
td px retLogPcnt mvAvgRet varRetns
-------------------------------------------
2001.01.01 121
2001.01.02 125 3.25232 3.25232
2001.01.03 127 1.58733 2.41983
2001.01.04 126 -0.790518 0.398408 2.75232
2001.01.05 129 2.35305 0.781266 1.79139
2001.01.06 130 0.772205 1.56263 1.64702
来源:https://stackoverflow.com/questions/48122537/apply-var-to-the-last-n-numbers-of-a-column-in-a-table