Question
I have spent too much time searching for documentation or an adequate example for this, to no avail. Could someone explain how to deal with this problem?
Say I have the following table of orders for buying a stock. Each order expires at its designated end time.
orders:([] seq:10*1+til 5; ID:5#`softbank;start:11:00 10:00 09:00 13:30 18:00;end:13:30 12:30 11:30 14:30 19:00)
For each order, I want to find the maximum number of orders alive at the same time (assuming none are transacted) over any sub-interval of that order's start-to-end range. This is a very typical problem for testing OOP implementation skills: sort the times, then add or subtract under an if-else condition depending on whether a time matches a start time, all in O(n log n). For the table above I expect:
id 10: 3 (11:00-11:30 3, 11:30-12:00 2 (id 10/30), 12:00-12:30 2 (id 10/20), 12:30-13:30 1)
id 20: 2 (10:00-11:00 2 (id 20/30), 11:00-11:30 3 (id 10/20/30), 11:30-12:30 2 (id 10/20))
id 30: 2 (09:00-10:00 1, 10:00-11:30 2)
id 40: 1
id 50: 1
I can only think of iterating over two nested loops of start/end, with an if condition inside. I also read the documentation, including the part that says loops are possible only with atoms or with vectors of the same length. Surely the language cannot be this limited. Can anyone explain, or share a link that is easy to follow?
Answer 1:
If I understand the question correctly, you're trying to find how many orders are active (i.e. overlap) within each order's time window? If so, you can achieve it like this:
orders:([] seq:10*1+til 5; ID:5#`softbank;start:11:00 10:00 09:00 13:30 18:00;end:13:30 12:30 11:30 14:30 19:00);
/to get the count
q)update active:sum each (start<\:end)&end>\:start from orders
seq ID       start end   active
-------------------------------
10  softbank 11:00 13:30 3
20  softbank 10:00 12:30 3
30  softbank 09:00 11:30 3
40  softbank 13:30 14:30 1
50  softbank 18:00 19:00 1
/to get the seq numbers
q)update active:seq where each (start<\:end)&end>\:start from orders
seq ID       start end   active
---------------------------------
10  softbank 11:00 13:30 10 20 30
20  softbank 10:00 12:30 10 20 30
30  softbank 09:00 11:30 10 20 30
40  softbank 13:30 14:30 ,40
50  softbank 18:00 19:00 ,50
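To see what the expression in the update is doing, here is a minimal sketch on the bare vectors (the name m is only for illustration). Two orders overlap when each one starts before the other ends, and each-left (\:) compares every item on the left with the whole vector on the right, which yields a boolean matrix.

```q
/ the same start and end times as in the orders table
start:11:00 10:00 09:00 13:30 18:00
end:13:30 12:30 11:30 14:30 19:00
/ m[i;j] is 1b when order i starts before order j ends
/ and order i ends after order j starts, i.e. the two overlap
m:(start<\:end)&end>\:start
/ each row sum is the number of orders overlapping that order, itself included
sum each m
```

For this data the row sums are 3, 3, 3, 1 and 1. Note that this counts every order that overlaps a given order at some point; it equals the peak number alive at once only when those orders are all alive at the same moment, as orders 10, 20 and 30 are between 11:00 and 11:30.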
The reason you haven't found much documentation or many examples on loops/indices/iteration is that kdb isn't designed for that approach. To make good use of kdb you have to avoid these concepts and operate on whole vectors instead.
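As an illustration, the sort-then-add/subtract idea from the question can itself be written without an explicit loop. The following is only a sketch under my own assumptions: the names ev, t, d and alive are made up for the example, and it computes the peak across the whole table rather than per order.

```q
/ same data as in the question; only the columns needed here
orders:([] seq:10*1+til 5; start:11:00 10:00 09:00 13:30 18:00; end:13:30 12:30 11:30 14:30 19:00)
n:count orders
/ one event per order start (+1) and one per order end (-1)
ev:([] t:orders[`start],orders`end; d:(n#1),neg n#1)
/ sort by time; on ties the -1 (end) sorts before the +1 (start),
/ so an order ending at 13:30 is not counted alongside one starting at 13:30
ev:`t`d xasc ev
/ running total = number of orders alive after each event
ev:update alive:sums d from ev
/ peak number of orders alive at once across the whole table
max ev`alive
```

For this data the result is 3. The sort dominates the cost, so this is the O(n log n) sweep described in the question, with sums taking the place of the if-else accumulation.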
EDIT - additional approach based on comments.
/intervals
iv:distinct asc raze orders`start`end;
/overlaps
update o:{1_{y,x}prior iv where(iv>=x)&iv<=y}'[start;end] from `orders;
/intersections
q)update o:o#\:{u[`seq]group(u:ungroup[x])`o}orders from orders
seq ID       start end   o
-------------------------------------------------------------------------------------
10  softbank 11:00 13:30 (11:00 11:30;11:30 12:30;12:30 13:30)!(10 20 30;10 20;,10)
20  softbank 10:00 12:30 (10:00 11:00;11:00 11:30;11:30 12:30)!(20 30;10 20 30;10 20)
30  softbank 09:00 11:30 (09:00 10:00;10:00 11:00;11:00 11:30)!(,30;20 30;10 20 30)
40  softbank 13:30 14:30 ,13:30 14:30!,,40
50  softbank 18:00 19:00 ,18:00 19:00!,,50
/if you want to know the counts
q)@[;`o;count'']update o:o#\:{u[`seq]group(u:ungroup[x])`o}orders from orders
seq ID       start end   o
--------------------------------------------------------------------
10  softbank 11:00 13:30 (11:00 11:30;11:30 12:30;12:30 13:30)!3 2 1
20  softbank 10:00 12:30 (10:00 11:00;11:00 11:30;11:30 12:30)!2 3 2
30  softbank 09:00 11:30 (09:00 10:00;10:00 11:00;11:00 11:30)!1 2 3
40  softbank 13:30 14:30 ,13:30 14:30!,1
50  softbank 18:00 19:00 ,18:00 19:00!,1
/if you want to see where the maximum overlap occurred
q)@[;`o;{#[;x]where c=max c:count each x}']update o:o#\:{u[`seq]group(u:ungroup[x])`o}orders from orders
seq ID       start end   o
-----------------------------------------------
10  softbank 11:00 13:30 ,11:00 11:30!,10 20 30
20  softbank 10:00 12:30 ,11:00 11:30!,10 20 30
30  softbank 09:00 11:30 ,11:00 11:30!,10 20 30
40  softbank 13:30 14:30 ,13:30 14:30!,,40
50  softbank 18:00 19:00 ,18:00 19:00!,,50
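The interval-splitting step in the EDIT is fairly dense, so here it is unpacked for a single order. This is just a sketch: iv is written out by hand with the values that distinct asc raze produces for the sample data, and b is a name I made up for the intermediate result.

```q
/ all distinct start/end times in the sample data, sorted
iv:09:00 10:00 11:00 11:30 12:30 13:30 14:30 18:00 19:00
/ breakpoints that fall inside order 10's range, 11:00 to 13:30
b:iv where (iv>=11:00)&iv<=13:30
/ prior passes each item as x and its predecessor as y,
/ so {y,x} builds (previous;current) pairs;
/ 1_ drops the first result, which has no real predecessor
1_{y,x} prior b
```

This gives the three sub-intervals 11:00 11:30, 11:30 12:30 and 12:30 13:30, which are the keys shown for seq 10 in the o column above.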
Source: https://stackoverflow.com/questions/58961648/double-triple-for-loops-using-index-of-vectors-that-vary-in-length