kdb

qPython - Type conversion of kdb response data

不问归期 submitted on 2019-12-25 03:43:17
Question: When I run a q query using qPython, I can return the data as a pandas data frame. What I am struggling with are the types of the "string" columns, i.e. columns that are presented as simple or mixed (character) lists in q. Their dtype is object and the values are represented in the form b'ab34knadke'. What I would like to have, however, is just the "ab34knadke" part as a string. I have looked at the docs for qPython, but I am struggling to fully understand the pandas and reader components. Any
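One possible way to get plain strings, sketched under assumptions: the helper names `decode_bytes` and `decode_columns` are hypothetical, the sample data is invented, and the encoding is assumed to be UTF-8 (it has to match whatever the q process actually sends). The sketch uses plain Python containers rather than qPython or pandas objects, so it only illustrates the decoding step itself.

```python
def decode_bytes(value, encoding="utf-8"):
    """Return a str for bytes input; pass every other value through unchanged."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


def decode_columns(columns, encoding="utf-8"):
    """Decode every bytes cell in a dict of column name -> list of values."""
    return {
        name: [decode_bytes(v, encoding) for v in values]
        for name, values in columns.items()
    }


# Hypothetical sample shaped like a q result with one "string" column.
raw = {"id": [b"ab34knadke", b"zz91"], "size": [100, 250]}
decoded = decode_columns(raw)
```

With a pandas data frame, the same helper could be applied to an object column with `df[col].map(decode_bytes)`, leaving numeric columns untouched.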

KDB: selecting data “around” time of certain events

怎甘沉沦 submitted on 2019-12-25 02:29:39
Question: Consider a huge table of market data T. I am particularly interested in rows where Status=`SSS. However, in addition to the rows given by (select from T where Status=`SSS), I would also like to select the 10 records that come immediately before and immediately after each of these rows. (Note that in some cases these intervals may overlap.) What is an efficient way to do this? I tried something like the query below, and it nearly crashed my port and hogged all the memory.
select from update diff:min
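The selection logic described in the question can be sketched independently of q. The following is a minimal Python illustration, not a kdb solution: the function name `rows_around` and the sample `status` list are hypothetical. It collects the row indices inside each window, merges overlapping windows by de-duplicating, and clips at the table boundaries.

```python
def rows_around(flags, before=10, after=10):
    """Return sorted, de-duplicated row indices lying within the window of any flagged row."""
    n = len(flags)
    keep = set()
    for i, hit in enumerate(flags):
        if hit:
            # Clip the window so it never runs past either end of the table.
            keep.update(range(max(0, i - before), min(n, i + after + 1)))
    return sorted(keep)


# Hypothetical Status column; a window of 1 keeps the example short.
status = ["A", "SSS", "A", "A", "A", "A", "SSS", "A"]
idx = rows_around([s == "SSS" for s in status], before=1, after=1)
```

Working on row indices rather than on the rows themselves keeps the intermediate result small, which is presumably also the direction a q solution would take.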

KDB query performance improvement

大城市里の小女人 submitted on 2019-12-24 07:47:16
Question: I have a simple table containing prices that I'm using for stock algo backtesting.
price_hist:([pxkey:`$()]price:`float$())
update `g#pxkey from `price_hist
pxkey is a concatenated string in the format 'MSFT_5M_201710060945', so stock=MSFT, price bar interval=5 mins and datetime=201710060945. I used the concatenated string instead of individual columns because it's simple, I'm a KDB novice, and I wanted to get something running quickly. I have about 5 million rows in there and the

How to send a data.frame from R to Q/KDB?

假装没事ソ submitted on 2019-12-22 08:33:59
Question: I have a large data.frame (15 columns and 100,000 rows) in an existing R session that I want to send to a Q/KDB instance. From KDB's cookbook, the possible solutions are:
RServer for Q: use KDB to create a new R instance which shares memory space. This doesn't work because my data is in an existing instance of R.
RServe: run an R server and use TCP/IP to communicate with a Q/KDB client. This does not work because, as per RServe's documentation, "every connection has a separate workspace and

kdb/q: how to reshape a list into nRows, where nRows is a variable

早过忘川 submitted on 2019-12-14 03:00:55
Question: To split a list into 2 rows, I can use:
q)2 0N#til 10
However, the following syntax does not work:
q)n:2
q)n 0N#til 10
How can I achieve such reshaping?
Answer 1: You need brackets and a semicolon.
q)2 0N#til 10
0 1 2 3 4
5 6 7 8 9
q)n:2
q)(n;0N)#til 10
0 1 2 3 4
5 6 7 8 9
Answer 2: Here is the general syntax to split a list into matrix form:
(list1)#(list2)
As you can see, both the left and right operands of '#' are lists. So here is one example:
q)list1: (4;3) / or simply (4 3)
q)list2: til 12
q)list1#list2

kdb+: replace null integer with 0

萝らか妹 submitted on 2019-12-12 11:44:18
Question: Consider the following table:
myTable:
a b
-------
1
2
3 10
4 50
5 30
How do I replace the empty cells of b with a zero? So the result would be:
a b
-------
1 0
2 0
3 10
4 50
5 30
Right now I'm doing:
myTable: update b:{$[x~0Ni;0;x]}'b from myTable
But I am wondering whether there is a better/easier solution for doing this.
Answer 1: Use the fill operator (^).
Example table:
q) tbl:flip`a`b!(2;0N)#10?0N 0N 0N,til 3
a b --- 0 2 1 1 1 1 1 1
Fill nulls in all columns with 0:
q)0^tbl
a b --- 0 2 1

How to ungroup list columns in data.table?

无人久伴 submitted on 2019-12-12 09:56:03
Question: tidyr provides the unnest function, which helps expand list columns. This is similar to the much (20x) faster ungroup function in kdb. I am looking for a similar (but much faster) function that, given a data.table containing several list columns, each with the same number of elements on each row, would expand the data.table. This is an extension of this post.
library(data.table)
library(tidyr)
t = Sys.time()
DT = data.table(a=c(1,2,3), b=c('q','w','e'), c=list(rep(t,2),rep(t+1,3),rep(t,0)),

Iterate over current row values in kdb query

纵饮孤独 submitted on 2019-12-12 02:39:13
Question: Consider the table:
q)trade
stock price amt  time
-----------------------------
ibm   121.3 1000 09:03:06.000
bac   5.76  500  09:03:23.000
usb   8.19  800  09:04:01.000
and the list:
q)x: 10000 20000
The following query:
q)select from trade where price < x[first where (x - price) > 100f]
'length
fails as above. How can I pass the current row value of price in each iteration of the search query? While price[0] in the square brackets above works, that's obviously not what I want. I even tried price[i]

double - triple for loops using index of vectors that vary in length

不羁的心 submitted on 2019-12-11 15:49:58
Question: I spent too much time searching for documentation or an adequate example, to no avail. Could someone kindly enlighten me on how to deal with this problem? Say I have the following table of orders for buying a stock. They will end at the designated time.
orders:([] seq:10*1+til 5; ID:5#`softbank;start:11:00 10:00 09:00 13:30 18:00;end:13:30 12:30 11:30 14:30 19:00)
For some reason, I am hoping to find the maximum number of orders alive (say none are transacted) at a sub-time interval within the

extract number from string in kdb

前提是你 submitted on 2019-12-11 08:41:36
Question: I am quite new to kdb+/q. I've come across this problem of extracting a number out of a string. Any suggestions? Example: "AZXER_1234_MARKET" should output 1234
//Assume that there is only one number in the string
Answer 1: Extract the digits, then cast to the required type.
q){"I"$x inter .Q.n} "AZXER_1234_MARKET"
1234i
q){"I"$x inter .Q.n} "AZXER_123411_MARKET"
123411i
q){"I"$x inter .Q.n} "AZXER_1234_56_MARKET"
123456i
q){"I"$x inter .Q.n} "AR_34_56_MAT"
3456i
Answer 2: If you have multiple numbers, here
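To make the behaviour of Answer 1 easier to see, here is a small Python mirror of the same logic; it is an illustration only, and the function names `digits_only` and `find_numbers` are hypothetical. `digits_only` keeps every digit character in order before parsing, which is why two separate numbers in the input run together, as in the third and fourth q examples. `find_numbers` is a different, regex-based technique that keeps separate runs of digits apart; it is not taken from the truncated second answer.

```python
import re
import string


def digits_only(text):
    """Keep only the digit characters of text, in order, and parse them as one int."""
    kept = "".join(ch for ch in text if ch in string.digits)
    # Return None when the string contains no digits at all.
    return int(kept) if kept else None


def find_numbers(text):
    """Return each maximal run of digits in text as a separate int."""
    return [int(run) for run in re.findall(r"\d+", text)]
```

For example, `digits_only("AZXER_1234_56_MARKET")` gives 123456, while `find_numbers` on the same input gives `[1234, 56]`.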