kdb | 易学教程

qPython - Type conversion of kdb response data

Question: When I run a q query using qPython, I am able to return the data in a pandas data frame. What I am struggling with are the types of the "string" columns, i.e. columns that are presented as simple or mixed (character) lists in q. Their dtype is object and the values are represented in the form b'ab34knadke'. What I would like to have, however, is just the "ab34knadke" part as a string. I have looked at the docs for qPython but I am struggling to fully get the pandas and reader components. Any
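
The `b'...'` form in the question is how Python prints `bytes` objects, so getting plain strings comes down to decoding them. Below is a minimal standard-library sketch, assuming the columns are available as lists of `bytes`; the helper name `decode_bytes_columns` and the sample data are hypothetical and not part of qPython. In pandas itself, the analogous per-column call is `Series.str.decode("utf-8")`.

```python
def decode_bytes_columns(columns, encoding="utf-8"):
    """Return a copy of a column mapping with bytes values decoded to str.

    `columns` is a hypothetical stand-in for a query result: a dict mapping
    column names to lists of values. Non-bytes values are left unchanged.
    """
    return {
        name: [v.decode(encoding) if isinstance(v, bytes) else v for v in values]
        for name, values in columns.items()
    }


# Hypothetical sample data shaped like the values described in the question.
raw = {"id": [b"ab34knadke", b"zz91"], "price": [1.5, 2.0]}
decoded = decode_bytes_columns(raw)
print(decoded["id"])  # ['ab34knadke', 'zz91']
```

The encoding is an assumption; if the kdb+ process stores text in another encoding, pass that name instead of `"utf-8"`.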

KDB:selecting data “around” time of certain events

Question: Consider a huge table of market data T. I am particularly interested in rows where Status=`SSS. However, in addition to the rows given by (select from T where Status=`SSS), I would also like to select the 10 records that come immediately before and after each of these rows. (Note that in some cases, these intervals may overlap.) What is an efficient way to do this? Note that I tried something like the query below, and it nearly crashed my port and hogged all the memory. select from update diff:min
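
The selection logic described in the question (matching rows plus a window of neighbours, with overlapping windows merged) can be sketched in plain Python. This only illustrates the index arithmetic; it is not a q solution and not tuned for a huge table. The function name `rows_around` and the sample statuses are hypothetical.

```python
def rows_around(statuses, target, n=10):
    """Return sorted row indices within n positions of any row equal to target.

    Windows are clipped at the table boundaries, and overlapping windows are
    merged because the indices are collected in a set.
    """
    size = len(statuses)
    keep = set()
    for i, status in enumerate(statuses):
        if status == target:
            keep.update(range(max(0, i - n), min(size, i + n + 1)))
    return sorted(keep)


# Hypothetical status column with matches at positions 2 and 6.
statuses = ["A", "A", "SSS", "A", "A", "A", "SSS", "A", "A", "A"]
print(rows_around(statuses, "SSS", n=1))  # [1, 2, 3, 5, 6, 7]
```

With `n=2` the two windows overlap at position 4, and that row is returned only once.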

KDB query performance improvement

Question: I have a simple table containing prices that I'm using for stock algo back testing.

    price_hist:([pxkey:`$()]price:`float$())
    update `g#pxkey from `price_hist

pxkey is a concatenated string in the format 'MSFT_5M_201710060945', so stock=MSFT, price bar interval=5 mins and datetime=201710060945. I used the concatenated string instead of individual columns because it's simple and I'm a KDB novice and I wanted to get something running quickly. I have about 5 million rows in there and the

How to send a data.frame from R to Q/KDB?

Question: I have a large data.frame (15 columns and 100,000 rows) in an existing R session that I want to send to a Q/KDB instance. From KDB's cookbook, the possible solutions are:

RServer for Q: use KDB to create a new R instance which shares memory space. This doesn't work because my data is in an existing instance of R.

RServe: run an R server and use TCP/IP to communicate with the Q/KDB client. This does not work because, as per RServe's documentation, "every connection has a separate workspace and

kdb/q: how to reshape a list into nRows, where nRows is a variable

Question: If I want to split a list into 2 rows, I can use:

    q)2 0N#til 10

However, the following syntax does not work:

    q)n:2
    q)n 0N#til 10

How can I achieve such reshaping?

Answer 1: You need brackets and a semicolon:

    q)2 0N#til 10
    0 1 2 3 4
    5 6 7 8 9
    q)n:2
    q)(n;0N)#til 10
    0 1 2 3 4
    5 6 7 8 9

Answer 2: Here is the general syntax to split a list into matrix form: (list1)#(list2). As you can see, the left and right parts of '#' are both lists. Here is one example:

    q)list1: (4;3) / or simply (4 3)
    q)list2: til 12
    q)list1#list2
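
For readers more familiar with Python, the reshape into a variable number of rows can be sketched as follows. This is an illustrative analogue of `(n;0N)#list`, not q code, and it assumes the list length divides evenly by the row count; the function name `reshape_rows` is hypothetical.

```python
def reshape_rows(items, n_rows):
    """Split items into n_rows equal-length rows, in order.

    Assumes len(items) is a multiple of n_rows; other cases are rejected
    rather than guessed at.
    """
    if n_rows <= 0 or len(items) % n_rows != 0:
        raise ValueError("n_rows must be positive and divide len(items)")
    width = len(items) // n_rows
    return [items[i * width:(i + 1) * width] for i in range(n_rows)]


n = 2
print(reshape_rows(list(range(10)), n))  # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
```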

kdb+: replace null integer with 0

Question: Consider the following table:

    myTable:
    a b
    -------
    1
    2
    3 10
    4 50
    5 30

How do I replace the empty cells of b with a zero? So the result would be:

    a b
    -------
    1 0
    2 0
    3 10
    4 50
    5 30

Right now I'm doing:

    myTable: update b:{$[x~0Ni;0;x]}'b from myTable

But I am wondering whether there is a better/easier solution for doing this.

Answer 1: Use the fill operator (^).

Example table:

    q) tbl:flip`a`b!(2;0N)#10?0N 0N 0N,til 3
    a b --- 0 2 1 1 1 1 1 1

Fill nulls in all columns with 0:

    q)0^tbl
    a b --- 0 2 1
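
The effect of filling nulls with a default can be sketched in plain Python, using `None` as a stand-in for a q null. This is an illustrative analogue of `0^`, not q code; the function name `fill_nulls` is hypothetical, and the table mirrors the one in the question.

```python
def fill_nulls(values, default=0):
    """Replace None entries (standing in for q nulls) with a default value."""
    return [default if v is None else v for v in values]


# The question's table, with None marking the empty cells of column b.
my_table = {"a": [1, 2, 3, 4, 5], "b": [None, None, 10, 50, 30]}
filled = {name: fill_nulls(column) for name, column in my_table.items()}
print(filled["b"])  # [0, 0, 10, 50, 30]
```

Applying the fill to every column, as the dictionary comprehension does, corresponds to filling the whole table rather than a single column.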

How to ungroup list columns in data.table?

Question: tidyr provides the unnest function, which helps expand list columns. This is similar to the much (20x) faster ungroup function in kdb. I am looking for a similar (but much faster) function that, assuming a data.table that contains several list columns, each with the same number of elements on each row, would expand the data.table. This is an extension of this post.

    library(data.table)
    library(tidyr)
    t = Sys.time()
    DT = data.table(a=c(1,2,3), b=c('q','w','e'), c=list(rep(t,2),rep(t+1,3),rep(t,0)),

Iterate over current row values in kdb query

Question: Consider the table:

    q)trade
    stock price amt  time
    -----------------------------
    ibm   121.3 1000 09:03:06.000
    bac   5.76  500  09:03:23.000
    usb   8.19  800  09:04:01.000

and the list:

    q)x: 10000 20000

The following query:

    q)select from trade where price < x[first where (x - price) > 100f]
    'length

fails as above. How can I pass the current row value of price in each iteration of the search query? While price[0] in the square brackets above works, that's obviously not what I want. I even tried price[i]

double - triple for loops using index of vectors that vary in length

Question: I spent too much time searching for documentation or an adequate example, to no avail. Could someone kindly enlighten me on how to deal with this problem? Say I have the following table of orders for buying a stock. They will end at the designated time.

    orders:([] seq:10*1+til 5; ID:5#`softbank;start:11:00 10:00 09:00 13:30 18:00;end:13:30 12:30 11:30 14:30 19:00)

For some reason I am hoping to find the maximum number of orders alive (say none are transacted) at a sub-time interval within the
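
One loop-free way to count the maximum number of simultaneously alive orders is a sweep over sorted start and end events. The sketch below is plain Python rather than q, and it covers only the overall maximum, since the rest of the question is cut off. It assumes an order is alive from its start up to, but not including, its end. The function names `to_minutes` and `max_alive` are hypothetical; the times are taken from the `orders` table above.

```python
def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def max_alive(starts, ends):
    """Return the largest number of intervals alive at the same time."""
    events = []
    for start, end in zip(starts, ends):
        events.append((to_minutes(start), 1))   # order becomes alive
        events.append((to_minutes(end), -1))    # order ends
    # At equal times, -1 sorts before +1, so an ending order is closed
    # before a new one opens (the half-open interval assumption).
    events.sort()
    alive = best = 0
    for _, delta in events:
        alive += delta
        best = max(best, alive)
    return best


starts = ["11:00", "10:00", "09:00", "13:30", "18:00"]
ends = ["13:30", "12:30", "11:30", "14:30", "19:00"]
print(max_alive(starts, ends))  # 3
```

For this data the maximum of 3 occurs between 11:00 and 11:30, when the first three orders overlap.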

extract number from string in kdb

Question: I am quite new to kdb+q. I've come across the problem of extracting a number from a string. Any suggestions? Example: "AZXER_1234_MARKET" should output 1234 // Assume that there is only one number in the string

Answer 1: Extract the digits, then cast to the required type.

    q){"I"$x inter .Q.n} "AZXER_1234_MARKET"
    1234i
    q){"I"$x inter .Q.n} "AZXER_123411_MARKET"
    123411i
    q){"I"$x inter .Q.n} "AZXER_1234_56_MARKET"
    123456i
    q){"I"$x inter .Q.n} "AR_34_56_MAT"
    3456i

Answer 2: If you have multiple numbers, here
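
The approach in Answer 1 (keep only the digit characters, then cast) can be mirrored in plain Python. This is an illustrative analogue, not q code; the function name `extract_number` is hypothetical. As in the q examples, several digit groups in one string are concatenated into a single number.

```python
DIGITS = "0123456789"  # plays the role of .Q.n in the q answer


def extract_number(text):
    """Keep only the ASCII digit characters of text and convert them to int.

    Returns None when the text contains no digits.
    """
    digits = "".join(ch for ch in text if ch in DIGITS)
    return int(digits) if digits else None


print(extract_number("AZXER_1234_MARKET"))     # 1234
print(extract_number("AZXER_1234_56_MARKET"))  # 123456
```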