subset

Oracle/SQL - Select specified range of sequential records

依然范特西╮ posted on 2019-12-11 15:59:39
Question: I'm trying to select a subset of records, 5000 through 10000, from a join. I've gotten queries like this to work in the past, but they were slightly less complex. Here is the query I'm trying to use. If I remove the rownum/rnum references (and therefore the outer select), I receive all my records as expected, so I know that logic is good. SELECT * FROM ( SELECT unique cl.riid_, rownum as rnum FROM <table 1> cl, <table 3> mil WHERE cl.opt = 0 AND (cl.st_ != 'QT' OR cl.st_ IS NULL) AND cl.hh =

R: make 2 subset vectors so that values are different index-wise, and also different across each vector

心已入冬 posted on 2019-12-11 15:21:19
Question: Following up on this question, I want to do something similar, but this time I have one more requirement. I want to make 2 vectors by subsetting from the same data. I need replace to be set to FALSE because all values must be different within a, and all values must be different within b. Apart from that, values cannot be the same in a and b at the same index position. Note that the sampling vector v is always fixed, as is the sample length l. Doing the following, I only fulfil one criterion
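The two constraints described above (no repeats within each vector, and no equal values at the same position) can be sketched with simple rejection sampling. This is only an illustrative sketch in Python rather than R, not the answer from the original thread; the function name `sample_two_by_position` and the `max_tries` parameter are hypothetical, and it assumes the values in `v` are distinct.

```python
import random

def sample_two_by_position(v, l, rng=None, max_tries=10_000):
    """Return (a, b): two samples of length l drawn without replacement
    from v such that a[i] != b[i] at every index i.

    Hypothetical helper for illustration; assumes all values in v are distinct.
    """
    rng = rng or random.Random()
    if len(set(v)) != len(v):
        raise ValueError("this sketch assumes the values in v are distinct")
    if l > len(v):
        raise ValueError("cannot sample more values than v contains")
    if l >= 1 and len(v) < 2:
        raise ValueError("need at least two values so positions can differ")

    a = rng.sample(v, l)              # no repeats within a
    for _ in range(max_tries):
        b = rng.sample(v, l)          # no repeats within b
        # Accept b only if it differs from a at every position.
        if all(x != y for x, y in zip(a, b)):
            return a, b
    raise RuntimeError("no valid pair found within max_tries")

a, b = sample_two_by_position(list(range(1, 11)), 5)
print(a, b)
```

Rejection sampling keeps the code short; the retry loop simply redraws `b` whenever a position clashes with `a`.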

subsetting Panel Data conditional on consecutive strings of length

懵懂的女人 posted on 2019-12-11 14:33:13
Question: I'm stuck trying to subset some panel data, i.e. ids within group, using dplyr. I want to extract all ids, within each group grp, that have a NUM series with a minimum smaller than 2 and a maximum greater than 2. I've constructed a minimal working example below that should illustrate the issue. I have been working with filter(), row_number() == c(1,n()), and tried to separate it out and merge it back together, i.e. with different types of _join, but I am stuck and I am now turning to the SO

Group-wise subsetting where feasible

 ̄綄美尐妖づ posted on 2019-12-11 14:30:59
Question: I would like to subset rows of my data

library(data.table); set.seed(333); n <- 100
dat <- data.table(id=1:n, group=rep(1:2,each=n/2), x=runif(n,100,120), y=runif(n,200,220), z=runif(n,300,320))
> head(dat)
   id group        x        y        z
1:  1     1 109.3400 208.6732 308.7595
2:  2     1 101.6920 201.0989 310.1080
3:  3     1 119.4697 217.8550 313.9384
4:  4     1 111.4261 205.2945 317.3651
5:  5     1 100.4024 212.2826 305.1375
6:  6     1 114.4711 203.6988 319.4913

in several stages within each group. I need to automate this and it

How can I skip groups while subsetting with key by in data.table?

帅比萌擦擦* posted on 2019-12-11 14:06:17
Question: I have this DT:

dt=data.table(ID=c(rep(letters[1:2],each=4),'b'),value=seq(1,9))
   ID value
1:  a     1
2:  a     2
3:  a     3
4:  a     4
5:  b     5
6:  b     6
7:  b     7
8:  b     8
9:  b     9

I need to eliminate groups while subsetting, but only when the data fulfils some condition. Something like this does not work:

dt[,{if (.N==4) .SD else NULL v1},by="ID"]

So I need to remove the groups that do not meet the condition. In this example I would like to skip the groups whose length is different from 4, so that I get:

   ID value
1

R finding the first value in a data frame that falls within a given threshold

你说的曾经没有我的故事 posted on 2019-12-11 13:48:50
Question: I am a fairly new user and I need your help with a task that I am stuck on. If my question has been asked/answered before, I would be grateful if you could kindly guide me to the relevant page. I have the following data set (lbnp_br), which is optical density (OD) measured over time (in seconds):

time      OD
1891      -244.6
1891.5    -244.4
1892      -242
1892.5    -242
1893      -241.1
1893.5    -242.4
1894      -245.2
1894.5    -249.6
**1895    -253.9**
1895.5    -254.5
1896      -251.9
1896.5    -246.7
1897      -242.4
1897.5    -234.6
1898      -225.5

Subsetting columns works on data.frame but not on data.table

一笑奈何 posted on 2019-12-11 13:16:45
Question: I can select a few columns from a data.frame:

> z[c("events","users")]
    events  users
1 26246016 201816
2   942767 158793
3 29211295 137205
4 30797086 124314

but not from a data.table:

> best[c("events","users")]
Starting binary search ...Error in `[.data.table`(best, c("events", "users")) :
  typeof x.pixel_id (integer) != typeof i.V1 (character)
Calls: [ -> [.data.table

What do I do? Is there a better way than to turn the data.table back into a data.frame?

Answer 1: Column subsetting should be done

Creating a representative sample from a large CSV

≯℡__Kan透↙ posted on 2019-12-11 13:01:40
Question: I have the following dataset:

head -2 trip_data_1.csv
medallion,hack_license,vendor_id,rate_code,store_and_fwd_flag,pickup_datetime,dropoff_datetime,passenger_count,trip_time_in_secs,trip_distance,pickup_longitude,pickup_latitude,dropoff_longitude,dropoff_latitude
89D227B655E5C82AECF13C3F540D4CF4,BA96DE419E711691B9445D6A6307C170,CMT,1,N,2013-01-01 15:11:48,2013-01-01 15:18:10,4,382,1.00,-73.978165,40.757977,-73.989838,40.751171

A simple count of records by date gives me the following output:

R Selecting column in a data frame by column in another data frame

一曲冷凌霜 posted on 2019-12-11 12:30:05
Question: I am facing a problem when trying to subset my data; maybe you could help me. I need to subset the first data frame by a column, keeping the rows where that column's value equals the value of a column in the second data frame. The following are the data frames I'm using:

> head(places)
  Zona   Poble     lat       lon     alt
1    1  Zorita 40.7353 -0.165748 691.867
2    1 Morella 40.6287 -0.113284 955.719
3    1 Forcall 40.6621 -0.209759 753.882
4    2 Benasal 40.3943 -0.126111 848.171
5    2    Cati 40.4532  0.060409 667.610
6    2

What's the most efficient algorithm for generating all k-subsets of an n-set?

别说谁变了你拦得住时间么 posted on 2019-12-11 12:14:46
Question: We are given a set of n elements and we'd like to generate all k-subsets of this set. For example, if S={1,2,3} and k=2, then the answer would be {1,2}, {1,3}, {2,3} (order not important). There are {n choose k} k-subsets of an n-set (by definition :-), which is O(n^k) (although this is not tight). Obviously any algorithm for the problem will have to run in time Omega({n choose k}). What is the currently fastest known algorithm for this problem? Can the lower bound of {n choose k} actually
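As a concrete reference point for the question above, here is a minimal Python sketch that enumerates all k-subsets in lexicographic order by advancing an array of indices. It is one common approach, not a claim about the fastest known algorithm; the function name `k_subsets` is hypothetical. Each subset costs O(k) to emit, so the total work is O(k * {n choose k}).

```python
from itertools import combinations
from math import comb

def k_subsets(items, k):
    """Yield every k-subset of items as a tuple, in lexicographic index order."""
    items = list(items)
    n = len(items)
    if k < 0 or k > n:
        return
    idx = list(range(k))                      # first subset: indices 0..k-1
    while True:
        yield tuple(items[i] for i in idx)
        # Find the rightmost index that has not reached its maximum value.
        pos = k - 1
        while pos >= 0 and idx[pos] == pos + n - k:
            pos -= 1
        if pos < 0:                           # every index is at its maximum: done
            return
        idx[pos] += 1
        # Reset all later indices to the smallest values that keep them increasing.
        for j in range(pos + 1, k):
            idx[j] = idx[j - 1] + 1

print(list(k_subsets([1, 2, 3], 2)))          # [(1, 2), (1, 3), (2, 3)]
```

The standard library's `itertools.combinations` produces the same sequence and is the practical choice in Python; the explicit version only makes the index-advancing step visible.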