subset | 易学教程

Regression on a subset in R

Question: I want to run the same regression for different countries (i.e. subsets of my data). I figured out how to do it in R, but after doing the same thing with much more ease in Stata, I wonder if there's a better way in R. In Stata you would do something like this:

    foreach country in USA UK France {
        reg y x1 x2 if country == "`country'"
    }

Simple and human-readable, right? In R, I came up with the split and ddply methods; both are more complicated. To use split:

    data.subset <- split(data, data$country)

optimal way to find sum(S) of all contiguous sub-array's max difference

Question: You are given an array with n elements: d[0], d[1], ..., d[n-1]. Calculate the sum S of the max difference of all contiguous sub-arrays. Formally:

    S = sum{max{d[l, ..., r]} - min{d[l, ..., r]}}, ∀ 0 <= l <= r < n

Input:

    4
    1 3 2 4

Output:

    12

Explanation:

    l = 0; r = 0; array: [1]       sum = max([1]) - min([1]) = 0
    l = 0; r = 1; array: [1,3]     sum = max([1,3]) - min([1,3]) = 3 - 1 = 2
    l = 0; r = 2; array: [1,3,2]   sum = max([1,3,2]) - min([1,3,2]) = 3 - 1 = 2
    l = 0; r = 3; array: [1,3,2,4] sum = max([1,3,2,4])
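The definition of S above can be checked with a short Python sketch. It is not taken from the original thread: the function names are hypothetical, and the monotonic-stack approach is one well-known way to reach O(n), assumed here rather than confirmed as the answer the asker received. The idea is that S equals the sum of all sub-array maxima minus the sum of all sub-array minima, and each of those sums can be computed by counting how many sub-arrays each element is the extreme of.

```python
def sum_of_subarray_minima(d):
    """Sum of min(d[l..r]) over all 0 <= l <= r < n, in O(n)."""
    n = len(d)
    total = 0
    stack = []  # indices whose values are non-decreasing from bottom to top
    for i in range(n + 1):
        # i == n acts as a sentinel that flushes the remaining stack.
        while stack and (i == n or d[stack[-1]] > d[i]):
            j = stack.pop()
            left = stack[-1] if stack else -1
            # d[j] is the minimum of every sub-array that starts in
            # (left, j] and ends in [j, i).
            total += d[j] * (j - left) * (i - j)
        if i < n:
            stack.append(i)
    return total


def sum_of_subarray_ranges(d):
    """S = sum of (max - min) over all contiguous sub-arrays."""
    # The maxima sum is obtained by negating the values and reusing
    # the minima routine.
    sum_max = -sum_of_subarray_minima([-x for x in d])
    sum_min = sum_of_subarray_minima(d)
    return sum_max - sum_min


def brute_force_ranges(d):
    """Direct evaluation of the definition, useful as a cross-check."""
    n = len(d)
    return sum(
        max(d[l:r + 1]) - min(d[l:r + 1])
        for l in range(n)
        for r in range(l, n)
    )


print(sum_of_subarray_ranges([1, 3, 2, 4]))  # 12, matching the sample output
```

The brute-force version follows the formula literally and is only practical for small n; it is included so the faster routine can be verified against it.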

Subsetting a data.table using another data.table

Question: I have the data.tables dt and dt1.

    dt <- data.table(id=c(rep(2, 3), rep(4, 2)), year=c(2005:2007, 2005:2006), event=c(1,0,0,0,1))
    dt1 <- data.table(id=rep(2, 5), year=c(2005:2009), performance=(1000:1004))

    dt
       id year event
    1:  2 2005     1
    2:  2 2006     0
    3:  2 2007     0
    4:  4 2005     0
    5:  4 2006     1

    dt1
       id year performance
    1:  2 2005        1000
    2:  2 2006        1001
    3:  2 2007        1002
    4:  2 2008        1003
    5:  2 2009        1004

I would like to subset the former using the combination of its first and second column that also appear in dt1. As a

Removing rows based on column in another dataframe [duplicate]

Question: This question already has answers here: How to subset a data frame based on another data frame in base R (2 answers). Closed 3 years ago.

Is there a way to remove rows from a dataframe, based on the column of another dataframe? For example, Dataframe 1:

    Gene CHROM    POS REF ALT N_INFORMATIVE     Test       Beta       SE
    AAA      1  15211   T   G          1481  1:15211 -0.0599805 0.112445
    LLL      1 762061   T   A          1481 1:762061  0.2144100 0.427085
    CCC      1 762109   C   T          1481 1:762109  0.2847510 0.204255
    DDD      1 762273   G   A          1481 1:762273  0.0443946 0

R: Efficiently subsetting dataframe based on time of day

Question: I have a large (150,000x7) dataframe that I intend to use for back-testing and real-time analysis of a financial market. The data represents the condition of an investment vehicle at 5 minute intervals (although holes do exist). It looks like this (but much longer):

           pTime     Time  Price       M1       M2       M3        M4
    1 1212108300 20:45:00 1.5518 12.21849 -0.37125  4.50549 -31.00559
    2 1212108900 20:55:00 1.5516 11.75350 -0.81792 -1.53846 -32.12291
    3 1212109200 21:00:00 1.5512 10.75070 -1.47438 -8.24176 -34.35754
    4

Check if string is subset of a bunch of characters? (RegEx)?

I have a little problem: I have 8 characters, for example "a b c d a e f g", and a list of words, for example: mom, dad, bad, fag, abac. How can I check whether I can compose these words with the letters I have? In my example, I can compose bad, abac and fag, but I cannot compose dad (I don't have two Ds) or mom (I have neither M nor O). I'm pretty sure it can be done using a RegEx, but even a solution using some functions in Perl would be helpful. Thanks in advance guys! :)

This is done most simply by forming a regular expression from the word that is to be tested. This sorts the list of available
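As an alternative to the regular-expression approach the answer begins to describe, the same check can be sketched as a comparison of letter counts. This is an illustration in Python rather than Perl, it is not the answer's method, and the function name `can_compose` is hypothetical: a word can be composed when, for every letter, the word needs no more copies than the pool of available characters provides.

```python
from collections import Counter


def can_compose(word, letters):
    """Return True if `word` can be built from the multiset `letters`."""
    available = Counter(letters)
    needed = Counter(word)
    # A Counter returns 0 for letters that are absent, so letters
    # missing from the pool fail the comparison naturally.
    return all(available[ch] >= count for ch, count in needed.items())


letters = "abcdaefg"  # the eight characters from the question, spaces removed
words = ["mom", "dad", "bad", "abac"]
print([w for w in words if can_compose(w, letters)])  # ['bad', 'abac']
```

Counting avoids having to sort the letters or build a pattern per word, at the cost of not being a single regular expression as the asker originally hoped.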

Subsetting winter (Dez, Jan, Feb) from daily time series (zoo)

I have a daily zoo (xts) with a few decades of data in the following format:

    head(almorol)
    1973-10-02 1973-10-03 1973-10-04 1973-10-05 1973-10-06 1973-10-07
         183.9      208.2      153.7       84.8       52.5       35.5

and I would like to plot just winter data (the full months of December, January and February). I found the subsetting for xts, so I thought I could extract all the Decembers using:

    x <- apply.yearly(almorol, FUN=last(almorol, "1 month"))

and then do something similar for Jan and Feb, but I get the following error:

    Error in get(as.character(FUN), mode = "function", envir = envir) : object 'FUN' of mode

Copying a subset of an array into another array / array slicing in C

In C, is there any built-in array slicing mechanism? Like in Matlab for example, A(1:4) would produce 1 1 1 1. How can I achieve this in C? I tried looking, but the closest I could find is this: http://cboard.cprogramming.com/c-programming/95772-how-do-array-subsets.html

    subsetArray = &bigArray[someIndex]

But this does not exactly return the sliced array; instead it returns a pointer to the first element of the sliced array... Many thanks

Doing that in standard C is not possible. You have to do it yourself. If you have a string, you can use the string.h library, which takes care of that, but for integers there's no

R selecting all rows from a data frame that don't appear in another

I'm trying to solve a tricky R problem that I haven't been able to solve via Googling keywords. Specifically, I'm trying to take a subset of one data frame whose values don't appear in another. Here is an example:

    > test
          number    fruit     ID1  ID2
    item1 "number1" "apples"  "22" "33"
    item2 "number2" "oranges" "13" "33"
    item3 "number3" "peaches" "44" "25"
    item4 "number4" "apples"  "12" "13"
    > test2
          number    fruit     ID1   ID2
    item1 "number1" "papayas" "22"  "33"
    item2 "number2" "oranges" "13"  "33"
    item3 "number3" "peaches" "441" "25"
    item4 "number4" "apples"  "123" "13"
    item5 "number3" "peaches" "44"  "25"
    item6