subset

An array is subset of another array

橙三吉。 posted on 2019-11-29 16:17:55
How can I efficiently check whether all the elements of one integer array are contained in another array in Java? For example, [33 11 23] is a subset of [11 23 33 42]. Thanks in advance.
Make a HashSet out of the superset array, then check whether each element of the subset array is contained in the HashSet. This is a very fast operation.
If you're not bound to using arrays, any Java collection has the containsAll method:
    boolean isSubset = bigList.containsAll(smallList);
This will do exactly what you want. Note that containsAll is only efficient when the receiver is a hash-based collection such as a HashSet; on a List, each lookup is a linear scan.
Assume you want to check whether A is a subset of B. Put each

Brackets make a vector different. How exactly is vector expression evaluated?

[亡魂溺海] posted on 2019-11-29 15:47:11
I have a data frame as follows:
    planets  type                diameter  rotation  rings
    Mercury  Terrestrial planet     0.382     58.64  FALSE
    Venus    Terrestrial planet     0.949   -243.02  FALSE
    Earth    Terrestrial planet     1.000      1.00  FALSE
    Mars     Terrestrial planet     0.532      1.03  FALSE
    Jupiter  Gas giant             11.209      0.41   TRUE
    Saturn   Gas giant              9.449      0.43   TRUE
    Uranus   Gas giant              4.007     -0.72   TRUE
    Neptune  Gas giant              3.883      0.67   TRUE
I wanted to select the last 3 rows:
    planets_df[nrow(planets_df)-3:nrow(planets_df),]
However, I got something I didn't expect:
    planets  type                diameter  rotation  rings
    Jupiter  Gas giant             11.209      0.41   TRUE
    Mars     Terrestrial planet

r functions calling lm with subsets

随声附和 posted on 2019-11-29 15:23:36
I was working on some code and I noticed something peculiar. When I run lm on a subset of some panel data, it works fine, something like this:
    library('plm')
    data(Cigar)
    lm(log(price) ~ log(pop) + log(ndi), data=Cigar, subset=Cigar$state==1)
    Call:
    lm(formula = log(price) ~ log(pop) + log(ndi), data = Cigar, subset = Cigar$state == 1)
    Coefficients:
    (Intercept)     log(pop)     log(ndi)
       -26.4919       3.2749       0.4265
But when I try to wrap this in a function I get:
    myfunction <- function(formula, data, subset){
      return(lm(formula, data, subset))
    }
    myfunction(formula = log(price) ~ log(pop) + log(ndi), data

How to subset data.frames stored in a list?

十年热恋 posted on 2019-11-29 15:12:52
Question
I created a list and stored one data frame in each component. Now I would like to filter those data frames, keeping only the rows that have NA in a specific column. I would like the result of this operation to be another list containing data frames with only those rows having NA in that column. Here is some code to clarify what I am saying. Assume d1 and d2 are my data frames:
    set.seed(1)
    d1<-data.frame(a=rnorm(5), b=c(rep(2006, times=4),NA))
    d2<-data.frame(a=1:5, b=c(2007, 2007, NA, NA, 2007

Subsetting R array: dimension lost when its length is 1

大城市里の小女人 posted on 2019-11-29 13:37:13
When subsetting arrays, R behaves differently depending on whether one of the dimensions is of length 1 or not. If a dimension has length 1, that dimension is lost during subsetting:
    ax <- array(1:24, c(2,3,4))
    ay <- array(1:12, c(1,3,4))
    dim(ax)
    #[1] 2 3 4
    dim(ay)
    #[1] 1 3 4
    dim(ax[,1:2,])
    #[1] 2 2 4
    dim(ay[,1:2,])
    #[1] 2 4
From my point of view, ax and ay are the same, and performing the same subset operation on them should return an array with the same dimensions. I can see that the way that R is handling the two cases might be useful, but it's undesirable in the code that I'm writing. It

Replace all values of a recursive list with values of a vector

巧了我就是萌 posted on 2019-11-29 13:09:51
Say I have the following recursive list:
    rec_list <- list(list(rep(1,5), 10), list(rep(100, 4), 20:25))
    rec_list
    [[1]]
    [[1]][[1]]
    [1] 1 1 1 1 1
    [[1]][[2]]
    [1] 10
    [[2]]
    [[2]][[1]]
    [1] 100 100 100 100
    [[2]][[2]]
    [1] 20 21 22 23 24 25
Now, I would like to replace all the values of the list with, say, the vector seq_along(unlist(rec_list)), and keep the structure of the list. I tried using empty-index subsetting like
    rec_list[] <- seq_along(unlist(rec_list))
But this doesn't work. How can I achieve the replacement while keeping the original structure of the list?
You can use relist:
    relist

Referencing a dataframe recursively

旧时模样 posted on 2019-11-29 12:40:41
Question
Is there a way to have a dataframe refer to itself? I find myself spending a lot of time writing things like
    y$Category1[is.na(y$Category1)]<-NULL
which are hard to read and feel like a lot of slow, repetitive typing. I wondered if there was something along the lines of
    y$Category1[is.na(self)] <- NULL
that I could use instead. Thanks
Answer 1:
What a great question. Unfortunately, as @user295691 pointed out in the comments, the issue is with regard to referencing a vector twice: once as the object

Subset by multiple ranges [duplicate]

筅森魡賤 posted on 2019-11-29 11:38:00
This question already has an answer here: Efficient way to filter one data frame by ranges in another (3 answers)
I want to get a list of values that fall in between multiple ranges.
    library(data.table)
    values <- data.table(value = c(1:100))
    range <- data.table(start = c(6, 29, 87), end = c(10, 35, 92))
I need the results to include only the values that fall in between those ranges:
    results <- c(6, 7, 8, 9, 10, 29, 30, 31, 32, 33, 34, 35, 87, 88, 89, 90, 91, 92)
I am currently doing this with a for loop,
    results <- data.table(NULL)
    for (i in 1:NROW(range)) {
      results <- rbind(results, data.table

Test if set is a subset, considering the number (multiplicity) of each element in the set

走远了吗. posted on 2019-11-29 11:19:53
I know I can test if set1 is a subset of set2 with:
    {'a','b','c'} <= {'a','b','c','d','e'} # True
But the following is also True:
    {'a','a','b','c'} <= {'a','b','c','d','e'} # True
How do I have it consider the number of times an element in the set occurs so that:
    {'a','b','c'} <= {'a','b','c','d','e'} # True
    {'a','a','b','c'} <= {'a','b','c','d','e'} # False since 'a' is in set1 twice but set2 only once
    {'a','a','b','c'} <= {'a','a','b','c','d','e'} # True because both sets have two 'a' elements
I know I could do something like:
    A, B, C = ['a','a','b','c'], ['a','b','c','d','e'], ['a','a','b',
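For the multiplicity question above, one common approach is to compare element counts instead of sets. The following is a minimal sketch, not the answer from the original thread. It assumes the inputs are lists, since a set literal such as {'a','a','b','c'} silently drops the duplicate 'a', and the helper name is_multisubset is hypothetical.

```python
from collections import Counter


def is_multisubset(sub, sup):
    """Return True if each element of `sub` occurs in `sup` at least as often.

    Hypothetical helper for illustration; both arguments are iterables
    of hashable items (lists here, because sets discard duplicates).
    """
    sub_counts = Counter(sub)
    sup_counts = Counter(sup)
    # A Counter returns 0 for missing keys, so absent elements fail the test.
    return all(sup_counts[item] >= n for item, n in sub_counts.items())


once = ['a', 'b', 'c', 'd', 'e']
twice = ['a', 'a', 'b', 'c', 'd', 'e']

print(is_multisubset(['a', 'b', 'c'], once))        # True
print(is_multisubset(['a', 'a', 'b', 'c'], once))   # False
print(is_multisubset(['a', 'a', 'b', 'c'], twice))  # True
```

An equivalent one-liner is `not (Counter(sub) - Counter(sup))`, because Counter subtraction keeps only positive counts, so an empty result means nothing in `sub` is left unmatched.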

subsetting in xts using a parameter holding dates

百般思念 posted on 2019-11-29 11:15:33
I am familiar with the xts subsetting abilities. However, I can't find an elegant way to subset a parameterized range of dates. Something like this:
    times = c(as.POSIXct("2012-11-03 09:45:00 IST"), as.POSIXct("2012-11-05 09:45:00 IST"))
    #create an xts object:
    xts.obj = xts(c(1,2),order.by = times)
    #filter with these dates:
    start.date = as.POSIXct("2012-11-03")
    end.date = as.POSIXct("2012-11-04")
    #instead of xts["2012-11-03"/"2012-11-04"], do something like this:
    xts[start.date:end.date]
Does anybody have any idea? Thanks!
You could paste the start.date and end.date objects together, separating