subset | 易学教程

Select groups with more than one distinct value

Question: I have data with a grouping variable ("from") and values ("number"):

    from number
       1      1
       1      1
       2      1
       2      2
       3      2
       3      2

I want to subset the data and select groups which have two or more unique values. In my data, only group 2 has more than one distinct 'number', so this is the desired result:

    from number
       2      1
       2      2

Answer 1: Several possibilities; here's my favorite:

    library(data.table)
    setDT(df)[, if(+var(number)) .SD, by = from]
    #    from number
    # 1:    2      1
    # 2:    2      2

Basically, for each group we are checking if
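For readers outside R, the same idea (keep only groups whose values are not all identical) can be sketched in plain Python. This is a minimal illustration, not the answer's data.table method; the function name `groups_with_multiple_values` and the tuple layout are assumptions made for the example.

```python
from collections import defaultdict

def groups_with_multiple_values(rows, min_distinct=2):
    """Keep the (group, value) rows whose group has at least `min_distinct` distinct values."""
    distinct = defaultdict(set)
    for group, value in rows:
        distinct[group].add(value)
    # Preserve the original row order while filtering.
    return [(g, v) for g, v in rows if len(distinct[g]) >= min_distinct]

# The ("from", "number") pairs from the question.
rows = [(1, 1), (1, 1), (2, 1), (2, 2), (3, 2), (3, 2)]
result = groups_with_multiple_values(rows)
print(result)  # [(2, 1), (2, 2)]
```

Counting distinct values per group first and filtering second keeps the logic to two linear passes over the data.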

Finding the subsets of an array in PHP

Question: I have a relational schema with attributes (A B C D), along with a set of functional dependencies. I need to determine the closure for every possible subset of R's attributes, and that's where I am stuck: I need to learn how to find the (non-repeating) subsets in PHP. My array is stored like this:

    $ATTRIBUTES = ('A', 'B', 'C', 'D')

so my subsets should be:

    $SUBSET = ('A', 'B', 'C', 'D', 'AB', 'AC', 'AD', 'BC', 'BD', 'CD', 'ABC', 'ABD', 'BCD', 'ABCD')
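The enumeration the question asks for is every non-empty subset of the attribute list. A minimal sketch in Python (not PHP, and not taken from any answer on the page) uses `itertools.combinations` for each subset size; the function name `non_empty_subsets` is hypothetical.

```python
from itertools import combinations

def non_empty_subsets(attributes):
    """Return every non-empty subset as a joined string, ordered by size."""
    subsets = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            subsets.append("".join(combo))
    return subsets

subsets = non_empty_subsets(["A", "B", "C", "D"])
print(len(subsets))  # 15
```

A four-element set has 2^4 - 1 = 15 non-empty subsets, so a complete enumeration also contains 'ACD', which the list in the question leaves out.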

Find all possible subset combos in an array?

Question: I need to get all possible subsets of an array with a minimum of 2 items and an unknown maximum. Can anyone help me out a bit? Say I have this:

    [1,2,3]

How do I get this?

    [ [1,2] , [1,3] , [2,3] , [1,2,3] ]

Answer 1: After stealing this JavaScript combination generator, I added a parameter to supply the minimum length, resulting in:

    var combine = function(a, min) {
        var fn = function(n, src, got, all) {
            if (n == 0) {
                if (got.length > 0) {
                    all[all.length] = got;
                }
                return;
            }
            for (var j = 0;
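Because the JavaScript answer is cut off here, the following is a separate sketch of the same requirement in Python rather than a completion of that code. It assumes the input has no repeated items and that order inside a subset follows the input order; `subsets_with_min_size` is a hypothetical name.

```python
from itertools import combinations

def subsets_with_min_size(items, min_size=2):
    """Return all subsets of `items` that contain at least `min_size` elements."""
    return [
        list(combo)
        for size in range(min_size, len(items) + 1)
        for combo in combinations(items, size)
    ]

print(subsets_with_min_size([1, 2, 3]))  # [[1, 2], [1, 3], [2, 3], [1, 2, 3]]
```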

Extract a subset of a dataframe based on a condition involving a field

Question: I have a large CSV with the results of a medical survey from different locations (the location is a factor present in the data). Because some analyses are specific to a location, and for convenience, I'd like to extract subframes containing only the rows from those locations. The location happens to be the very first field, so yes, I could do it by sorting the CSV rows, but I'd like to learn how to do it in R, as I'm sure I'll need this for other columns. So, in a nutshell, the question is: given a data frame foo, how can I create another data frame bar which only contains the rows from foo where foo
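The operation described is a row filter on one field. As a language-neutral illustration (the question itself asks about R), here is a small Python sketch over a list of dictionaries. The column names `location` and `score`, the sample values, and the helper `rows_where` are all invented for the example.

```python
def rows_where(rows, field, value):
    """Return only the rows whose `field` equals `value`."""
    return [row for row in rows if row[field] == value]

# Hypothetical survey rows; the column names and values are illustrative only.
foo = [
    {"location": "north", "score": 3},
    {"location": "south", "score": 5},
    {"location": "north", "score": 4},
]

bar = rows_where(foo, "location", "north")
print(bar)
```

The field name is a parameter, so the same helper works for columns other than the first one.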

Subset dataframe by multiple logical conditions of rows to remove

Question: I would like to subset (filter) a data frame by specifying which rows not (!) to keep in the new data frame. Here is a simplified sample data frame:

    data
    v1 v2 v3 v4
    a  v  d  c
    a  v  d  d
    b  n  p  g
    b  d  d  h
    c  k  d  c
    c  r  p  g
    d  v  d  x
    d  v  d  c
    e  v  d  b
    e  v  d  c

For example, if a row of column v1 has a "b", "d", or "e", I want to get rid of that row of observations, producing the following data frame:

    v1 v2 v3 v4
    a  v  d  c
    a  v  d  d
    c  k  d  c
    c  r  p  g

I have been successful at subsetting based on one condition
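Excluding rows on several values of one column amounts to a "not in" test. The sketch below reproduces the question's sample data in plain Python and drops the rows whose v1 is "b", "d", or "e". It is an illustration of the idea, not an R answer, and `drop_rows` is a hypothetical helper.

```python
def drop_rows(rows, column, unwanted):
    """Return the rows whose `column` value is not one of `unwanted`."""
    unwanted = set(unwanted)
    return [row for row in rows if row[column] not in unwanted]

columns = ("v1", "v2", "v3", "v4")
# Each string is one row of the sample data, one character per column.
raw = ["avdc", "avdd", "bnpg", "bddh", "ckdc", "crpg", "dvdx", "dvdc", "evdb", "evdc"]
data = [dict(zip(columns, row)) for row in raw]

kept = drop_rows(data, "v1", {"b", "d", "e"})
print(["".join(row.values()) for row in kept])  # ['avdc', 'avdd', 'ckdc', 'crpg']
```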

How to drop columns by name in a data frame

Question: I have a large data set and I would like to read specific columns or drop all the others.

    data <- read.dta("file.dta")

I select the columns that I'm not interested in:

    var.out <- names(data)[!names(data) %in% c("iden", "name", "x_serv", "m_serv")]

and then I'd like to do something like:

    for(i in 1:length(var.out)) {
        paste("data$", var.out[i], sep="") <- NULL
    }

to drop all the unwanted columns. Is this the optimal solution?

Answer 1: You should use either indexing or the subset
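The underlying point is that selecting the wanted columns by name is simpler than deleting the unwanted ones in a loop. A minimal Python sketch of that idea follows; it is not the R answer, which is cut off above. The sample row, the `extra` column, and the helper `keep_columns` are assumptions for illustration.

```python
def keep_columns(rows, wanted):
    """Return copies of `rows` restricted to the column names in `wanted`."""
    return [{name: row[name] for name in wanted if name in row} for row in rows]

# Hypothetical single-row data set; "extra" stands in for the unwanted columns.
data = [{"iden": 1, "name": "x", "x_serv": 0.5, "m_serv": 0.7, "extra": "drop me"}]
wanted = ["iden", "name", "x_serv", "m_serv"]

trimmed = keep_columns(data, wanted)
print(trimmed)
```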

Check whether an array is a subset of another

Question: Any idea how to check whether one list is a subset of another? Specifically, I have:

    List<double> t1 = new List<double> { 1, 3, 5 };
    List<double> t2 = new List<double> { 1, 5 };

How can I check that t2 is a subset of t1, using LINQ?

Answer 1:

    bool isSubset = !t2.Except(t1).Any();

Answer 2: Use HashSet instead of List if working with sets. Then you can simply use IsSubsetOf():

    HashSet<double> t1 = new HashSet<double>{1,3,5};
    HashSet<double> t2 = new HashSet<double>{1,5};
    bool isSubset = t2.IsSubsetOf(t1);
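Both answers map onto set operations: the first tests that the set difference is empty, the second uses a direct subset test. The Python sketch below mirrors the two approaches for comparison; it is an analogue, not C#, and the function names are hypothetical.

```python
t1 = [1.0, 3.0, 5.0]
t2 = [1.0, 5.0]

def is_subset_by_difference(candidate, container):
    """Analogue of !t2.Except(t1).Any(): nothing in candidate is missing from container."""
    return not (set(candidate) - set(container))

def is_subset(candidate, container):
    """Analogue of HashSet.IsSubsetOf, using Python's set comparison."""
    return set(candidate) <= set(container)

print(is_subset_by_difference(t2, t1), is_subset(t2, t1))  # True True
```

Both versions ignore duplicates, which matches set semantics rather than multiset containment.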

Subset data frame based on number of rows per group

Question: I have data like this, where some "name" occurs more than three times:

    df <- data.frame(name = c("a", "a", "a", "b", "b", "c", "c", "c", "c"), x = 1:9)

      name x
    1    a 1
    2    a 2
    3    a 3
    4    b 4
    5    b 5
    6    c 6
    7    c 7
    8    c 8
    9    c 9

I wish to subset (filter) the data based on the number of rows (observations) within each level of the name variable. If a certain level of name occurs more than, say, 3 times, I want to remove all rows belonging to that level. So in this example, we would drop
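The filter described here needs the size of each group before any row can be kept or dropped. A minimal Python sketch of that two-step approach, using the question's data, is shown below. It illustrates the logic only and is not an R solution; `drop_large_groups` is a hypothetical name.

```python
from collections import Counter

def drop_large_groups(rows, key, max_rows=3):
    """Remove every row belonging to a group that has more than `max_rows` rows."""
    counts = Counter(row[key] for row in rows)
    return [row for row in rows if counts[row[key]] <= max_rows]

# Same data as the question: name = a, a, a, b, b, c, c, c, c and x = 1..9.
names = ["a", "a", "a", "b", "b", "c", "c", "c", "c"]
df = [{"name": n, "x": i} for i, n in enumerate(names, start=1)]

kept = drop_large_groups(df, "name", max_rows=3)
print([row["x"] for row in kept])  # [1, 2, 3, 4, 5]
```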

How to subset matrix to one column, maintain matrix data type, maintain row/column names?

Question: When I subset a matrix to a single column, the result is of class numeric, not matrix (e.g. myMatrix[ , 5 ] to subset to the fifth column). Is there a compact way to subset to a single column, maintain the matrix format, and maintain the row/column names without doing something complicated like:

    matrix( myMatrix[ , 5 ] , dimnames = list( rownames( myMatrix ) , colnames( myMatrix )[ 5 ] ) )

Answer 1: Use the drop=FALSE argument to [.

    m <- matrix(1:10,5,2)
    rownames(m) <- 1:5
    colnames(m) <- 1:2
    m[,1]