subset

Error when calculating values greater than 95% quantile using plyr

Deadly posted on 2019-12-02 10:49:46
My data is structured as follows: Individ <- data.frame(Participant = c("Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Bill", "Harry", "Harry", "Harry", "Harry","Harry", "Harry", "Harry", "Harry", "Paul", "Paul", "Paul", "Paul"), Time = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4), Condition = c("Placebo", "Placebo", "Placebo", "Placebo", "Expr", "Expr", "Expr", "Expr", "Expr", "Expr", "Expr", "Expr", "Placebo", "Placebo", "Placebo", "Placebo", "Expr", "Expr", "Expr", "Expr", "Expr", "Expr", "Expr", "Expr"), Power = c(400,

subsetting based on number of observations in a factor variable

半城伤御伤魂 posted on 2019-12-02 10:44:53
How do you subset based on the number of observations in each level of a factor variable? I have a dataset with 1,000,000 rows and nearly 3000 levels, and I want to subset out the levels with fewer than, say, 200 observations. data <- read.csv("~/Dropbox/Shared/data.csv", sep=";") summary(as.factor(data$factor)) 10001 10002 10003 10004 10005 10006 10007 10009 10010 10011 10012 10013 10014 10016 10017 10018 10019 10020 414 741 2202 205 159 591 194 678 581 774 778 738 1133 997 381 157 522 6 10021 10022 10023 10024 10025 10026 10027 10028 10029 10030 10031 10032 10033 10034 10035 10036 10037 10038 398 416
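A minimal sketch of one way to approach this, assuming the goal is to drop every level with fewer than a threshold number of rows. The toy data frame, the `value` column, and the names `min_obs`, `counts`, `keep_levels`, and `filtered` are illustrative and not from the question; only the column name `factor` is.

```r
# Hypothetical stand-in for the real data; only the column name `factor`
# comes from the question.
data <- data.frame(
  factor = c(rep(10001, 3), 10002, rep(10003, 2)),
  value  = 1:6
)
min_obs <- 2  # the question uses 200; 2 keeps the toy example small

# Count rows per level, then keep only rows whose level is frequent enough.
counts <- table(data$factor)
keep_levels <- names(counts)[counts >= min_obs]
filtered <- data[as.character(data$factor) %in% keep_levels, ]
```

If the column is stored as a factor, the dropped levels remain as empty levels in `filtered`; calling `droplevels(filtered)` removes them.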

R: subsetting data frame by both certain column names (as a variable) and field values

喜夏-厌秋 posted on 2019-12-02 10:34:20
I have a list of names and a data frame whose column names sometimes match the names in the list. Now I want to subset the data frame based on two criteria: the column names (as a variable) in the list and the values of the fields in those columns. I tried it this way: names.list <- c("name1", "name2" , "name5") names <- as.data.frame(names.list) df <- *dataframe with colnames "name1", "name2", "name3", "name4", etc.* for (i in 1:nrow(names)){ name <- names[i,1] df <- subset(df, name > 1.5) } I know this is the wrong way, but I haven't yet figured out how to do it properly. Does anyone know how
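The loop above compares the value of `name` itself with 1.5 rather than the column it names, because `subset()` does not treat a variable holding a column name as that column. A hedged sketch of one alternative follows, assuming the intent is to keep rows where every listed column that actually exists in `df` exceeds 1.5. The sample values and the names `cols`, `keep`, and `result` are made up for illustration.

```r
# Hypothetical data frame; the column names follow the question.
df <- data.frame(
  name1 = c(2, 1, 3),
  name2 = c(2, 2, 1),
  name3 = c(0, 0, 0)
)
names.list <- c("name1", "name2", "name5")

# Use only the listed names that are real columns ("name5" is not).
cols <- intersect(names.list, colnames(df))

# A row is kept when all of the selected columns are above the cutoff.
keep <- rowSums(df[, cols, drop = FALSE] > 1.5) == length(cols)
result <- df[keep, ]
```

Indexing with a character vector, `df[, cols]`, is what lets the column names live in a variable; `drop = FALSE` keeps the result a data frame even when only one name matches.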

R: from a vector, list all subsets of elements so their sum just passes a value

不问归期 posted on 2019-12-02 10:00:38
Sorry in advance if the answer is (1) trivial or (2) already out there; I haven't been able to solve this issue or find an answer online. Any pointers will be much appreciated! I need a piece of code that can run through a vector and return all possible subsets of elements whose cumulative sum passes a threshold value. Note that I do not want only the subsets that give me exactly the threshold. The cumulative sum can be above the threshold, as long as the algorithm stops adding an extra element once the value has been reached. # A tiny example of the kind of input data. # However,

Bulk update in subset obtained from dataframe filtering [duplicate]

℡╲_俬逩灬. posted on 2019-12-02 09:34:23
This question already has an answer here: Updating a subset of a dataframe (2 answers). I have a dataframe, which I filter based on two conditions as follows: subset(sales_data, month == 'Jan' & dept_name == 'Production') I want to bulk update the value of a particular column (let's say status) of the above subset. Something like subset(sales_data, month == 'Jan' & dept_name == 'Production')["status"] <- "Good result" I am not sure how I can do this. You could do sales_data$status[ sales_data$month == 'Jan' & sales_data$dept_name == 'Production'] <- "Good result" Using replace sales_data$status <- with
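The first suggested approach can be tried on a small example. This is a sketch with made-up rows; the column names `month`, `dept_name`, and `status` follow the question, while `idx` and the sample values are assumptions for illustration.

```r
# Hypothetical sample data; only the column names come from the question.
sales_data <- data.frame(
  month     = c("Jan", "Jan", "Feb", "Jan"),
  dept_name = c("Production", "Sales", "Production", "Production"),
  status    = c("n/a", "n/a", "n/a", "n/a"),
  stringsAsFactors = FALSE
)

# Build the row filter once, then assign into the original data frame.
idx <- sales_data$month == "Jan" & sales_data$dept_name == "Production"
sales_data$status[idx] <- "Good result"
```

Assigning through `subset()` does not work because `subset()` returns a copy; indexing the original data frame with the logical vector modifies it in place.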

Logical condition while subsetting not giving correct values

爷,独闯天下 posted on 2019-12-02 09:32:31
I wanted to subset the data frame project I was working with, using a logical condition. I am getting a paradoxical result. The part of the logical preceding the ROLL.NO. argument is irrelevant to the question. Sorry, I could not give a reproducible example. Do let me know how I can make this question reproducible without having to show all 393 entries of the relevant columns in my data frame. D14 and DC31 are simple integer values, with some values being NA. culprits<-project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)]
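One possible contributor to the unexpected result, offered as a guess since the real data is not shown: in R, `&` binds more tightly than `|`, so the two `!is.na()` checks in the expression above apply only to the second parenthesised clause. The sketch below uses a made-up five-row `project` data frame to compare the expression as written with a version where the whole `|` group is wrapped in parentheses; `as_written` and `grouped` are illustrative names.

```r
# Hypothetical five-row stand-in for the 393-row data frame.
project <- data.frame(
  ROLL.NO. = 1:5,
  DC31     = c(1, 2, NA, 1, 2),
  D14      = c(2, 1, 2, NA, 2)
)

# As written in the question: the NA checks attach only to the second clause,
# so rows with a missing value can still yield NA in the index.
as_written <- project$ROLL.NO.[(project$DC31 == 1 & project$D14 == 2) |
  (project$DC31 == 2 & project$D14 == 1) &
  !is.na(project$DC31) & !is.na(project$D14)]

# With the alternatives grouped first, the NA checks cover both clauses.
grouped <- project$ROLL.NO.[((project$DC31 == 1 & project$D14 == 2) |
  (project$DC31 == 2 & project$D14 == 1)) &
  !is.na(project$DC31) & !is.na(project$D14)]
```

On this toy data `as_written` contains NA entries for the rows with a missing value, while `grouped` returns only the two matching roll numbers. Wrapping the condition in `which()` is another common way to drop NA results from a logical index.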

Generate List with Combination of Subset of List, Java

廉价感情. posted on 2019-12-02 07:45:36
This question is to be implemented in Java. I have a class named Competitor, with Type, Name and Power. public class Competitor { private final int type; private final String name; private final int power; public Competitor(int type, String name, int power) { this.type = type; this.name = name; this.power = power; } public int getType() { return type; } public String getName() { return name; } public int getPower() { return power; } @Override public String toString() { return "Competitor{" + "type=" + type + ", name=" + name + ", power=" + power + '}'; } } Now, I want to do a game, with ONE

subsetting matrix with id from another matrix

[亡魂溺海] posted on 2019-12-02 06:26:04
I would like to subset the data of one matrix using data in a second matrix. The columns of one matrix are labeled. For example, area1 <- c(9836374,635440,23018,833696,936079,1472449,879042,220539,870581,217418,552303,269359,833696,936079,1472449,879042,220539,870581, 833696,936079,1472449,879042,220539,870581) id <- c(1,2,5,30,31,34,1,2,5,1,2,5,1,2,5,30,31,34,51,52,55,81,82,85) mat1 <- matrix(area1, ncol=3, byrow=T) mat2 <- matrix(id, ncol=3, byrow=T) dimnames(mat1) <-list(NULL, c("a1","a2","a3")) mat2 contains the ids for mat1, so the dimensions of the matrices are the same (i.e., mat1[1,1]

subset data.table keeping only elements greater than certain value applied to all columns

爷,独闯天下 posted on 2019-12-02 05:55:31
Question: I would like to subset news (below) to create news2 (further below) which will only include the rows/columns where the abs(value) in each element of news > 0.01. Below is the code that I have tried: gr <- data.frame(which(abs(news[, 1:ncol(news), with = FALSE]) > 0.01, arr.ind = TRUE)) news2a <- news[gr$row, c(1, gr$col + 1L), with = FALSE] news2a[, which(duplicated(names(news2a))) := NULL] The code above does not always work. Note: In the real data set, there are both more rows and columns.

xts subsetting gives incorrect results for months

自古美人都是妖i posted on 2019-12-02 05:51:41
Question: I am using R 3.2.1 for Mac OS X and seem to have run into incorrect behavior in xts subsetting. In brief, subsetting monthly data gives a result that is lagged by 1 month from what it should be. Here is a simple example that is similar to an analysis of paleotemperature I am doing: First I will make some test data for the example: xts.test <- xts(rnorm(440*12, mean=0, sd=10),order.by=timeBasedSeq(155001/1989)) This produces a correct xts file AFAICT. Here is the first year of 12 months. head(xts