subset | 易学教程

How to subset my data with eliminating repeated observations

Question: How can I erase repeated observations of IGM? I want to reduce the following data to one IGM per county. I tried data$GM[data$county], but it didn't work, because the brackets expect a row number, not a county number. How can I match one GM to each county? To be clear, I want to turn this data

       county cd110 repvote   state   GM  gini
    2    1001   102       1 Alabama 38.4 0.381
    3    1001   102       1 Alabama 38.4 0.381
    4    1003   101       0 Alabama 39.6 0.491
    5    1003   101       0 Alabama 39.6 0.491
    9    1003   101       0 Alabama 39.6 0.491
    13   1003   101       1
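One common way to keep a single row per county is `duplicated()` from base R. The sketch below is not taken from the excerpt: the data frame is a hypothetical copy of the complete rows shown above, and it assumes that keeping the first row seen for each county is acceptable.

```r
# Hypothetical data mirroring the complete rows shown in the question
data <- data.frame(
  county  = c(1001, 1001, 1003, 1003, 1003),
  cd110   = c(102, 102, 101, 101, 101),
  repvote = c(1, 1, 0, 0, 0),
  state   = "Alabama",
  GM      = c(38.4, 38.4, 39.6, 39.6, 39.6),
  gini    = c(0.381, 0.381, 0.491, 0.491, 0.491)
)

# duplicated() flags every repeat of a county after its first occurrence,
# so negating it keeps exactly one row per county
one_per_county <- data[!duplicated(data$county), ]
print(one_per_county)
```

If rows within a county can differ, as the truncated last row suggests, which row to keep becomes a separate decision that this sketch does not make.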

Subset a dataframes in a list based on the content of a vector

Question: I have a list of five dataframes. Each dataframe contains one dimension column and 4 value columns. I would like to subset each dataframe in the list based on the contents of a vector.

    df <- data.frame(x = 1:100, y2 = runif(100, 0, 100), y3 = runif(100, 0, 100), y4 = runif(100, 0, 100), y5 = runif(100,0,100))
    df2 <- data.frame(x = 1:100, y2 = runif(100, 0, 100), y3 = runif(100, 0, 100), y4 = runif(100, 0, 100), y5 = runif(100,0,100))
    df3 <- data.frame(x = 1:100, y2 = runif(100, 0, 100), y3 =
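The excerpt stops before it says what the vector contains, so the following is only a sketch under an assumption: the vector (called `keep` here, a hypothetical name) holds values of the dimension column `x` that should be retained. The helper `make_df` is also hypothetical and simply builds frames with the same shape as in the question.

```r
set.seed(1)  # make the random value columns reproducible

# Hypothetical helper: one dimension column and four value columns
make_df <- function() {
  data.frame(x = 1:100,
             y2 = runif(100, 0, 100), y3 = runif(100, 0, 100),
             y4 = runif(100, 0, 100), y5 = runif(100, 0, 100))
}
dfs <- list(make_df(), make_df(), make_df())

# Assumed meaning of the vector: x values to keep in every data frame
keep <- c(5, 10, 15)

# lapply() applies the same row filter to each data frame in the list
subsets <- lapply(dfs, function(d) d[d$x %in% keep, ])
print(subsets[[1]])
```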

Why subset won't work in R?

Question: I have a data frame 'vissim' which has the following structure:

    > str(vissim)
    'data.frame': 480 obs. of 12 variables:
     $ Measur.                : int 1 2 3 4 5 6 7 8 9 10 ...
     $ from                   : int 100 100 100 100 100 100 100 100 100 100 ...
     $ to                     : int 130 130 130 130 130 130 130 130 130 130 ...
     $ Occup..Rate.Trucks     : num 2.9 NA NA NA NA 1 NA NA NA NA ...
     $ Speed.Mean.Trucks      : num 51.4 NA NA NA NA 50.7 NA NA NA NA ...
     $ Number.Veh.Trucks      : int 2 NA NA NA NA 1 NA NA NA NA ...
     $ Occup..Rate.Motorcycles: num NA 0.7 NA NA NA

How to grep a group based on string in another column that doesn't occur in each observation using R?

Question: I have to simplify a previous question that failed. I want to extract whole groups, identified by 'id', that contain a string ('inter' or 'high') in another column called 'strmatch'. The string doesn't occur in every observation of the group, but if it occurs I want to assign the group to a respective data frame. The data frame:

    df <- data.frame(id = c("a", "a", "b", "b","c", "c","d","d"), std = c("y", "y","n","n","y","y","n","n"), strmatch = c("alpha","TMB-inter","beta","TMB-high","gamma",
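A possible base R approach is to find the ids with at least one matching row via `grepl()` and then keep every row of those ids. The data below is a smaller hypothetical stand-in, not the truncated data frame from the question, and the names `ids_inter`, `df_inter`, `ids_high` and `df_high` are introduced only for illustration.

```r
# Hypothetical data: the match appears in only one row of each group
df <- data.frame(
  id       = c("a", "a", "b", "b", "c", "c"),
  strmatch = c("alpha", "TMB-inter", "beta", "TMB-high", "gamma", "delta")
)

# Step 1: ids that have at least one row containing the string
ids_inter <- unique(df$id[grepl("inter", df$strmatch)])
ids_high  <- unique(df$id[grepl("high", df$strmatch)])

# Step 2: keep all rows of those ids, including the non-matching ones
df_inter <- df[df$id %in% ids_inter, ]
df_high  <- df[df$id %in% ids_high, ]

print(df_inter)
print(df_high)
```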

Getting Median of a Column where value of another Column is 1 in R

Question: OK, so I have a CSV file with a structure similar to this:

    hashID,value,flag
    98fafd, 35, 1
    fh56w2, 25, 0
    ggjeas, 55, 1
    adfh5d, 45, 0

Basically, what I want to do is get the median of the value column, but only include rows where flag==1 in the calculation. Is this even possible in R? I've searched around and haven't found anything like this.

Answer 1: Here is one possibility. Read your data set using the following command:

    newdata <- read.csv("stackoverflow questions/mediancol.csv") # I assume you have the
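The answer is cut off after the read step, so the rest of it is unknown. As an independent sketch, the filter and median can be done with plain logical indexing; the CSV is read from an inline string here instead of the answer's file path, and `flag_median` is a name chosen for illustration.

```r
# Inline copy of the sample rows from the question
csv_text <- "hashID,value,flag
98fafd, 35, 1
fh56w2, 25, 0
ggjeas, 55, 1
adfh5d, 45, 0"

# strip.white = TRUE trims the spaces that follow each comma
newdata <- read.csv(text = csv_text, strip.white = TRUE)

# Keep only the values whose flag is 1, then take their median
flag_median <- median(newdata$value[newdata$flag == 1])
print(flag_median)
```

With the sample rows, the flagged values are 35 and 55, so the median is 45.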

R subsetting by partially matching row name

Question: I have a tab-delimited file:

    row.names c1 c2 c3
    AF3        0  2  4
    BN4        9  1  2
    AF2        8  7  1
    BN8        4  6  8

I want to select only the rows with row names beginning with BN, so the output would be:

    row.names c1 c2 c3
    BN4        9  1  2
    BN8        4  6  8

I know how I would solve the problem if I knew the exact row names in a vector...

    df[row.names(df) %in% c('BN4','BN8'), ]

But how would I solve the problem by finding and subsetting on the rows that start with 'BN'?

Answer 1: You can use grep to find those rows whose names start
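The answer text is truncated, so the exact call it proposed is not known. A minimal sketch of the same idea uses `grepl()` with the anchored pattern `^BN`; the data frame below is rebuilt by hand from the table in the question, and `bn_rows` is a name introduced for illustration.

```r
# Data frame rebuilt from the table in the question
df <- data.frame(
  c1 = c(0, 9, 8, 4),
  c2 = c(2, 1, 7, 6),
  c3 = c(4, 2, 1, 8),
  row.names = c("AF3", "BN4", "AF2", "BN8")
)

# "^BN" matches only names that begin with BN; grepl() returns a logical
# vector, one element per row, which can index the rows directly
bn_rows <- df[grepl("^BN", row.names(df)), ]
print(bn_rows)
```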

R - filter coordinates

Question: I am new to R and I have a simple problem (in my opinion), but I haven't found a solution so far. I have a (long) set of 2D (x,y) coordinates, just points in 2D space, like this:

    ID  x        y
    1   1758.56  1179.26
    2   775.67   1197.14
    3   296.99   1211.13
    4   774.72   1223.66
    5   805.41   1235.51
    6   440.67   1247.59
    7   1302.02  1247.93
    8   1450.4   1259.13
    9   664.99   1265.9
    10  2781.05  1291.12
    etc.

How do I filter points (rows in the table) that are in a certain area (of any shape!)? How to filter dots that are within a subset
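For simple shapes, the filter reduces to a logical condition on the x and y columns. The sketch below covers only a rectangle and a circle in base R; arbitrary polygons need a point-in-polygon test, which is not shown. The bounds, centre and radius are hypothetical values picked to fit the first six sample points.

```r
# First six points from the question
pts <- data.frame(
  ID = 1:6,
  x  = c(1758.56, 775.67, 296.99, 774.72, 805.41, 440.67),
  y  = c(1179.26, 1197.14, 1211.13, 1223.66, 1235.51, 1247.59)
)

# Rectangle: keep points with 700 <= x <= 900 and 1190 <= y <= 1240
in_rect <- pts[pts$x >= 700 & pts$x <= 900 &
               pts$y >= 1190 & pts$y <= 1240, ]

# Circle: keep points within 30 units of the centre (800, 1220),
# comparing squared distances to avoid taking a square root
cx <- 800; cy <- 1220; r <- 30
in_circle <- pts[(pts$x - cx)^2 + (pts$y - cy)^2 <= r^2, ]

print(in_rect)
print(in_circle)
```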

How to “extract” Z from subset type {z : Z | z > 0}

Question: If a function takes Z as an argument, it should also be possible to pass it any subset of Z, right? For example, Zmod takes two Z values and returns a Z. Can I improve on this method with subset types without reimplementing it? I want this:

    Definition Z_gt0 := {z | z > 0}.
    Definition mymod (n1 n2 : Z_gt0) := Zmod n1 n2.

But Coq complains that n1 is expected to have type Z, of course. How can I make it work with Z_gt0? Coerce? This question is related to my other one here: Random nat stream and subset

How to find all possible pairs from three subsets of a set with constraints in Erlang?

Question: I have a set M which consists of three subsets A, B and C. Problem: I would like to calculate all possible subsets S(1)...S(N) of M which contain all possible pairs between elements of A, B and C in such a manner that: elements of A and B can occur in a pair only once for each of the two positions in a pair (that is, {a1,a2} and {b1,a1} can be in one subset S, but no more elements {a1,_} and {_,a1} are allowed in this subset S); elements of C can occur 1-N times in a subset S (that is {a,c}, {b,c},

Subset dataframe based on column value using a function in R

Question: I am trying to subset a dataframe with the following function.

    SubsetDF <- function(DF, VAR, YEAR){
      DF2 <- DF[DF$VAR <= YEAR, ]
      return(DF2)
    }
    test <- SubsetDF(myData, "YEAR", 2000)

The resulting "test" is empty. What am I missing here? By the way, if I just do the following, the resulting dataframe is fine.

    myData[myData$YEAR <= 2010,]

Answer 1: Try

    SubsetDF <- function(DF, VAR, YEAR){
      DF2 <- DF[DF[VAR] <= YEAR, ]
      return(DF2)
    }
    test <- SubsetDF(myData, "YEAR", 2000)

Just replace the DF$VAR part with
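The reason the original function returns nothing is that `$` does not evaluate its right-hand side: `DF$VAR` looks for a column literally named "VAR", gets NULL, and the comparison yields an empty logical vector. As a variation on the answer above (not the answer's own code), `[[` looks up the column by the string stored in `VAR` and returns a plain vector. The function name `subset_by_max` and the sample `myData` are hypothetical.

```r
# Hypothetical sample data standing in for myData
myData <- data.frame(YEAR  = c(1995, 2000, 2005, 2010),
                     value = c(10, 20, 30, 40))

subset_by_max <- function(DF, VAR, YEAR) {
  # DF[[VAR]] fetches the column whose name is the string held in VAR
  DF[DF[[VAR]] <= YEAR, , drop = FALSE]
}

test <- subset_by_max(myData, "YEAR", 2000)
print(test)
```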