subset | 易学教程

Collapse rows with overlapping ranges

Question: I have a data.frame with start and stop times:

    ranges <- data.frame(start = c(65.72000, 65.72187, 65.94312, 73.75625, 89.61625),
                         stop  = c(79.72187, 79.72375, 79.94312, 87.75625, 104.94062))
    > ranges
         start      stop
    1 65.72000  79.72187
    2 65.72187  79.72375
    3 65.94312  79.94312
    4 73.75625  87.75625
    5 89.61625 104.94062

In this example, the ranges in rows 2 and 3 lie entirely within the range between 'start' on row 1 and 'stop' on row 4. Thus, the overlapping ranges 1-4 should be collapsed to one range:

    > ranges
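A minimal base R sketch of one way to collapse the overlapping rows, not necessarily the approach taken in the original answers. The helper name `collapse_ranges` is hypothetical. It assumes that ranges which merely touch (a start equal to an earlier stop) should stay merged.

```r
ranges <- data.frame(
  start = c(65.72000, 65.72187, 65.94312, 73.75625, 89.61625),
  stop  = c(79.72187, 79.72375, 79.94312, 87.75625, 104.94062)
)

# Hypothetical helper: merge rows whose [start, stop] intervals overlap.
collapse_ranges <- function(df) {
  df <- df[order(df$start), ]
  # Largest stop seen in all earlier rows (-Inf for the first row).
  prev_max_stop <- c(-Inf, head(cummax(df$stop), -1))
  # A new group begins whenever a start lies beyond every earlier stop.
  group <- cumsum(df$start > prev_max_stop)
  data.frame(
    start = as.numeric(tapply(df$start, group, min)),
    stop  = as.numeric(tapply(df$stop, group, max))
  )
}

collapsed <- collapse_ranges(ranges)
print(collapsed)
```

With the sample data, rows 1-4 fall into one group and row 5 forms its own, because its start (89.61625) is larger than every earlier stop.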

How to define the subset operators for a S4 class?

Question: I am having trouble figuring out the proper way to define the [, $, and [[ subset operators for an S4 class. Can anyone provide me with a basic example of defining these three for an S4 class?

Answer 1: Discover the generic so that we know what we are aiming for:

    > getGeneric("[")
    standardGeneric for "[" defined from package "base"

    function (x, i, j, ..., drop = TRUE)
    standardGeneric("[", .Primitive("["))
    <bytecode: 0x32e25c8>
    <environment: 0x32d7a50>
    Methods may be defined for arguments: x, i, j
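As a rough illustration of where this leads, the sketch below defines all three operators for a small S4 class. The class name `Bag` and its `items` slot are hypothetical and exist only for this example. The method formals mirror the generics' argument lists.

```r
library(methods)

# Hypothetical class for illustration: a thin wrapper around a named list.
setClass("Bag", representation(items = "list"))

# `[` returns another Bag holding the selected items.
setMethod("[", "Bag", function(x, i, j, ..., drop = TRUE) {
  new("Bag", items = x@items[i])
})

# `[[` returns a single underlying item.
setMethod("[[", "Bag", function(x, i, j, ...) {
  x@items[[i]]
})

# `$` receives the name as a character string.
setMethod("$", "Bag", function(x, name) {
  x@items[[name]]
})

b <- new("Bag", items = list(a = 1, b = "two", c = 3))
first_two <- b[1:2]   # a Bag with items a and b
item_b <- b[["b"]]    # the character value stored under "b"
item_a <- b$a         # the numeric value stored under "a"
```

Dispatching only on `x` keeps the example short. A fuller design might also dispatch on the class of `i` (for example integer versus character) to validate indices.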

How to replace NA with mean by subset in R (impute with plyr?)

I have a dataframe with the lengths and widths of various arthropods from the guts of salamanders. Because some guts had thousands of certain prey items, I only measured a subset of each prey type. I now want to replace each unmeasured individual with the mean length and width for that prey. I want to keep the dataframe and just add imputed columns (length2, width2). The main reason is that each row also has columns with data on the date and location the salamander was collected. I could fill in the NA with a random selection of the measured individuals but for the sake of argument let's
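One possible base R approach to the group-wise mean imputation described here uses `ave()`. It is a sketch on made-up data: the column names `prey_type`, `length`, and `width` and the helper `impute_mean` are assumptions, and only `length2` and `width2` come from the question.

```r
# Hypothetical data; the real data frame has more columns (date, location, ...).
prey <- data.frame(
  prey_type = c("ant", "ant", "ant", "mite", "mite"),
  length    = c(2.0, NA, 4.0, 0.5, NA),
  width     = c(1.0, NA, 2.0, NA, 0.3)
)

# Replace NA values in a vector with the mean of its observed values.
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# ave() applies the function within each prey type and keeps row order,
# so the original columns stay untouched and the imputed ones are added.
prey$length2 <- ave(prey$length, prey$prey_type, FUN = impute_mean)
prey$width2  <- ave(prey$width,  prey$prey_type, FUN = impute_mean)
```

If a prey type has no measured individuals at all, `mean(..., na.rm = TRUE)` yields `NaN` for that group, which would need separate handling.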

Delete duplicate rows in two columns simultaneously [duplicate]

Question: This question already has answers here: duplicates in multiple columns (2 answers). Closed 3 years ago.

I would like to delete duplicate rows based on two columns instead of just one. My input df:

    RAW.PVAL  GR  allrl  Bak
    0.05      fr  EN1    B12
    0.05      fg  EN1    B11
    0.45      fr  EN2    B10
    0.35      fg  EN2    B066

My output:

    RAW.PVAL  GR  allrl  Bak
    0.05      fr  EN1    B12
    0.45      fg  EN2    B10
    0.35      fg  EN2    B066

I tried df <- subset(df, !duplicated(allrl, RAW.PVAL)), but it does not delete rows based on these two columns simultaneously
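A short sketch of one way to get this result: pass both columns to `duplicated()` as a data frame. In the attempt above, the second positional argument of `duplicated()` is `incomparables`, so `RAW.PVAL` is not treated as a second key. The variable name `deduped` is an assumption for illustration.

```r
df <- data.frame(
  RAW.PVAL = c(0.05, 0.05, 0.45, 0.35),
  GR       = c("fr", "fg", "fr", "fg"),
  allrl    = c("EN1", "EN1", "EN2", "EN2"),
  Bak      = c("B12", "B11", "B10", "B066")
)

# duplicated() on a two-column data frame flags rows whose
# (allrl, RAW.PVAL) combination has already appeared.
deduped <- df[!duplicated(df[, c("allrl", "RAW.PVAL")]), ]
print(deduped)
```

This keeps the first row of each combination, so the row with `Bak` equal to `B11` is dropped.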

Return df with a columns values that occur more than once [duplicate]

Question: This question already has an answer here: Subset data frame based on number of rows per group (3 answers).

I have a data frame df, and I am trying to subset all rows whose value in column B occurs more than once in the dataset. I tried using table to do it, but am having trouble subsetting from the table:

    t <- table(df$B)

Then I try subsetting it using:

    subset(df, table(df$B) > 1)

And I get the error "Error in x[subset & !is.na(subset)] : object of type 'closure' is not subsettable". How can I
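Two base R sketches for keeping rows whose `B` value occurs more than once, shown on made-up data (column `A` and the result names are assumptions). The first builds on the `table()` idea from the question, and the second uses `duplicated()` in both directions.

```r
df <- data.frame(A = 1:6, B = c("x", "y", "x", "z", "y", "w"))

# Option 1: count each value, then keep rows whose value has a count above 1.
counts <- table(df$B)
repeated <- df[df$B %in% names(counts[counts > 1]), ]

# Option 2: a value is repeated if it is a duplicate when scanning
# forwards or when scanning backwards.
repeated2 <- df[duplicated(df$B) | duplicated(df$B, fromLast = TRUE), ]
```

The original attempt fails partly because `table(df$B) > 1` has one element per distinct value, not one per row, so it cannot serve as a row filter directly.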

Keeping only certain rows of a data frame based on a set of values

Question: I have a data frame with an ID column and a few columns for values. I would like to keep only certain rows of the data frame, based on whether the value of ID in that row matches another set of values (for instance, called "keep"). For simplicity, here is an example:

    df <- data.frame(ID = sample(rep(letters, each = 3)), value = rnorm(n = 26 * 3))
    keep <- c("a", "d", "r", "x")

How can I create a new data frame consisting only of rows whose IDs match those in keep? I can do this for
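A minimal sketch using `%in%`, which is one common way to filter on set membership. The `set.seed()` call and the name `kept` are additions for reproducibility and illustration.

```r
set.seed(1)  # added so the example is reproducible
df <- data.frame(ID = sample(rep(letters, each = 3)), value = rnorm(n = 26 * 3))
keep <- c("a", "d", "r", "x")

# %in% returns TRUE for every row whose ID appears anywhere in keep.
kept <- df[df$ID %in% keep, ]
```

Because each letter appears three times in `df`, `kept` contains three rows for each of the four IDs.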

from data table, randomly select one row per group

Question: I'm looking for an efficient way to select rows from a data table such that I have one representative row for each unique value in a particular column. Let me propose a simple example:

    require(data.table)
    y = c('a','b','c','d','e','f','g','h')
    x = sample(2:10, 8, replace = TRUE)
    z = rep(y, x)
    dt = as.data.table(z)

My objective is to subset the data table dt by sampling one row for each letter a-h in column z.

Answer 1: OP provided only a single column in the example. Assuming that there are multiple

How to pass “nothing” as an argument to `[` for subsetting?

Question: I was hoping to be able to construct a do.call formula for subsetting without having to identify the actual range of every dimension in the input array. The problem I'm running into is that I can't figure out how to mimic the direct call x[,,1:n,], where no entry in the other dimensions means "grab all elements." Here's some sample code, which fails. As far as I can tell, either [ or do.call replaces my NULL list values with 1 for the index.

    x<-array(1:6,c(2,3)) dimlist<-vector('list',
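One workaround, sketched here on a made-up 3-dimensional array, is to use `TRUE` as the index for every dimension that should be taken whole. A logical index is recycled along its dimension, so `TRUE` selects all elements. The names `idx` and `sub` are assumptions for illustration.

```r
x <- array(1:24, c(2, 3, 4))

# Start with TRUE ("take everything") for each dimension...
idx <- rep(list(TRUE), length(dim(x)))
# ...then restrict only the dimension of interest.
idx[[3]] <- 1:2

# Equivalent to x[TRUE, TRUE, 1:2], which selects the same elements as x[, , 1:2].
sub <- do.call(`[`, c(list(x), idx))
```

This avoids having to look up the extent of each untouched dimension, and the same list can be built for arrays of any rank.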

Subset data.table by logical column

Question: I have a data.table with a logical column. Why can't the name of the logical column be used directly for the i argument? See the example.

    dt <- data.table(x = c(T, T, F, T), y = 1:4)
    # Works
    dt[dt$x]
    dt[!dt$x]
    # Works
    dt[x == T]
    dt[x == F]
    # Does not work
    dt[x]
    dt[!x]

Answer 1: From ?data.table: Advanced: When i is a single variable name, it is not considered an expression of column names and is instead evaluated in calling scope. So dt[x] will try to evaluate x in the calling scope (in this case

Set global object in Shiny

Question: Let's say I have the following server.R file in shiny:

    shinyServer(function(input, output) {
      output$plot <- renderPlot({
        data2 <- data[data$x == input$z, ]  # subsetting large dataframe
        plot(data2$x, data2$y)
      })
      output$table <- renderTable({
        data2 <- data[data$x == input$z, ]  # same subset. Oh, boy...
        summary(data2$x)
      })
    })

What can I do in order to not have to run data2 <- data[data$x == input$z, ] within every render call? If I do the following, I get an "object of type 'closure' is not