subset

Collapse rows with overlapping ranges

被刻印的时光 ゝ submitted on 2019-11-26 17:47:42
Question: I have a data.frame with start and stop times:
    ranges <- data.frame(start = c(65.72000, 65.72187, 65.94312, 73.75625, 89.61625),
                         stop = c(79.72187, 79.72375, 79.94312, 87.75625, 104.94062))
    > ranges
         start      stop
    1 65.72000  79.72187
    2 65.72187  79.72375
    3 65.94312  79.94312
    4 73.75625  87.75625
    5 89.61625 104.94062
In this example, the ranges in rows 2 and 3 lie entirely within the range between 'start' on row 1 and 'stop' on row 4. Thus, the overlapping ranges 1-4 should be collapsed into one range:
    > ranges
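One possible way to collapse the overlapping ranges from the question, as a minimal base R sketch. The helper name `collapse_ranges` is hypothetical, and the sketch assumes that ranges which merely touch (a start equal to an earlier stop) should also be merged.

```r
ranges <- data.frame(
  start = c(65.72000, 65.72187, 65.94312, 73.75625, 89.61625),
  stop  = c(79.72187, 79.72375, 79.94312, 87.75625, 104.94062)
)

# Hypothetical helper: merge rows whose ranges overlap.
collapse_ranges <- function(df) {
  df <- df[order(df$start), ]
  # Largest stop seen before each row (nothing precedes the first row).
  prev_max_stop <- c(-Inf, head(cummax(df$stop), -1))
  # A new group begins whenever a start lies beyond every earlier stop.
  group <- cumsum(df$start > prev_max_stop)
  data.frame(
    start = as.numeric(tapply(df$start, group, min)),
    stop  = as.numeric(tapply(df$stop, group, max))
  )
}

collapsed <- collapse_ranges(ranges)
print(collapsed)
```

With the sample data, rows 1-4 fall into one group and row 5 forms its own, so `collapsed` has two rows.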

How to define the subset operators for a S4 class?

旧街凉风 submitted on 2019-11-26 17:37:45
Question: I am having trouble figuring out the proper way to define the [, $, and [[ subset operators for an S4 class. Can anyone provide a basic example of defining these three for an S4 class?
Answer 1: Discover the generic so that we know what we are aiming for:
    > getGeneric("[")
    standardGeneric for "[" defined from package "base"
    function (x, i, j, ..., drop = TRUE)
    standardGeneric("[", .Primitive("["))
    <bytecode: 0x32e25c8>
    <environment: 0x32d7a50>
    Methods may be defined for arguments: x, i, j
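A basic sketch of what such methods might look like, following the argument lists of the generics. The class `Bag` and its slot `items` are hypothetical names invented for this illustration, and the methods only cover the single-index case (`j` missing).

```r
library(methods)

# Hypothetical S4 class that wraps a named list.
setClass("Bag", representation(items = "list"))

# `[` returns a smaller Bag; j must be missing in this sketch.
setMethod("[", signature(x = "Bag", i = "ANY", j = "missing"),
  function(x, i, j, ..., drop = TRUE) {
    initialize(x, items = x@items[i])
  })

# `[[` returns a single element of the underlying list.
setMethod("[[", signature(x = "Bag", i = "ANY", j = "missing"),
  function(x, i, j, ...) {
    x@items[[i]]
  })

# `$` looks an element up by name.
setMethod("$", "Bag", function(x, name) {
  x@items[[name]]
})

b <- new("Bag", items = list(a = 1, b = "two", c = 3))
smaller <- b[c("a", "c")]
second  <- b[["b"]]
first   <- b$a
```

Here `smaller` is again a `Bag`, while `second` and `first` are the stored elements themselves.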

How to replace NA with mean by subset in R (impute with plyr?)

南笙酒味 submitted on 2019-11-26 17:21:14
I have a dataframe with the lengths and widths of various arthropods from the guts of salamanders. Because some guts had thousands of certain prey items, I only measured a subset of each prey type. I now want to replace each unmeasured individual with the mean length and width for that prey. I want to keep the dataframe and just add imputed columns (length2, width2). The main reason is that each row also has columns with data on the date and location the salamander was collected. I could fill in the NA with a random selection of the measured individuals but for the sake of argument let's
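A minimal sketch of per-group mean imputation for the situation described above. It uses base R's `ave()` rather than plyr, and the data frame `prey`, its column `prey_type`, and the helper `impute_group_mean` are made-up names for illustration only.

```r
# Hypothetical sample data: NA marks individuals that were not measured.
prey <- data.frame(
  prey_type = c("ant", "ant", "ant", "mite", "mite"),
  length    = c(2.0, 4.0, NA, 0.5, NA),
  width     = c(1.0, NA, 3.0, 0.2, 0.4)
)

# Hypothetical helper: replace NA in x with the mean of its group g.
impute_group_mean <- function(x, g) {
  ave(x, g, FUN = function(v) {
    v[is.na(v)] <- mean(v, na.rm = TRUE)
    v
  })
}

# Keep the original columns and add imputed ones alongside them.
prey$length2 <- impute_group_mean(prey$length, prey$prey_type)
prey$width2  <- impute_group_mean(prey$width, prey$prey_type)
print(prey)
```

The original `length` and `width` columns are left untouched, so the other per-row data (date, location) stay aligned with the imputed values.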

Delete duplicate rows in two columns simultaneously [duplicate]

ぃ、小莉子 submitted on 2019-11-26 17:16:17
Question: This question already has answers here: duplicates in multiple columns (2 answers). Closed 3 years ago.
I would like to delete duplicate rows based on two columns instead of just one. My input df:
    RAW.PVAL GR allrl Bak
    0.05     fr EN1   B12
    0.05     fg EN1   B11
    0.45     fr EN2   B10
    0.35     fg EN2   B066
My output:
    RAW.PVAL GR allrl Bak
    0.05     fr EN1   B12
    0.45     fg EN2   B10
    0.35     fg EN2   B066
I tried df <- subset(df, !duplicated(allrl, RAW.PVAL)), but it does not delete rows based on these two columns simultaneously
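A likely reason the attempt above does not behave as intended is that the second positional argument of `duplicated()` is `incomparables`, not a second key column. A minimal sketch of one alternative, passing both columns as a data frame (the input is rebuilt from the table in the question):

```r
df <- data.frame(
  RAW.PVAL = c(0.05, 0.05, 0.45, 0.35),
  GR       = c("fr", "fg", "fr", "fg"),
  allrl    = c("EN1", "EN1", "EN2", "EN2"),
  Bak      = c("B12", "B11", "B10", "B066")
)

# duplicated() on a data frame compares whole rows of the selected columns,
# so a row is dropped only when BOTH allrl and RAW.PVAL repeat an earlier row.
deduped <- df[!duplicated(df[, c("allrl", "RAW.PVAL")]), ]
print(deduped)
```

The first occurrence of each (allrl, RAW.PVAL) pair is kept, so the B11 row is the only one removed here.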

Return df with a columns values that occur more than once [duplicate]

时光毁灭记忆、已成空白 submitted on 2019-11-26 17:16:03
Question: This question already has an answer here: Subset data frame based on number of rows per group (3 answers).
I have a data frame df, and I am trying to subset all rows whose value in column B occurs more than once in the dataset. I tried using table to do it, but am having trouble subsetting from the table:
    t <- table(df$B)
Then I try subsetting it using:
    subset(df, table(df$B) > 1)
And I get the error "Error in x[subset & !is.na(subset)] : object of type 'closure' is not subsettable". How can I
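One way to make the `table()` idea work is to turn the counts into a set of repeated values first and then match rows against that set. A minimal sketch with made-up sample data (the column `A` and the values in `B` are assumptions for illustration):

```r
# Hypothetical sample data.
df <- data.frame(
  A = 1:6,
  B = c("x", "y", "x", "z", "y", "w")
)

# Count each value of B, then keep the values seen more than once.
counts   <- table(df$B)
repeated <- names(counts)[counts > 1]

# Row-wise test: is this row's B one of the repeated values?
result <- df[df$B %in% repeated, ]
print(result)
```

The key difference from `subset(df, table(df$B) > 1)` is that the condition here has one element per row of `df`, whereas the table has one element per distinct value.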

Keeping only certain rows of a data frame based on a set of values

青春壹個敷衍的年華 submitted on 2019-11-26 17:03:53
Question: I have a data frame with an ID column and a few columns for values. I would like to keep only certain rows of the data frame, based on whether the value of ID in that row matches another set of values (for instance, called "keep"). For simplicity, here is an example:
    df <- data.frame(ID = sample(rep(letters, each=3)), value = rnorm(n=26*3))
    keep <- c("a", "d", "r", "x")
How can I create a new data frame consisting only of rows whose IDs match those in keep? I can do this for
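A minimal sketch of one common approach, using `%in%` to test each ID against the set of values to keep. The `set.seed()` call is an addition so that the example is reproducible; the rest follows the question's sample data.

```r
set.seed(1)  # added for reproducibility; not part of the question
df <- data.frame(ID = sample(rep(letters, each = 3)), value = rnorm(n = 26 * 3))
keep <- c("a", "d", "r", "x")

# %in% returns one TRUE/FALSE per row, which indexes the rows directly.
kept <- df[df$ID %in% keep, ]
print(kept)
```

Because each letter appears three times in the sample data, `kept` ends up with twelve rows.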

from data table, randomly select one row per group

≯℡__Kan透↙ submitted on 2019-11-26 16:56:05
Question: I'm looking for an efficient way to select rows from a data table such that I have one representative row for each unique value in a particular column. Let me propose a simple example:
    require(data.table)
    y = c('a','b','c','d','e','f','g','h')
    x = sample(2:10,8,replace = TRUE)
    z = rep(y,x)
    dt = as.data.table( z )
My objective is to subset data table dt by sampling one row for each letter a-h in column z.
Answer 1: OP provided only a single column in the example. Assuming that there are multiple
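The sampling step can be illustrated without data.table. The following is a base R sketch of the same idea, not the answer's own code; the extra column `value` and the `set.seed()` call are assumptions added so the example has more than one column and is reproducible.

```r
set.seed(42)  # added for reproducibility
y <- c("a", "b", "c", "d", "e", "f", "g", "h")
x <- sample(2:10, 8, replace = TRUE)
df <- data.frame(z = rep(y, x), value = seq_len(sum(x)))  # `value` is hypothetical

# Row numbers belonging to each letter.
rows_by_group <- split(seq_len(nrow(df)), df$z)

# Pick one row number per group; sample.int() on the length avoids the
# sample(n, 1) pitfall when a group happens to contain a single row.
picked <- vapply(
  rows_by_group,
  function(idx) idx[sample.int(length(idx), 1)],
  integer(1)
)

one_per_group <- df[picked, ]
print(one_per_group)
```

In data.table itself, the same per-group sampling is often written as `dt[, .SD[sample(.N, 1)], by = z]` when there are further columns besides `z`.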

How to pass “nothing” as an argument to `[` for subsetting?

眉间皱痕 submitted on 2019-11-26 16:43:25
Question: I was hoping to be able to construct a do.call formula for subsetting without having to identify the actual range of every dimension in the input array. The problem I'm running into is that I can't figure out how to mimic the direct call x[,,1:n,], where no entry in the other dimensions means "grab all elements." Here's some sample code, which fails. As far as I can tell, either [ or do.call replaces my NULL list values with 1 for the index.
    x<-array(1:6,c(2,3))
    dimlist<-vector('list',
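One workaround, sketched under the assumption that "everything" along a dimension can be expressed as a logical `TRUE` index (which is recycled over the whole dimension) instead of an empty argument. The three-dimensional array below is made up for illustration.

```r
# Hypothetical 2 x 3 x 4 array.
x <- array(1:24, c(2, 3, 4))

# Start with "take everything" for every dimension ...
idx <- rep(list(TRUE), length(dim(x)))
# ... then restrict only the dimension of interest.
idx[[3]] <- 1:2

# Equivalent to x[, , 1:2], built without writing out the other ranges.
sub <- do.call(`[`, c(list(x), idx))
print(dim(sub))
```

Unlike `NULL`, which selects nothing, `TRUE` selects every element along its dimension, so the list can be built programmatically for an array of any rank.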

Subset data.table by logical column

雨燕双飞 submitted on 2019-11-26 16:39:51
Question: I have a data.table with a logical column. Why can't the name of the logical column be used directly for the i argument? See the example.
    dt <- data.table(x = c(T, T, F, T), y = 1:4)
    # Works
    dt[dt$x]
    dt[!dt$x]
    # Works
    dt[x == T]
    dt[x == F]
    # Does not work
    dt[x]
    dt[!x]
Answer 1: From ?data.table:
    Advanced: When i is a single variable name, it is not considered an expression of column names and is instead evaluated in calling scope.
So dt[x] will try to evaluate x in the calling scope (in this case

Set global object in Shiny

感情迁移 submitted on 2019-11-26 16:39:11
Question: Let's say I have the following server.R file in shiny:
    shinyServer(function(input, output) {
      output$plot <- renderPlot({
        data2 <- data[data$x == input$z, ] # subsetting large dataframe
        plot(data2$x, data2$y)
      })
      output$table <- renderTable({
        data2 <- data[data$x == input$z, ] # same subset. Oh, boy...
        summary(data2$x)
      })
    })
What can I do in order to not have to run data2 <- data[data$x == input$z, ] within every render call? If I do the following, I get an "object of type 'closure' is not
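A common pattern for sharing a subset between outputs is a reactive expression. The sketch below is one possible arrangement, not necessarily what the answers to this question propose: `big_data` is a hypothetical stand-in for the large data frame (deliberately not named `data`, which is also the name of a base R function), and the table output shows `head()` of the subset rather than the question's summary to keep the example simple.

```r
# Hypothetical stand-in for the large data frame from the question.
big_data <- data.frame(x = rep(1:3, each = 4), y = seq_len(12))

server <- function(input, output) {
  # Reactive expression: recomputed only when input$z changes,
  # then cached and shared by every output that calls data2().
  data2 <- shiny::reactive({
    big_data[big_data$x == input$z, ]
  })

  output$plot <- shiny::renderPlot({
    plot(data2()$x, data2()$y)
  })

  output$table <- shiny::renderTable({
    head(data2())
  })
}
```

Note that the reactive is called like a function, `data2()`, inside the render blocks. The function `server` would be passed to `shinyServer()` or `shinyApp()` together with a UI that defines an input named `z`.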