subset | 易学教程

Select rows from a data frame based on values in a vector


Question: I have data similar to this:

dt <- structure(list(fct = structure(c(1L, 2L, 3L, 4L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 3L, 2L, 3L, 4L), .Label = c("a", "b", "c", "d"), class = "factor"), X = c(2L, 4L, 3L, 2L, 5L, 4L, 7L, 2L, 9L, 1L, 4L, 2L, 5L, 4L, 2L)), .Names = c("fct", "X"), class = "data.frame", row.names = c(NA, -15L))

I want to select rows from this data frame based on the values in the `fct` variable. For example, if I wish to select rows containing either "a" or "c" I can
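One common way to select rows by a vector of values is the `%in%` operator. The sketch below rebuilds the same data with `data.frame()` for readability. The names `keep` and `selected` are illustrative and do not come from the question.

```r
# Same data as the structure() call in the question, built with data.frame().
dt <- data.frame(
  fct = factor(c("a", "b", "c", "d", "c", "d", "a", "b",
                 "c", "a", "b", "c", "b", "c", "d")),
  X = c(2L, 4L, 3L, 2L, 5L, 4L, 7L, 2L, 9L, 1L, 4L, 2L, 5L, 4L, 2L)
)

# Hypothetical vector of values to keep.
keep <- c("a", "c")

# %in% returns TRUE for each row whose fct value appears in keep.
selected <- dt[dt$fct %in% keep, ]
```

Adding or removing values in `keep` changes the selection without touching the condition itself.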

Why is `[` better than `subset`?


Question: When I need to filter a data.frame, i.e., extract rows that meet certain conditions, I prefer to use the `subset` function: `subset(airquality, Month == 8 & Temp > 90)` rather than the `[` function: `airquality[airquality$Month == 8 & airquality$Temp > 90, ]`. There are two main reasons for my preference: I find the code reads better, from left to right. Even people who know nothing about R could tell what the `subset` statement above is doing. Because columns can be referred to as variables in the
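For this particular condition the two forms should return the same rows, which can be checked directly. This is a minimal sketch using the built-in `airquality` dataset. The names `by_subset`, `by_bracket`, `cond`, and `same_rows` are illustrative.

```r
# airquality ships with base R (datasets package).
by_subset <- subset(airquality, Month == 8 & Temp > 90)

# The same filter written with `[`; the condition must spell out the
# data frame name for every column it uses.
cond <- airquality$Month == 8 & airquality$Temp > 90
by_bracket <- airquality[cond, ]

# Compare the two results.
same_rows <- isTRUE(all.equal(by_subset, by_bracket))
```

The check passes here because `Month` and `Temp` contain no missing values. With a condition that evaluates to `NA` for some rows, `subset` drops those rows while `[` returns them as rows of `NA`.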

Filter data.frame rows by a logical condition


Question: I want to filter rows from a data.frame based on a logical condition. Let's suppose that I have a data frame like

   expr_value     cell_type
1    5.345618 bj fibroblast
2    5.195871 bj fibroblast
3    5.247274 bj fibroblast
4    5.929771          hesc
5    5.873096          hesc
6    5.665857          hesc
7    6.791656          hips
8    7.133673          hips
9    7.574058          hips
10   7.208041          hips
11   7.402100          hips
12   7.167792          hips
13   7.156971          hips
14   7.197543          hips
15   7.035404          hips
16   7.269474          hips
17   6.715059          hips
18   7.434339          hips
19   6.997586          hips
20   7.619770          hips
21 7
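The excerpt is cut off before it states the condition, so the sketch below assumes a filter on `cell_type == "hesc"` purely for illustration. It uses only the first six rows of the table. The names `expr` and `hesc_rows` are hypothetical.

```r
# First six rows of the table above.
expr <- data.frame(
  expr_value = c(5.345618, 5.195871, 5.247274, 5.929771, 5.873096, 5.665857),
  cell_type  = c("bj fibroblast", "bj fibroblast", "bj fibroblast",
                 "hesc", "hesc", "hesc")
)

# Assumed condition: keep only the "hesc" rows.
# A logical vector in the row position of `[` selects the TRUE rows.
hesc_rows <- expr[expr$cell_type == "hesc", ]
```

Any expression that yields one logical value per row can replace the assumed condition, for example `expr$expr_value > 5.5`.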

Subset dataframe by multiple logical conditions of rows to remove


I would like to subset (filter) a dataframe by specifying which rows not (`!`) to keep in the new dataframe. Here is a simplified sample dataframe:

data
v1 v2 v3 v4
a  v  d  c
a  v  d  d
b  n  p  g
b  d  d  h
c  k  d  c
c  r  p  g
d  v  d  x
d  v  d  c
e  v  d  b
e  v  d  c

For example, if a row of column v1 has a "b", "d", or "e", I want to get rid of that row of observations, producing the following dataframe:

v1 v2 v3 v4
a  v  d  c
a  v  d  d
c  k  d  c
c  r  p  g

I have been successful at subsetting based on one condition at a time. For example, here I remove rows where v1 contains a "b":

sub.data <- data[data[ , 1] != "b", ]
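One way to drop several values in a single step is to negate an `%in%` test. The sketch below is a possible approach under the sample data above, not necessarily the answer given in the original thread. The name `drop_values` is illustrative.

```r
# Sample data frame from the question.
data <- data.frame(
  v1 = c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e"),
  v2 = c("v", "v", "n", "d", "k", "r", "v", "v", "v", "v"),
  v3 = c("d", "d", "p", "d", "d", "p", "d", "d", "d", "d"),
  v4 = c("c", "d", "g", "h", "c", "g", "x", "c", "b", "c")
)

# Hypothetical vector of v1 values whose rows should be removed.
drop_values <- c("b", "d", "e")

# %in% marks the rows to remove; ! inverts the test to keep the rest.
sub.data <- data[!(data$v1 %in% drop_values), ]
```

This keeps the rows where `v1` is "a" or "c", matching the expected result in the question.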