Filtering a data frame


I have read a CSV file into R in matrix form (m rows and n columns). I want to filter it using a condition that I can state in words:

Select all values from co


3 Answers

野的像风

2020-12-29 13:10

Assuming that dat is the data frame in question, col is the name of the column and "value" is the value that you want, you can do

dat[dat$col=="value",]

That fetches all of the rows of dat for which dat$col=="value", and all of the columns.
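
One caveat worth illustrating: if the column contains NA, the comparison yields NA for those rows, and logical indexing then returns an extra row filled with NA. The sketch below uses a small made-up data frame (the values and the score column are assumptions for illustration, not from the question) and shows which() as one way to keep only the rows where the condition is TRUE.

```r
# Hypothetical data; the names dat, col and "value" follow the answer above.
dat <- data.frame(
  col   = c("value", "other", NA, "value"),
  score = c(10, 20, 30, 40)
)

# Plain logical indexing: the NA in the condition produces a row of NAs.
with_na <- dat[dat$col == "value", ]

# which() returns only the positions where the condition is TRUE,
# so the NA row is dropped.
without_na <- dat[which(dat$col == "value"), ]

print(nrow(with_na))
print(nrow(without_na))
```

If the column has no missing values, both forms return the same rows.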


星月不相逢

2020-12-29 13:20

You said you just wanted the column x values where column_values was 15, right?

subset(dat, column_values==15, select=x)

I think this comes back as a data frame, so it's possible you will need to unlist() it and maybe even "unfactor" it (convert the factor to character).

> dat
  Subject Product
1       1   ProdA
2       1   ProdB
3       1   ProdC
4       2   ProdB
5       2   ProdC
6       2   ProdD
7       3   ProdA
8       3   ProdB
> subset(dat, Subject==2, Product)
  Product
4   ProdB
5   ProdC
6   ProdD
> unlist( subset(dat, Subject==2, Product) )
Product1 Product2 Product3 
   ProdB    ProdC    ProdD 
Levels: ProdA ProdB ProdC ProdD
> as.character( unlist( subset(dat, Subject==2, Product) ) )
[1] "ProdB" "ProdC" "ProdD"

If you want all of the columns you can drop the third argument (the select= argument):

subset(dat, Subject==2 )

  Subject Product
4       2   ProdB
5       2   ProdC
6       2   ProdD
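
To try this without the console transcript, here is a minimal sketch that rebuilds the example data. It assumes Product is a factor, as the "Levels:" line in the output suggests; since R 4.0.0, data.frame() keeps strings as character by default, so stringsAsFactors = TRUE is set explicitly here to reproduce that behaviour.

```r
# Rebuild the example data frame shown in the transcript.
dat <- data.frame(
  Subject = c(1, 1, 1, 2, 2, 2, 3, 3),
  Product = c("ProdA", "ProdB", "ProdC", "ProdB",
              "ProdC", "ProdD", "ProdA", "ProdB"),
  stringsAsFactors = TRUE  # assumption: matches the factor output above
)

# Rows where Subject is 2, keeping only the Product column (a data frame).
res <- subset(dat, Subject == 2, select = Product)

# Pull the column out and convert the factor to plain character strings.
products <- as.character(res$Product)
print(products)
```

With character columns (the default in recent R versions), the as.character() step is harmless but no longer needed.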


你的背包

2020-12-29 13:20
First, note that a matrix and a data.frame are different things in R. I imagine you have a data.frame (as that is what is returned by read.csv()). A data.frame has named columns (if you don't supply names, generic ones are created for you).

You can subset a data.frame by indicating which rows you want, which columns you want, or both. The easiest way to specify rows is with a logical vector, often built from comparisons on specific columns of the data.frame. For example, data[["column values"]] == "15" makes a logical vector that is TRUE wherever the corresponding entry in the column named "column values" is the string "15" (since it is in quotes, it is a string, not a number). You can make the selection criterion as complicated as you like (combining logical vectors with & and |) to specify the rows you want. This vector becomes the first argument in the indexing.

A vector of column names or numbers can be the second argument. If either argument is missing, all rows (or columns) are assumed.
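
As a rough sketch of the row and column arguments working together, the following uses a small invented data frame (the columns x and y and all values are assumptions for illustration). check.names = FALSE is what lets the column keep a name with a space in it.

```r
# Hypothetical data with a column name containing a space.
data <- data.frame(
  `column values` = c("15", "20", "15", "30"),
  x = c(1.5, 2.5, 3.5, 4.5),
  y = c("a", "b", "c", "d"),
  check.names = FALSE
)

# Logical vector: TRUE where the column holds the string "15".
rows <- data[["column values"]] == "15"

all_columns <- data[rows, ]              # column argument missing: all columns
all_rows    <- data[, c("x", "y")]       # row argument missing: all rows
either      <- data[rows | data$x > 4, ] # conditions combined with |

print(all_columns)
print(either)
```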

Putting this all together, you get examples like
```
data[data[["column values"]] == "15", ]
```
or using an actual data set (mtcars)
```
mtcars[mtcars$am == 1, ]
mtcars[mtcars$am == 1 & mtcars$hp > 100, "mpg"]
mtcars[mtcars$am == 1 & mtcars$hp > 100, "mpg", drop=FALSE]
mtcars[mtcars$hp > 100, c("mpg", "carb")]
```
Take a look at what each of the conditionals (the first arguments, e.g. mtcars$am == 1 & mtcars$hp > 100) returns to get a better sense of how indexing works.
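
One way to do that inspection, sketched with the built-in mtcars data set: store the condition in a variable (keep is just an illustrative name), look at it, and then compare the result with and without drop = FALSE.

```r
# The condition on its own is a logical vector, one element per row.
keep <- mtcars$am == 1 & mtcars$hp > 100
print(class(keep))
print(length(keep))
print(sum(keep))  # number of rows that satisfy both conditions

# Selecting a single column returns a plain vector by default...
mpg_vector <- mtcars[keep, "mpg"]
print(class(mpg_vector))

# ...while drop = FALSE keeps the result as a one-column data frame.
mpg_frame <- mtcars[keep, "mpg", drop = FALSE]
print(class(mpg_frame))
```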