I have read a CSV file into R as a matrix-like table (m rows and n columns). I want to filter it using a rule that I can state in words:
Select all values from co
Assuming that dat is the data frame in question, col is the name of the column, and "value" is the value that you want, you can do
dat[dat$col=="value",]
That fetches all of the rows of dat for which dat$col=="value", and all of the columns.
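As a minimal sketch of that indexing, here is a toy data frame (the data and the column n are made up for illustration; dat and col follow the names used above). It also shows one thing worth knowing: if the column contains NA, the comparison yields NA at that position, and plain logical indexing returns a row of NAs there. Wrapping the condition in which() is one way to avoid that.

```r
# Hypothetical toy data; only the names dat and col come from the answer above.
dat <- data.frame(col = c("value", "other", NA, "value"),
                  n   = 1:4,
                  stringsAsFactors = FALSE)

# The comparison gives a logical vector: TRUE, FALSE, NA, TRUE
keep <- dat$col == "value"

# Plain logical indexing keeps the NA position as a row filled with NA
with_na <- dat[keep, ]

# which() returns only the positions that are TRUE, so the NA is dropped
without_na <- dat[which(keep), ]

print(with_na)
print(without_na)
```

If the column has no missing values, the two forms give the same result.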
You said you just wanted the column x values where column_values was 15, right?
subset(dat, column_values==15, select=x)
I think this comes back as a data frame, so it's possible you may need to unlist() it, and maybe even "unfactor" it with as.character().
> dat
Subject Product
1 1 ProdA
2 1 ProdB
3 1 ProdC
4 2 ProdB
5 2 ProdC
6 2 ProdD
7 3 ProdA
8 3 ProdB
> subset(dat, Subject==2, Product)
Product
4 ProdB
5 ProdC
6 ProdD
> unlist( subset(dat, Subject==2, Product) )
Product1 Product2 Product3
ProdB ProdC ProdD
Levels: ProdA ProdB ProdC ProdD
> as.character( unlist( subset(dat, Subject==2, Product) ) )
[1] "ProdB" "ProdC" "ProdD"
If you want all of the columns you can drop the third argument (the select= argument):
subset(dat, Subject==2 )
Subject Product
4 2 ProdB
5 2 ProdC
6 2 ProdD
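For comparison, here is a sketch of the same selection done with bracket indexing instead of subset(). It rebuilds the dat shown in the transcript above. That Product is a factor is an assumption, based on the "Levels:" line in the output. Selecting a single column by name with [ returns a vector directly, so no unlist() is needed.

```r
# Rebuild the example data; stringsAsFactors = TRUE is assumed so that
# Product is a factor, as the "Levels:" line in the transcript suggests.
dat <- data.frame(
  Subject = c(1, 1, 1, 2, 2, 2, 3, 3),
  Product = c("ProdA", "ProdB", "ProdC", "ProdB",
              "ProdC", "ProdD", "ProdA", "ProdB"),
  stringsAsFactors = TRUE
)

# subset() with select= returns a one-column data frame
via_subset <- subset(dat, Subject == 2, select = Product)

# Bracket indexing with a single column name returns the column as a vector
via_bracket <- dat[dat$Subject == 2, "Product"]

# Either way, as.character() turns the factor values into plain strings
products <- as.character(via_bracket)
print(products)
```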
First, note that a matrix and a data.frame are different things in R. I imagine you have a data.frame (as that is what is returned by read.csv()). data.frames have named columns (if you don't give them names, generic ones are created for you).
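A quick way to check this yourself is sketched below. The inline CSV text is made up for illustration, and it is passed through the text= argument of read.csv() so that no file is needed.

```r
# Read two rows of made-up CSV data with no header line
df <- read.csv(text = "1,2\n3,4", header = FALSE)

# read.csv() returns a data.frame, not a matrix
print(is.data.frame(df))
print(is.matrix(df))

# With no header, R creates generic column names
print(names(df))

# An explicit conversion is needed to get an actual matrix
m <- as.matrix(df)
print(is.matrix(m))
```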
You can subset a data.frame by indicating which rows and/or which columns you want. The easiest way to specify the rows is with a logical vector, often built out of comparisons using specific columns of the data.frame. For example, data[["column values"]] == "15" would make a logical vector which is TRUE wherever the corresponding entry in the column "column values" is the string "15" (since it is in quotes, it is a string, not a number). You can make the selection criteria as complicated as you like (combining logical vectors with & and |) to specify the rows you want. This vector becomes the first argument in the indexing.
A vector of column names or numbers can be the second argument. If either argument is missing, all rows (or all columns) are assumed.
Putting this all together, you get examples like
data[data[["column values"]] == "15", ]
or, using an actual data set (mtcars):
mtcars[mtcars$am == 1, ]
mtcars[mtcars$am == 1 & mtcars$hp > 100, "mpg"]
mtcars[mtcars$am == 1 & mtcars$hp > 100, "mpg", drop=FALSE]
mtcars[mtcars$hp > 100, c("mpg", "carb")]
Take a look at what each of the conditions (the first arguments, e.g. mtcars$am == 1 & mtcars$hp > 100) returns to get a better sense of how indexing works.
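One way to do that inspection, as a rough sketch using the built-in mtcars data (the variable names cond, v and d are just for illustration):

```r
# Store the condition on its own so it can be examined
cond <- mtcars$am == 1 & mtcars$hp > 100

# It is a logical vector with one entry per row of mtcars
print(is.logical(cond))
print(length(cond) == nrow(mtcars))
print(head(cond))

# sum() counts the TRUE entries, i.e. how many rows will be selected
print(sum(cond))

# A single column name returns a plain vector ...
v <- mtcars[cond, "mpg"]
# ... while drop = FALSE keeps the result as a one-column data.frame
d <- mtcars[cond, "mpg", drop = FALSE]
print(class(v))
print(class(d))
```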