Filtering a data frame

我寻月下人不归 2020-12-29 12:36

I have read in a csv file in matrix form (having m rows and n columns). I want to filter the matrix by conducting a filter in verbal form:

Select all values from co

3 answers
  • 2020-12-29 13:10

    Assuming that dat is the data frame in question, col is the name of the column and "value" is the value that you want, you can do

    dat[dat$col=="value",]

    That fetches all of the rows of dat for which dat$col=="value", and all of the columns.
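
    As a minimal sketch of the same idea on a small made-up data frame (the column names col and n and all of the values are assumptions for illustration, not taken from the question):

    ```r
    # Hypothetical data; names and values are illustrative only.
    dat <- data.frame(col = c("value", "other", "value"),
                      n = c(10, 20, 30),
                      stringsAsFactors = FALSE)

    # Logical vector: TRUE for the rows where col equals "value".
    keep <- dat$col == "value"

    # Rows where keep is TRUE, all columns (note the trailing comma).
    result <- dat[keep, ]
    print(result)
    ```

    If col may contain NA, the comparison yields NA for those rows and the indexing returns all-NA rows for them; wrapping the condition in which() is one way to avoid that.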

  • 2020-12-29 13:20

    You said you just wanted the column x values where column_values was 15, right?

    subset(dat, column_values==15, select=x)
    

    I think this may come back as a data frame, so it's possible you may need to unlist() it and maybe even "unfactor" it.

    > dat
      Subject Product
    1       1   ProdA
    2       1   ProdB
    3       1   ProdC
    4       2   ProdB
    5       2   ProdC
    6       2   ProdD
    7       3   ProdA
    8       3   ProdB
    > subset(dat, Subject==2, Product)
      Product
    4   ProdB
    5   ProdC
    6   ProdD
    > unlist( subset(dat, Subject==2, Product) )
    Product1 Product2 Product3 
       ProdB    ProdC    ProdD 
    Levels: ProdA ProdB ProdC ProdD
    > as.character( unlist( subset(dat, Subject==2, Product) ) )
    [1] "ProdB" "ProdC" "ProdD"
    

    If you want all of the columns you can drop the third argument (the select= argument):

    subset(dat, Subject==2 )
    
      Subject Product
    4       2   ProdB
    5       2   ProdC
    6       2   ProdD
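
    To reproduce the transcript above, dat could be rebuilt roughly as follows. This is only a sketch: the original construction of dat is not shown, and stringsAsFactors = TRUE is an assumption made so that Product is a factor, as the Levels: line in the output suggests.

    ```r
    # Assumed reconstruction of dat from the printed output above.
    dat <- data.frame(
      Subject = c(1, 1, 1, 2, 2, 2, 3, 3),
      Product = c("ProdA", "ProdB", "ProdC", "ProdB",
                  "ProdC", "ProdD", "ProdA", "ProdB"),
      stringsAsFactors = TRUE
    )

    # select = Product keeps the result as a one-column data frame.
    sel <- subset(dat, Subject == 2, select = Product)

    # Flatten the data frame and convert the factor to plain strings.
    products <- as.character(unlist(sel))
    print(products)
    ```

    With bracket indexing, dat[dat$Subject == 2, "Product"] returns the factor itself rather than a data frame, because a single selected column is dropped to a vector by default.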
    
  • 2020-12-29 13:20

    First, note that a matrix and a data.frame are different things in R. I imagine you have a data.frame, since that is what read.csv() returns. A data.frame has named columns (if you don't supply names, generic ones are created for you).

    You can subset a data.frame by indicating which rows you want, which columns you want, or both. The easiest way to specify rows is with a logical vector, often built from comparisons on specific columns of the data.frame. For example, data[["column values"]] == "15" makes a logical vector that is TRUE wherever the corresponding entry in the column named column values is the string "15" (since it is in quotes, it is a string, not a number). You can make the selection criteria as complicated as you like (combining logical vectors with & and |) to specify the rows you want. This vector becomes the first argument in the indexing.

    A vector of column names or numbers can be the second argument. If either argument is missing, all rows (or columns) are assumed.

    Putting this all together, you get examples like

    data[data[["column values"]] == "15", ]
    

    or using an actual data set (mtcars)

    mtcars[mtcars$am == 1, ]
    mtcars[mtcars$am == 1 & mtcars$hp > 100, "mpg"]
    mtcars[mtcars$am == 1 & mtcars$hp > 100, "mpg", drop=FALSE]
    mtcars[mtcars$hp > 100, c("mpg", "carb")]
    

    Take a look at what each of the conditionals (first arguments, e.g. mtcars$am == 1 & mtcars$hp > 100) return to get a better sense of how indexing works.
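
    One way to do that inspection is sketched below, using the mtcars data set that ships with R. The variable names (is_manual, over_100_hp, both, and so on) are made up for this illustration.

    ```r
    # Each comparison gives a logical vector with one entry per row.
    is_manual <- mtcars$am == 1       # in mtcars, am == 1 means manual
    over_100_hp <- mtcars$hp > 100
    both <- is_manual & over_100_hp   # element-wise AND

    print(head(both))
    print(sum(both))                  # number of rows meeting both conditions

    # Indexing with the vector keeps exactly the TRUE rows.
    picked <- mtcars[both, c("mpg", "hp", "am")]
    print(picked)

    # A single column is dropped to a plain vector unless drop = FALSE.
    as_vector <- mtcars[both, "mpg"]
    as_frame <- mtcars[both, "mpg", drop = FALSE]
    print(class(as_vector))
    print(class(as_frame))
    ```

    The last two lines show what drop=FALSE changes in the examples above: without it a one-column selection comes back as a numeric vector, with it the result stays a data.frame.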
