Error when running factor() on a column of a data frame


I have a data frame with several columns. I want to run factor() on one of them, say one named my_col. Initially I did it this way:

df[,\"         


        
1 Answer
  •  再見小時候
    2020-12-11 22:38

    Your data is a tbl_df. I don't have your data, but we can look at an example using mtcars.

    library(dplyr)
    
    tbl_df(mtcars)[, "mpg"]
    # Source: local data frame [32 x 1]
    # 
    #      mpg
    #    (dbl)
    # 1   21.0
    # 2   21.0
    # 3   22.8
    # 4   21.4
    # 5   18.7
    # 6   18.1
    # 7   14.3
    # 8   24.4
    # 9   22.8
    # 10  19.2
    # ..   ...
    

    The result is still a data frame, whereas base R would have dropped it to an atomic vector. Unlike [.data.frame from base R, dplyr:::`[.tbl_df` does not drop a single-column selection down to a vector. factor() expects an atomic vector, not a data frame, which is why the call fails.

    factor(tbl_df(mtcars)[, "mpg"])
    # Error in sort.list(y) : 'x' must be atomic for 'sort.list'
    # Have you called 'sort' on a list?
    

    So you'll need to use [[, as in df[["my_col"]], or just use $.

    df[["my_col"]] <- factor(df[["my_col"]])
    

    Note: When you use the $ operator you can do it without the quotes around the column name.

    df$my_col <- factor(df$my_col)
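
    As a minimal sketch of the difference, here is a base R version that needs no extra packages. It assumes a small made-up data frame (the values are illustrative only, and df and my_col are just the placeholder names from the question) and uses drop = FALSE to mimic how a tbl_df keeps a single-column selection as a data frame.

```r
# Hypothetical toy data; `df` and `my_col` are placeholders from the question.
df <- data.frame(my_col = c("b", "a", "b"), n = 1:3, stringsAsFactors = FALSE)

# drop = FALSE makes base R behave like a tbl_df for this selection:
# the single column stays wrapped in a data frame.
kept <- df[, "my_col", drop = FALSE]
is.data.frame(kept)     # TRUE

# [[ extracts the column itself, for data frames and tbl_df alike.
col <- df[["my_col"]]
is.atomic(col)          # TRUE

# Converting the extracted vector works as expected.
df[["my_col"]] <- factor(col)
levels(df[["my_col"]])  # "a" "b"
```

    Checking is.data.frame() or is.atomic() on whatever you are about to pass to factor() is a quick way to tell which of the two cases you are in.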
    
