In R, how to set and retain custom levels in a factor with different labels?

邮差的信 submitted on 2021-01-28 08:30:54

Question


In R, how can I set and retain custom levels in a factor with different labels?

That is, I want to assign custom numbers (integers) to the levels of a factor and have these values retained rather than converted to "1, 2, 3, etc.".

I know that one solution is to set these weights as the labels, but then I lose the actual "labels" of the factor.

As a result, the "weighted" distance between levels is not retained. Is it possible in R to achieve something like this using a single variable?

For example:

age_f <- factor( c(1, 10, 100), levels = c( 1, 10, 100 ), labels = c( "baby", "child", "old" ), ordered = TRUE )
levels(age_f)
   [1] "baby"  "child" "old"
labels(age_f)
   [1] "1" "2" "3"
labels(levels(age_f))
   [1] "1" "2" "3"
as.numeric(age_f)
   [1] 1 2 3

Desired output:
as.numeric(age_f)
   [1] 1 10 100

If R factors do not support this, is it easy to produce such a result with a custom function?


Answer 1:


You could use the labelled package for this.

library(labelled)
labelled(c(1, 10, 100), c(baby = 1, child = 10 , old = 100))

<Labelled double>
[1]   1  10 100

Labels:
 value label
     1  baby
    10 child
   100   old

If you later want to convert it into a regular factor, you can use to_factor().
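As a minimal sketch of that conversion (assuming the labelled package is installed; the variable names age and age_f are only illustrative), the numeric values stay available on the labelled vector, and to_factor() gives a factor whose levels are the labels:

```r
library(labelled)

# Same labelled vector as in the answer above
age <- labelled(c(1, 10, 100), c(baby = 1, child = 10, old = 100))

as.numeric(age)    # the underlying custom values: 1 10 100
val_labels(age)    # named vector mapping each label to its value

# Convert to a regular factor; by default the levels are the labels
age_f <- to_factor(age)
levels(age_f)      # "baby" "child" "old"
```

Keeping the labelled vector around and converting only when a factor is really needed means the custom values are never lost.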




Answer 2:


I found a workaround that retains the custom values I assigned to the levels of a factor.

The workaround is to paste the numeric value of each level into its label, and then use a function to separate the two again into different objects.

This is equivalent to creating two separate datasets / data frames from the beginning, one with the factor labels and another with their corresponding values.

However, keeping two datasets may not be practical if you do not want to define your variables twice.

Therefore, I believe this approach adds clarity when manipulating factors: all the necessary information is in one place, and if you need to, you can still separate it into two objects.

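# Example factor: each label has the form "<value> <name>"
age_f <- factor( ordered( 1:3 ), labels = c( "1 Infant", "10 Child", "100 Old" ) )

# The function: takes a list of factors and returns the relabelled factors,
# followed by the numeric values parsed from the original labels
Leveling_Labels <- function( factors, split_arg = " " ) {

  leveling_Labels <- list()

  for( i in seq_along( factors ) )  {

    # Split the levels (not the elements), so repeated or reordered values still work
    splits      <- strsplit( levels( factors[[i]] ), split_arg )
    level_value <- as.numeric( vapply( splits, function(s) s[1], character(1) ) )
    level_name  <- vapply( splits, function(s) s[2], character(1) )

    # Look up the value of each element via its integer code, then rename the levels
    leveling_Labels[[i]]    <- level_value[ as.integer( factors[[i]] ) ]
    levels( factors[[i]] )  <- level_name

  }

  c( factors, leveling_Labels )

}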
  • The factor that was made:

age_f

[1] 1 Infant 10 Child 100 Old
Levels: 1 Infant < 10 Child < 100 Old

  • Running the function, which separates the factor labels from their values:

Leveling_Labels( list( age_f ), " " )

[[1]]
[1] Infant Child  Old
Levels: Infant < Child < Old

[[2]]
[1]   1  10 100

  • The list may contain any number of factors.
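A simpler base-R alternative, which is not what the function above does, is to keep the custom values in a named lookup vector keyed by level and index it with the factor's labels. This is only a sketch; age_values and factor_values are hypothetical names chosen for illustration:

```r
# Hypothetical lookup: level label -> custom numeric value
age_values <- c(baby = 1, child = 10, old = 100)

# An ordered factor whose levels follow the order of the lookup vector
age_f <- factor(c("baby", "old", "child", "baby"),
                levels = names(age_values), ordered = TRUE)

# Hypothetical helper: return the custom value of every element of f
factor_values <- function(f, values) unname(values[as.character(f)])

factor_values(age_f, age_values)
# [1]   1 100  10   1
```

The factor keeps its readable labels and ordering, while the numeric "distance" between levels lives in one small vector that can be reused for any factor with the same levels.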

P.S. Do you know where I can contribute self-made R functions like this? Do you know of any packages that are open to collaboration or are actively looking for contributors at a beginner level?



Source: https://stackoverflow.com/questions/54850701/in-r-how-to-set-and-retain-custom-levels-in-factor-with-different-labels
