Inquiring a better way to write code in R

Submitted by 拥有回忆 on 2020-01-06 11:22:48

Question


I am new to R and would like help finding a better way to write the following code. Any help would be appreciated.

df$rank[between(df$score,0,1.2)] <- 1
df$rank[between(df$score,1.2,2.1)] <- 2
df$rank[between(df$score,2.1,2.9)] <- 3
df$rank[between(df$score,2.9,3.7)] <- 4
df$rank[between(df$score,3.7,4.5)] <- 5
df$rank[between(df$score,4.5,5.4)] <- 6

Answer 1:


You can use cut:

df$rank <- cut(x = df$score, breaks = c(0, 1.2, 2.1, 2.9, 3.7, 4.5, 5.4, Inf), labels = FALSE)
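One detail worth checking with cut is how it treats the boundaries. By default its intervals are open on the left and closed on the right, so a score of exactly 0 falls outside the first bin unless include.lowest = TRUE is set. The sketch below uses made-up scores (not from the question) and leaves out the Inf break, so anything above 5.4 becomes NA, as in the question's original code.

```r
# Hypothetical sample scores chosen to sit on and around the bin edges
score  <- c(0, 0.5, 1.2, 1.3, 2.1, 3.0, 4.5, 5.4, 5.5)
breaks <- c(0, 1.2, 2.1, 2.9, 3.7, 4.5, 5.4)

# Default: each bin is (lower, upper], so 0 is not in any bin
rank_default <- cut(score, breaks = breaks, labels = FALSE)

# include.lowest = TRUE turns the first bin into [0, 1.2]
rank_closed <- cut(score, breaks = breaks, labels = FALSE, include.lowest = TRUE)

# Compare the two side by side
data.frame(score, rank_default, rank_closed)
```

The two columns differ only for the score of 0. A score of 1.2 lands in rank 1 either way, because the upper edge of each bin is included.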



Answer 2:


library(dplyr)

set.seed(1234)
df <- data.frame(rank  = rep(0, 15),
                 score = runif(15, 0, 6))
df

#>    rank      score
#> 1     0 0.68222047
#> 2     0 3.73379643
#> 3     0 3.65564840
#> 4     0 3.74027665
#> 5     0 5.16549230
#> 6     0 3.84186363
#> 7     0 0.05697454
#> 8     0 1.39530304
#> 9     0 3.99650255
#> 10    0 3.08550685
#> 11    0 4.16154775
#> 12    0 3.26984901
#> 13    0 1.69640150
#> 14    0 5.54060091
#> 15    0 1.75389504

df %>% 
  mutate(rank = case_when(between(score,   0, 1.2) ~ 1,
                          between(score, 1.2, 2.1) ~ 2,
                          between(score, 2.1, 2.9) ~ 3,
                          between(score, 2.9, 3.7) ~ 4,
                          between(score, 3.7, 4.5) ~ 5,
                          between(score, 4.5, 5.4) ~ 6))
#>    rank      score
#> 1     1 0.68222047
#> 2     5 3.73379643
#> 3     4 3.65564840
#> 4     5 3.74027665
#> 5     6 5.16549230
#> 6     5 3.84186363
#> 7     1 0.05697454
#> 8     2 1.39530304
#> 9     5 3.99650255
#> 10    4 3.08550685
#> 11    5 4.16154775
#> 12    4 3.26984901
#> 13    2 1.69640150
#> 14   NA 5.54060091
#> 15    2 1.75389504

Created on 2018-04-29 by the reprex package (v0.2.0).
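Two properties of this approach are easy to miss: between() includes both endpoints, and case_when() uses the first condition that matches. So a score exactly on a shared boundary gets the lower rank, and a score that matches nothing becomes NA (as in row 14 above). A small sketch with made-up scores, assuming dplyr is installed:

```r
library(dplyr)

# Hypothetical scores: two on shared boundaries, one outside every range
score <- c(1.2, 2.1, 5.5)

# 1.2 satisfies both of the first two conditions; the first one wins
boundary_rank <- case_when(between(score,   0, 1.2) ~ 1,
                           between(score, 1.2, 2.1) ~ 2,
                           between(score, 2.1, 2.9) ~ 3)
boundary_rank
```

Here 1.2 gets rank 1, 2.1 gets rank 2, and 5.5 is NA.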




Answer 3:


As you didn't add a reproducible example, I created a little one (but keep in mind you should always add an example).

Using ifelse from base R, you could do it this way:

# data.frame keeps this example in base R (data.table would need library(data.table))
df <- data.frame(rank = c(1.2, 3.3, 2.5, 3.7, 5.8, 6, 3, 1.1, 0.5))
df$rank2 <- ifelse(df$rank > 0 & df$rank <= 1.2, 1,
             ifelse(df$rank > 1.2 & df$rank <= 2.1, 2,
                    ifelse(df$rank > 2.1 & df$rank <= 2.9, 3,
                           ifelse(df$rank > 2.9 & df$rank <= 3.7, 4,
                                  ifelse(df$rank > 3.7 & df$rank <= 4.5, 5, 6)))))

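The innermost ifelse handles your maximum rank: its "no" argument is the value given to everything that matched none of the earlier conditions. Here that means any value above 4.5, including 5.8 and 6, gets rank 6.

If this is a recurring problem, you should create a function.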
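As a rough sketch of such a function, here is one that wraps cut rather than the nested ifelse calls. The name rank_scores and its breaks argument are made up for illustration. Unlike the ifelse version, scores outside the breaks become NA instead of falling into the last rank.

```r
# rank_scores() is a hypothetical helper, not part of any package
rank_scores <- function(score, breaks = c(0, 1.2, 2.1, 2.9, 3.7, 4.5, 5.4)) {
  # labels = FALSE returns the bin number; include.lowest = TRUE keeps a
  # score equal to the first break in bin 1; out-of-range scores become NA
  cut(score, breaks = breaks, labels = FALSE, include.lowest = TRUE)
}

# Same sample values as above, stored in a column named score
df <- data.frame(score = c(1.2, 3.3, 2.5, 3.7, 5.8, 6, 3, 1.1, 0.5))
df$rank <- rank_scores(df$score)
df
```

Passing a different breaks vector reuses the same logic for another set of ranges.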

Hope it helps.



Source: https://stackoverflow.com/questions/50089955/inquiring-a-better-way-to-write-code-in-r
