recode

Subset values with matching criteria in R

爷,独闯天下 submitted on 2020-01-25 10:13:38
Question: I had a similar question here, but this one is slightly different. I would like to return values with matching conditions in another column based on a cut-score criterion. If the cut scores are not available in the variable, I would like to grab the closest larger value for the first and second cuts, and the closest smaller value for the third cut. Here is a snapshot of the dataset:
    ids <- c(1,2,3,4,5,6,7,8,9,10)
    scores.a <- c(512,531,541,555,562,565,570,572,573,588)
    scores.b <- c(12,13,14,15,16

recode related values in an efficient way

。_饼干妹妹 submitted on 2020-01-07 07:56:20
Question: I have a dataframe df with only one variable, var, which holds some related values.
    df <- data.frame(var = c(rep('AUS',12), rep('NZ',12), rep('ENG',7), rep('SOC',12), rep('PAK',11), rep('SRI',17), rep('IND',15)))
    df %>% count(var)
    # # A tibble: 7 x 2
    #      var     n
    #   <fctr> <int>
    # 1    AUS    12
    # 2    ENG     7
    # 3    IND    15
    # 4     NZ    12
    # 5    PAK    11
    # 6    SOC    12
    # 7    SRI    17
Based on some relations, some values should be recoded with a new value.
    df %>% mutate(var = recode(var, 'AUS' = 'A', 'NZ' = 'A', 'ENG' = 'A', 'SOC' = 'A',

Bunch recoding of variables in the tidyverse (functional / meta-programming)

血红的双手。 submitted on 2020-01-01 19:39:11
Question: I want to recode a bunch of variables with as few function calls as possible. I have one data.frame in which I want to recode a number of variables. I create a named list of all variable names and the recoding arguments I want to execute. Here I have no problem using map and dplyr. However, when it comes to recoding, I find it much easier to use recode from the car package instead of dplyr's own recoding function. A side question is whether there is a nice way of doing the same thing with dplyr

R data.table multi column recode/sub-assign [duplicate]

被刻印的时光 ゝ submitted on 2019-12-30 22:54:30
Question: This question already has answers here: Fastest way to replace NAs in a large data.table (9 answers). Closed 4 years ago. Let DT be a data.table:
    DT <- data.table(V1=sample(10), V2=sample(10), ... V9=sample(10))
Is there a better/simpler method to do a multi-column recode/sub-assign like this:
    DT[V1==1 | V1==7, V1:=NA]
    DT[V2==1 | V2==7, V2:=NA]
    DT[V3==1 | V3==7, V3:=NA]
    DT[V4==1 | V4==7, V4:=NA]
    DT[V5==1 | V5==7, V5:=NA]
    DT[V6==1 | V6==7, V6:=NA]
    DT[V7==1 | V7==7, V7:=NA]
    DT[V8==1 | V8==7, V8:=NA]
    DT[V9

Recode multiple columns using dplyr

人走茶凉 submitted on 2019-12-23 13:08:52
Question: I had a dataframe where I recoded several columns so that 999 was set to NA:
    dfB <- dfA %>%
      mutate(adhere = if_else(adhere==999, as.numeric(NA), adhere)) %>%
      mutate(engage = if_else(engage==999, as.numeric(NA), engage)) %>%
      mutate(quality = if_else(quality==999, as.numeric(NA), quality)) %>%
      mutate(undrstnd = if_else(undrstnd==999, as.numeric(NA), undrstnd)) %>%
      mutate(sesspart = if_else(sesspart==999, as.numeric(NA), sesspart)) %>%
      mutate(attended = if_else(attended>=9, as.integer(NA),

R - Recoding column with multiple text values associated with one code

自作多情 submitted on 2019-12-22 14:05:28
Question: I'm trying to recode a column to determine the shift of an employee. The data is messy, and the word I am looking for must be extracted from the text. I've been trying various routes with if statements and the stringr and dplyr packages, but I can't figure out how to get them to work together. I have this line of code, but str_match doesn't produce a true/false value.
    Data$Shift <- if(str_match(Data$Unit, regex(first, ignore_case = TRUE))) { print("First Shift") } else { print("Lame") }
recode is

dplyr::recode Why does pipe generate error?

扶醉桌前 submitted on 2019-12-18 19:08:48
Question: If I use recode in a pipe, I get an error:
    df <- df %>% recode(unit, .missing="g")
    Error in UseMethod("recode") : no applicable method for 'recode' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
If I pull it out of the pipe, it works fine:
    df$unit <- recode(df$unit, .missing="g")
Any ideas why? I'd like to stay in the pipe if possible.
Answer 1: The dplyr equivalent of the base R solution is to use recode inside mutate:
    df %>% mutate(unit = recode(unit, .missing="g"))
Directly

Pandas/Python: Replace multiple values in multiple columns

青春壹個敷衍的年華 submitted on 2019-12-13 02:34:51
Question: All, I have an analytical CSV file with 190 columns and 902 rows. I need to recode values in several columns (18, to be exact) from their current 1-5 Likert scaling to 0-4 Likert scaling. I've tried using replace:
    df.replace({'Job_Performance1': {1:0, 2:1, 3:2, 4:3, 5:4}}, inplace=True)
But that throws a ValueError: "Replacement not allowed with overlapping keys and values". I can use map:
    df['job_perf1'] = df.Job_Performance1.map({1:0, 2:1, 3:2, 4:3, 5:4})
But, I know there has to be a more
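
For the pandas question above, one way to apply the same map call to several columns at once might look like the sketch below. This is an illustration rather than the accepted answer: the toy DataFrame, the Job_Performance2 and Respondent_Age columns, and the likert_cols list are all assumptions made up for the example. Only Job_Performance1 and the 1-5 to 0-4 mapping come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the 190-column survey file; only the column
# name Job_Performance1 comes from the question, the rest are invented.
df = pd.DataFrame({
    "Job_Performance1": [1, 2, 3, 4, 5],
    "Job_Performance2": [5, 4, 3, 2, 1],
    "Respondent_Age": [31, 45, 28, 52, 39],  # not a Likert item, must stay untouched
})

# Assumed list of the columns to recode (18 of them in the question).
likert_cols = ["Job_Performance1", "Job_Performance2"]
recode_map = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}

# Series.map looks up each value in the dict independently, so keys that
# also appear as values cause no conflict. Values missing from the dict
# would become NaN.
df[likert_cols] = df[likert_cols].apply(lambda col: col.map(recode_map))

print(df)
```

Because this particular recode is a shift by one, `df[likert_cols] - 1` would give the same result here; the dict-based version is the more general pattern when the mapping is not arithmetic.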