na

R data.table multi column recode/sub-assign [duplicate]

大兔子大兔子 posted on 2019-12-01 20:02:38
This question already has an answer here: Fastest way to replace NAs in a large data.table (8 answers)

Let DT be a data.table:

    DT <- data.table(V1=sample(10), V2=sample(10), ... V9=sample(10))

Is there a better/simpler method to do a multi-column recode/sub-assign like this?

    DT[V1==1 | V1==7, V1:=NA]
    DT[V2==1 | V2==7, V2:=NA]
    DT[V3==1 | V3==7, V3:=NA]
    DT[V4==1 | V4==7, V4:=NA]
    DT[V5==1 | V5==7, V5:=NA]
    DT[V6==1 | V6==7, V6:=NA]
    DT[V7==1 | V7==7, V7:=NA]
    DT[V8==1 | V8==7, V8:=NA]
    DT[V9==1 | V9==7, V9:=NA]

Variable names are completely arbitrary and do not necessarily have numbers. Many columns (Vx:Vx) and one
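
One possible way to avoid repeating the statement per column is to loop over the column names and assign by reference with `set()`. This is a minimal sketch rather than the accepted answer: the three-column table and the `cols` vector are illustrative assumptions, and `cols` can be any character vector of column names.

```r
library(data.table)

set.seed(1)
# Illustrative table; the real DT has more (arbitrarily named) columns
DT <- data.table(V1 = sample(10), V2 = sample(10), V3 = sample(10))

cols <- names(DT)  # assumption: recode every column; subset as needed
for (j in cols) {
  # rows where the column equals 1 or 7 are overwritten with NA, by reference
  set(DT, i = which(DT[[j]] %in% c(1L, 7L)), j = j, value = NA_integer_)
}
```

Because `set()` modifies `DT` in place, no copy of the table is made per column.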

Replace NA with last non-NA in data.table by using only data.table

淺唱寂寞╮ posted on 2019-12-01 19:35:15
I want to replace NA values with the last non-NA values in a data.table, using only data.table. I have one solution, but it is considerably slower than na.locf:

    library(data.table)
    library(zoo)
    library(microbenchmark)

    f1 <- function(x) {
      x[, X := na.locf(X, na.rm = FALSE)]
      x
    }
    f2 <- function(x) {
      cond <- !is.na(x[, X])
      x[, X := .SD[, X][1L], by = cumsum(cond)]
      x
    }

    m1 <- data.table(X = rep(c(NA,NA,1,2,NA,NA,NA,6,7,8), 100))
    m2 <- data.table(X = rep(c(NA,NA,1,2,NA,NA,NA,6,7,8), 100))
    microbenchmark(f1(m1), f2(m2), times = 10)
    #Unit: milliseconds
    #   expr      min       lq median uq max neval
    # f1(m1) 2.648938 2.770792
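
A data.table-only alternative worth trying is `nafill()` with `type = "locf"`. This is a sketch under the assumption that the installed data.table is recent enough to ship `nafill()` (1.12.4 or later, to my knowledge); it is not one of the functions benchmarked in the question.

```r
library(data.table)

m <- data.table(X = rep(c(NA, NA, 1, 2, NA, NA, NA, 6, 7, 8), 100))

# Carry the last observed value forward; leading NAs have nothing to
# carry and therefore stay NA
m[, X := nafill(X, type = "locf")]
```

`nafill()` works on numeric columns, which matches the `X` column used here.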

Replace 0s with NA in tables

為{幸葍}努か posted on 2019-12-01 16:46:14
I generally work with data frames and could easily do this for a data frame, but on my current project I need to replace all zeros with NAs in a table structure. For the following two tables (one using table and the other using ftable), how could I replace all zero counts with NA?

    x <- with(mtcars, table(am, gear, cyl, vs))
    x2 <- with(mtcars, ftable(am, gear, cyl, vs))

This should work:

    x[x==0] <- NA

Source: https://stackoverflow.com/questions/9822897/replace-0s-with-na-in-tables
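
As a quick check of the logical-index assignment on the `table` object, the sketch below applies it and confirms that the non-missing counts still add up to the number of rows in `mtcars`. Only the `table` case is exercised here; the `ftable` case is left untested in this sketch.

```r
x <- with(mtcars, table(am, gear, cyl, vs))

# Logical indexing selects every zero cell of the 4-way table
x[x == 0] <- NA

# The remaining counts are unchanged: they still sum to nrow(mtcars)
total <- sum(x, na.rm = TRUE)
```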

Partially merge two datasets and fill in NAs in R

三世轮回 posted on 2019-12-01 15:29:08
I have two datasets.

a = raw dataset with thousands of observations of different weather events:

       STATE EVTYPE
    1  AL    WINTER STORM
    2  AL    TORNADO
    3  AL    TSTM WIND
    4  AL    TSTM WIND
    5  AL    TSTM WIND
    6  AL    HAIL
    7  AL    HIGH WIND
    8  AL    TSTM WIND
    9  AL    TSTM WIND
    10 AL    TSTM WIND

b = a dictionary table, which has a standard spelling for some weather events:

      EVTYPE             evmatch
    1 HIGH SURF ADVISORY <NA>
    2 COASTAL FLOOD      COASTAL FLOOD
    3 FLASH FLOOD        FLASH FLOOD
    4 LIGHTNING          LIGHTNING
    5 TSTM WIND          <NA>
    6 TSTM WIND (G45)    <NA>

Both are merged into df_new by EVTYPE:

    library(dplyr)
    df_new <- left_join(a, b, by = c("EVTYPE"))

    STATE
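
The excerpt is cut off before it states how the NAs should be filled. The sketch below assumes, purely for illustration, that a missing `evmatch` should fall back to the raw `EVTYPE`; the shortened `a` and `b` tables are stand-ins for the real data.

```r
library(dplyr)

# Shortened stand-ins for the two datasets in the question
a <- data.frame(STATE = "AL",
                EVTYPE = c("WINTER STORM", "TORNADO", "TSTM WIND", "HAIL"),
                stringsAsFactors = FALSE)
b <- data.frame(EVTYPE = c("COASTAL FLOOD", "LIGHTNING", "TSTM WIND"),
                evmatch = c("COASTAL FLOOD", "LIGHTNING", NA),
                stringsAsFactors = FALSE)

# Assumption: where the dictionary gives no standard spelling,
# keep the original EVTYPE instead of NA
df_new <- left_join(a, b, by = "EVTYPE") %>%
  mutate(evmatch = coalesce(evmatch, EVTYPE))
```

`coalesce()` returns the first non-missing value per row, so rows with a dictionary match keep it and all others inherit `EVTYPE`.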

R plotting a dataset with NA Values [duplicate]

泄露秘密 posted on 2019-12-01 11:45:48
This question already has an answer here: How to connect dots where there are missing values? (4 answers)

I'm trying to plot a dataset consisting of numbers and some NA entries in R.

    V1,V2,V3
     2, 4, 3
    NA, 5, 4
    NA,NA,NA
    NA, 7, 3
     6, 6, 9

This should produce the same lines in the plot as if I had entered:

    V1,V2,V3
    2, 4, 3
    3, 5, 4
    4, 6, 3.5
    5, 7, 3
    6, 6, 9

What I need R to do is basically plot the dataset as points and then connect these points by straight lines, which - due to the size of the dataset - would be much more efficient than actually calculating each interpolated value within the
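
One way to get connected lines without computing interpolated values is to drop the NA points per column before drawing, so `lines()` joins the remaining points directly. This is a base-graphics sketch; `plot_connected` is a hypothetical helper name, and the row index is assumed to be the x-axis.

```r
df <- data.frame(V1 = c(2, NA, NA, NA, 6),
                 V2 = c(4, 5, NA, 7, 6),
                 V3 = c(3, 4, NA, 3, 9))

# Hypothetical helper: one line per column, skipping NA rows so that
# neighbouring observed points are joined by a straight segment
plot_connected <- function(d) {
  x <- seq_len(nrow(d))
  plot(NULL, xlim = range(x), ylim = range(d, na.rm = TRUE),
       xlab = "row", ylab = "value")
  for (j in seq_along(d)) {
    ok <- !is.na(d[[j]])
    lines(x[ok], d[[j]][ok], type = "b", col = j)
  }
  invisible(NULL)
}
```

Calling `plot_connected(df)` draws the three series; the segments pass through the same values that linear interpolation would give (for example 3, 4, 5 between 2 and 6 in V1), but nothing is interpolated explicitly.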

dplyr::mutate (assign na.rm =TRUE)

落花浮王杯 posted on 2019-12-01 08:24:02
I have a data.frame that has 100 variables. I want to get the sum of three variables only, using mutate (not summarise). If there is an NA in any of the 3 variables, I still want to get the sum. To do this with mutate, I replaced all NA values with 0 using ifelse and then computed the sum.

    library(dplyr)
    df %>%
      mutate(mod_var1 = ifelse(is.na(var1), 0, var1),
             mod_var2 = ifelse(is.na(var2), 0, var2),
             mod_var3 = ifelse(is.na(var3), 0, var3),
             sum = (mod_var1+mod_var2+mod_var3))

Is there any better (shorter) way to do this?

DATA

    df <- read.table(text = c("
    var1 var2 var3
    4 5 NA
    2 NA 3
    1 2 4
    NA 3
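
A shorter option is to let `rowSums()` handle the missing values inside `mutate`, which removes the need for the intermediate `mod_var*` columns. The sketch uses only the three complete rows visible in the truncated data above; `out` is an illustrative name.

```r
library(dplyr)

# First three rows of the example data (the rest is cut off in the excerpt)
df <- data.frame(var1 = c(4, 2, 1),
                 var2 = c(5, NA, 2),
                 var3 = c(NA, 3, 4))

# cbind() builds a 3-column matrix per call; na.rm = TRUE skips the NAs
out <- df %>%
  mutate(sum = rowSums(cbind(var1, var2, var3), na.rm = TRUE))
```

Note that a row in which all three variables are NA would get a sum of 0 with this approach, which is the same result the `ifelse` version produces.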

Identify NA's in sequence row-wise

空扰寡人 posted on 2019-12-01 08:13:40
I want to fill NA values in a row-wise sequence, based on a condition. Please see the example below.

    ID | Observation 1 | Observation 2 | Observation 3 | Observation 4 | Observation 5
    A  | NA            | 0             | 1             | NA            | NA

The condition is: all NA values before the non-NA values in the sequence should be left as NA, but all NAs after the non-NA values in the sequence should be tagged ("Remove"). In the example above, the NA value in Observation 1 should remain NA, whereas the NA values in Observations 4 and 5 should be changed to "Remove".

You can define the function:

    replace.na <- function(r,val) {
      i <- is.na(r)
      j <- which(i)
      k
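
Since the answer's function is cut off, here is an independent sketch of the same idea rather than a completion of it. `tag_trailing_na` is a hypothetical name, and the sketch assumes that any NA occurring after the first observed value should be tagged, including NAs that sit between two observed values.

```r
# Hypothetical helper: tag every NA that comes after the first non-NA value
tag_trailing_na <- function(r, val = "Remove") {
  seen <- cumsum(!is.na(r)) > 0   # TRUE from the first non-NA onwards
  out <- as.character(r)          # the tag is text, so the row becomes character
  out[is.na(r) & seen] <- val
  out
}

row_a <- c(NA, 0, 1, NA, NA)      # the Observation 1..5 values for ID A
tagged <- tag_trailing_na(row_a)
```

For a data frame, the helper could be applied per row, for example with `t(apply(obs, 1, tag_trailing_na))` on the observation columns only.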

Quick replace of NA - an error or warning

时间秒杀一切 posted on 2019-12-01 06:50:53
I have a big data.frame called "mat" with 49952 obs. of 7597 variables, and I'm trying to replace NAs with zeros. Here is an example of what my data.frame looks like:

      A  B  C  E  F  D  Q  Z
    . . .
    1 1  1  0  NA NA 0  NA NA
    2 0  0  1  NA NA 0  NA NA
    3 0  0  0  NA NA 1  NA NA
    4 NA NA NA NA NA NA NA NA
    5 0  1  0  1  NA 0  NA NA
    6 1  1  1  0  NA 0  NA NA
    7 0  0  1  0  NA 1  NA NA
    . . .

I need a really fast tool to replace them. The result should look like:

      A B C E F D Q Z
    . . .
    1 1 1 0 0 0 0 0 0
    2 0 0 1 0 0 0 0 0
    3 0 0 0 0 0 1 0 0
    4 0 0 0 0 0 0 0 0
    5 0 1 0 1 0 0 0 0
    6 1 1 1 0 0 0 0 0
    7 0 0 1 0 0 1 0 0
    . . .

I already tried lapply(mat,
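
One approach that tends to scale reasonably is to replace the NAs one column at a time, which avoids building a full logical matrix the size of the data as `mat[is.na(mat)] <- 0` would. This is a base R sketch on a tiny stand-in for `mat`; it has not been timed on data of the size described in the question.

```r
# Tiny stand-in for the 49952 x 7597 data.frame
mat <- data.frame(A = c(1, 0, NA),
                  B = c(NA, 1, 0),
                  F = c(NA_real_, NA_real_, NA_real_))

# Work column by column so only one column-sized logical vector
# exists at any time
for (j in seq_along(mat)) {
  mat[[j]][is.na(mat[[j]])] <- 0
}
```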