na

A simpler way to achieve a frequency count with mean, sum, length and sd in R

╄→尐↘猪︶ㄣ submitted on 2019-12-10 23:37:25
Question: I've been tasked with creating frequency tables with statistical summaries. My goal is to create a data frame that can be exported easily to Excel. Most of this could be done in SQL using stored procedures, but I decided to do it in R. I'm still learning R, so I might be doing it the long way. This is a follow-on question from getting-r-frequency-counts-for-all-possible-answers. Given:
Id <- c(1,2,3,4,5,6,7,8,9,10)
ClassA <- c(1,NA,3,1,1,2,1,4,5,3)
ClassB <- c(2,1,1,3,3,2,1,1,3,3)
R <- c(1,2,3,NA,9,2,4,5

Get length of runs of missing values in vector

妖精的绣舞 submitted on 2019-12-10 16:14:50
Question: What's a clever (i.e., not a loop) way to get the length of each spell of missing values in a vector? My ideal output is a vector of the same length, in which each missing value is replaced by the length of the spell of missing values it belongs to, and all other values are 0. So, for input like:
x <- c(2,6,1,2,NA,NA,NA,3,4,NA,NA)
I'd like output like:
y <- c(0,0,0,0,3,3,3,0,0,2,2)
Answer 1: One simple option using rle:
m <- rle(is.na(x))
> rep(ifelse(m$values,m$lengths,0),times =
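The answer excerpt above is cut off, so the following is only a sketch of how an rle-based approach can produce the requested output. The `times = m$lengths` argument is an assumption about the missing part, not the answerer's confirmed code.

```r
# Input from the question
x <- c(2, 6, 1, 2, NA, NA, NA, 3, 4, NA, NA)

# Run-length encode the missingness pattern:
# m$values says whether a run is NA, m$lengths how long it is
m <- rle(is.na(x))

# Keep the run length for NA runs, 0 otherwise, then expand each
# run back to its original length (assumed completion of the excerpt)
y <- rep(ifelse(m$values, m$lengths, 0), times = m$lengths)
y
```

Because `rep()` expands every run by its own length, `y` always has the same length as `x`.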

data.table 1.8.x mean() function auto removing NA?

限于喜欢 submitted on 2019-12-10 13:24:54
Question: Today I found a bug in my program caused by data.table automatically removing NA in mean(). For example:
> a<-data.table(a=c(NA,NA,FALSE,FALSE), b=c(1,1,2,2))
> a
> a[,list(mean(a), sum(a)),by=b]
   b V1 V2
1: 1  0 NA // Why V1 = 0 here? I had expected NA
2: 2  0  0
> mean(c(NA,NA,FALSE,FALSE))
[1] NA
> mean(c(NA,NA))
[1] NA
> mean(c(FALSE,FALSE))
[1] 0
Is this the intended behaviour?
Answer 1: This isn't intended. Looks like a problem with optimization ...
> a[,list(mean(a), sum(a)),by=b]
   b V1 V2
1: 1  0 NA
2: 2  0

Using foreach loop in r returning NA

喜你入骨 submitted on 2019-12-10 11:44:36
Question: I would like to use the "foreach" loop in R (packages foreach + doParallel), but in my work I found that the loop returns some NA values, while the classic "for" loop returns the values I want:
library(foreach)
library(doParallel)
ncore=as.numeric(Sys.getenv('NUMBER_OF_PROCESSORS'))-1
registerDoParallel(cores=ncore)
B=2
a = vector()
b = vector()
foreach(i = 1:B, .packages = "ez",.multicombine = T,.inorder = T, .combine = 'c')%dopar%{
  a[i] = i + 1
  return(a)
}
for(i in 1:B){
  b[i] = i + 1
  b
}
As you can see

R conditional replace more columns by lookup

≡放荡痞女 submitted on 2019-12-10 11:14:07
Question: Let's say we have many data columns (named in mycols, plus some others not listed there that should not be processed in this case) in a data frame df1, along with a column subj that is also an index into another data frame df2. The second data frame has columns repl and subj (subj is unique there) and many other unimportant columns (their only role is that we cannot assume there are just 2 columns). I would like to replace a subset of columns ( df1[,mycols] ) in such a way that if there is an

Unlist a column while retaining character(0) as empty strings in R

萝らか妹 submitted on 2019-12-10 10:54:11
Question: I am relatively new to R. I have a data frame with a column stored as a list. The column contains c("Benzo", "Ferri"), or character(0) if it's empty. How can I change these to simply Benzo, Ferri, and to an empty string for character(0)? I'm not able to do, for instance,
df$general_RN <- unlist(df$general_RN)
because of:
Error in $<-.data.frame(*tmp*, general_RN, value = c("Drug Combinations", : replacement has 1992 rows, data has 10479
I am assuming that all the character(0) have been removed but
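One way to see what is going on: `unlist()` drops zero-length elements, which is why the result is shorter than the data frame. A minimal sketch of an alternative is to collapse each list element to a single string, relying on `paste(character(0), collapse = ", ")` returning `""`. The three-row data frame here is hypothetical, made up only to illustrate the column shape described in the question.

```r
# Hypothetical data frame with a list column like the one described
df <- data.frame(id = 1:3)
df$general_RN <- list(c("Benzo", "Ferri"), character(0), "Drug Combinations")

# Collapse every element to exactly one string; character(0) becomes ""
df$general_RN <- vapply(df$general_RN, paste, character(1), collapse = ", ")

df$general_RN
```

Using `vapply()` with `character(1)` guarantees one string per row, so the replacement always has the same number of rows as the data.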

Pad each element in a list to specific length in R

怎甘沉沦 submitted on 2019-12-10 10:12:04
Question: Here is a simple R question which, I think, basically comes down to understanding list syntax correctly. I have a series of matrices loaded into a list (following some preliminary calculations), on which I then want to do some basic block averaging. My basic workflow will be as follows:
1) Rounding each vector contained within a list to an integer corresponding to the number of blocks I am interested in averaging to.
2) Padding each vector in a list to this new length.
3) Conversion of each
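The question is truncated, so the following is only a sketch of the padding step (step 2), under the assumption that each vector should be extended with NA up to the next multiple of a block size. The names `vecs`, `block` and `pad_to_block` are hypothetical and do not come from the question.

```r
# Hypothetical input: a list of vectors of different lengths
vecs <- list(a = 1:5, b = 1:8, c = 1:3)
block <- 4  # assumed block size

# Pad one vector with NA up to the next multiple of `block`
pad_to_block <- function(v, block) {
  target <- ceiling(length(v) / block) * block
  length(v) <- target  # assigning a longer length pads with NA
  v
}

padded <- lapply(vecs, pad_to_block, block = block)
lengths(padded)

# Block means for one padded vector, ignoring the NA padding
colMeans(matrix(padded$a, nrow = block), na.rm = TRUE)
```

`lapply()` applies the function to each list element and returns a list, which is usually the piece of list syntax that matters here.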

What does is.na() applied to non-(list or vector) of type 'NULL' mean?

孤街浪徒 submitted on 2019-12-10 02:44:07
Question: I want to select a Cox model with the forward procedure from a data.frame with no NA. Here is some sample data:
test <- data.frame(
  x_1 = runif(100,0,1),
  x_2 = runif(100,0,5),
  x_3 = runif(100,10,20),
  time = runif(100,50,200),
  event = c(rep(0,70),rep(1,30))
)
This table has no real meaning, but if we try to build a model anyway:
modeltest <- coxph(Surv(time, event) ~1, test)
modeltest.forward <- step( modeltest, data = test, direction = "forward", scope = list(lower = ~ 1, upper = ~ x_1 + x_2

Omit NA and data imputation before doing PCA analysis using R

陌路散爱 submitted on 2019-12-09 20:14:36
Question: I am trying to do PCA using the princomp function in R. The following is the example code:
mydf <- data.frame (
  A = c("NA", rnorm(10, 4, 5)),
  B = c("NA", rnorm(9, 4, 5), "NA"),
  C = c("NA", "NA", rnorm(8, 4, 5), "NA")
)
out <- princomp(mydf, cor = TRUE, na.action=na.exclude)
Error in cov.wt(z) : 'x' must contain finite values only
I tried to remove the NA values from the dataset, but it does not work.
ndnew <- mydf[complete.cases(mydf),]
  A B C
1 NA NA NA
2 1.67558617743171 1.28714736288378 NA
3
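A likely reason `complete.cases()` removes nothing in the example is that the quoted `"NA"` entries are strings, so every column is character (or factor) and contains no real missing values. The sketch below assumes that reading: it converts the strings to real `NA`, drops incomplete rows, and then runs `princomp`. It does not attempt imputation, and the names `num` and `complete` are introduced only for illustration.

```r
set.seed(1)  # only to make the sketch reproducible

# Same construction as in the question: "NA" is a string here
mydf <- data.frame(
  A = c("NA", rnorm(10, 4, 5)),
  B = c("NA", rnorm(9, 4, 5), "NA"),
  C = c("NA", "NA", rnorm(8, 4, 5), "NA"),
  stringsAsFactors = FALSE
)

# Turn the "NA" strings into real missing values, then make columns numeric
num <- as.data.frame(lapply(mydf, function(col) {
  col[col == "NA"] <- NA
  as.numeric(col)
}))

# Now complete.cases() can see the missing values
complete <- num[complete.cases(num), ]

out <- princomp(complete, cor = TRUE)
summary(out)
```

Dropping rows is the simplest choice; whether it is appropriate depends on how much data is lost, which the truncated question does not say.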

R: Updating NAs in a data table with values of another data table

北慕城南 submitted on 2019-12-09 19:37:03
Question: There are two data tables with the following structure:
DT1 <- data.table(ID=c("A","B","C"), P0=c(1,10,100), key="ID")
DT2 <- data.table(ID=c("B","B","B","A","A","A","C","C","C"), t=rep(seq(0:2),3), P=c(NA,30,50,NA,4,6,NA,200,700))
In data table DT2, all NAs in column P should be replaced with the P0 values from data table DT1. If DT2 is ordered by ID like DT1, the problem can be solved like this:
setorder(DT2,ID)
idxr <- which(DT2[["t"]]==1)
set(DT2, i=idxr, j="P", value=DT1[["P0"]])
But how can
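The rest of the question is cut off, but assuming it asks how to fill the NAs without first reordering DT2, one order-independent sketch is a lookup by ID. To keep it dependency-free, this version uses plain data frames and base R `match()` instead of data.table; that is a deliberate substitution, not the approach from the original thread.

```r
# Same data as in the question, as plain data frames
DT1 <- data.frame(ID = c("A", "B", "C"), P0 = c(1, 10, 100))
DT2 <- data.frame(
  ID = c("B", "B", "B", "A", "A", "A", "C", "C", "C"),
  t  = rep(seq(0:2), 3),
  P  = c(NA, 30, 50, NA, 4, 6, NA, 200, 700)
)

# Rows of DT2 whose P is missing
missing_p <- is.na(DT2$P)

# Look up P0 by ID for exactly those rows; row order does not matter
DT2$P[missing_p] <- DT1$P0[match(DT2$ID[missing_p], DT1$ID)]

DT2
```

Because the lookup is keyed on ID rather than on row position, it does not depend on DT2 being sorted the same way as DT1.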