na

R: filling up data gaps with NAs and applying cumsum function

Submitted by 馋奶兔 on 2019-12-14 03:14:43
Question: I was asked to break down my earlier question (R: Applying cumulative sum function and filling data gaps with NA for plotting) and post a smaller sample. Here it is; the sample data can be found at https://dl.dropboxusercontent.com/u/16277659/inputdata.csv
NAME; ID; SURVEY_YEAR; REFERENCE_YEAR; VALUE
SAMPLE1; 253; 1883; 1883; 0
SAMPLE1; 253; 1884; 1883; NA
SAMPLE1; 253; 1885; 1884; 12
SAMPLE1; 253; 1890; 1889; 17
SAMPLE2; 261; 1991; 1991; 0
SAMPLE2; 261; 1992;

Does a function like dist/rdist exist which handles NAs?

Submitted by 烂漫一生 on 2019-12-14 02:40:14
Question: I'm using the rdist function from the fields package, but now I want to handle NAs in my matrix the way the dist function does. Does such a function exist? One solution would be to use dist directly, but my matrix has over 150K rows, so this is not an option. Edit: Note that removing rows or columns with complete.cases or na.omit is not the solution I'm looking for. The intended behaviour is described in the help for the dist function: Missing values are allowed, and are excluded from all computations
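The behaviour quoted from the dist help page can be sketched in base R. This is only an illustration, not a function from fields or stats: na_euclid is a hypothetical helper that computes Euclidean distances between the rows of two matrices, skips coordinate pairs where either value is NA, and rescales the sum by the share of columns actually used, which is how dist treats missing values. It works on blocks of rows, so for a 150K-row matrix it would have to be called chunk by chunk.

```r
# Hypothetical helper (not part of fields or stats): Euclidean distance between
# every row of x1 and every row of x2, ignoring coordinates where either is NA.
na_euclid <- function(x1, x2 = x1) {
  x1 <- as.matrix(x1)
  x2 <- as.matrix(x2)
  p <- ncol(x1)
  out <- matrix(NA_real_, nrow(x1), nrow(x2))
  for (i in seq_len(nrow(x1))) {
    # Squared differences between row i of x1 and each row of x2
    # (one column per row of x2 after the transpose).
    d2 <- (t(x2) - x1[i, ])^2
    n_used <- colSums(!is.na(d2))
    # Scale the partial sum up by p / n_used, as dist does.
    # Pairs with no shared non-NA column give NaN here.
    out[i, ] <- sqrt(colSums(d2, na.rm = TRUE) * p / n_used)
  }
  out
}

m <- matrix(c(1, 2, NA,
              4, 5, 6,
              7, NA, 9), nrow = 3, byrow = TRUE)
d_sketch <- na_euclid(m)
d_base <- unname(as.matrix(dist(m)))
```

On this small matrix the sketch can be compared against dist directly with all.equal(d_sketch, d_base).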

In R, can I make the table() function return the number of NA values in a named element?

Submitted by 别来无恙 on 2019-12-14 00:53:54
Question: I am using R to summarize a large amount of data for a report. I want to be able to use lapply() to generate a list of tables from the table() function, from which I can extract my desired statistics. There are a lot of these, so I've written a function to do it. My issue is that I am having difficulty returning the number of missing (NA) values even though I have that in each table, because I can't figure out how to tell R that I want the element from table() that holds the number of NA
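One way to pull out the NA count, sketched under the assumption that the tables are built with useNA = "always": the NA entry of such a table has a missing name, so it can be selected with is.na(names(tab)) rather than by a literal name. The data frame survey and its columns are made up for illustration.

```r
# Made-up example data; q1 has one NA, q2 has two.
survey <- data.frame(q1 = c("a", NA, "b"),
                     q2 = c(NA, NA, "c"))

# useNA = "always" guarantees every table has an NA entry, even if its count is 0.
tabs <- lapply(survey, table, useNA = "always")

# The NA entry is the one whose name is itself NA.
na_counts <- sapply(tabs, function(tab) as.integer(tab[is.na(names(tab))]))
```

Here na_counts is a named integer vector with one NA count per column of survey.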

How to find position of missing values in a vector

Submitted by 笑着哭i on 2019-12-13 22:47:48
Question: What features does the R language have for finding missing values in a data frame, or at least for checking whether a data frame has any missing values?
Answer 1:
x = matrix(rep(c(NA, 1, NA), 3), ncol=3, nrow=3)
print(x)
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]    1    1    1
[3,]   NA   NA   NA
Matrix of boolean values (is the value NA?):
is.na(x)
      [,1]  [,2]  [,3]
[1,]  TRUE  TRUE  TRUE
[2,] FALSE FALSE FALSE
[3,]  TRUE  TRUE  TRUE
Indices of NA values:
which(is.na(x), arr.ind = TRUE)
     row col
[1,]   1   1
[2,]   3   1
[3,]   1   2
[4,]   3   2
[5,]   1   3
[6,]   3   3
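The same base R functions apply to a data frame, which is what the question asks about. A small sketch, with df as a made-up example:

```r
# Made-up data frame with one NA in each column.
df <- data.frame(a = c(1, NA, 3),
                 b = c("x", "y", NA))

has_na     <- anyNA(df)                        # does df contain any NA at all?
na_per_col <- colSums(is.na(df))               # number of NAs in each column
na_cells   <- which(is.na(df), arr.ind = TRUE) # row/column index of every NA
na_in_a    <- which(is.na(df$a))               # positions of NAs in one vector
```

is.na(df) returns a logical matrix, so the matrix-based calls from the answer carry over unchanged.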

R sum consecutive duplicate rows and remove all but first

Submitted by 落爺英雄遲暮 on 2019-12-13 15:33:10
Question: I am stuck on a probably simple question: how to sum consecutive duplicate rows and remove all but the first row. And, if there is an NA between two duplicates (such as 2, NA, 2), also sum them and remove all but the first entry. Here is my sample data:
ia <- c(1,1,2,NA,2,1,1,1,1,2,1,2)
time <- c(4.5,2.4,3.6,1.5,1.2,4.9,6.4,4.4,4.7,7.3,2.3,4.3)
a <- as.data.frame(cbind(ia, time))
Sample output:
a
   ia time
1   1  4.5
2   1  2.4
3   2  3.6
4  NA  1.5
5   2  1.2
6   1  4.9
7   1  6.4
8   1  4.4
9   1  4.7
10  2

Why some functions do not ignore null values in R?

Submitted by 走远了吗. on 2019-12-13 11:14:21
Question: I have a time series of daily returns. Observations for which no data were available have the value NaN. When I apply functions such as StdDev from the PerformanceAnalytics package, the calculation is performed correctly and returns the standard deviation of only the non-missing values. Functions such as mean, min, max instead return a wrong result, i.e. NaN. Is there something I need to specify in the mean function? Answer 1: From ?mean: na.rm a logical value
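The na.rm argument that the answer points to can be illustrated briefly. The returns vector is made up; na.rm = TRUE drops NaN as well as NA, because is.na(NaN) is TRUE.

```r
# Made-up daily returns with two missing observations coded as NaN.
returns <- c(0.01, NaN, -0.02, 0.03, NaN)

mean_default <- mean(returns)                # NaN: missing values propagate
mean_clean   <- mean(returns, na.rm = TRUE)  # mean of the three valid values
min_clean    <- min(returns, na.rm = TRUE)
max_clean    <- max(returns, na.rm = TRUE)
```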

Counting the NA's in a part of a row in data.table

Submitted by 五迷三道 on 2019-12-13 09:42:20
Question: I have a dataset df whose structure looks similar to the example below:
nr countrycode questionA questionB questionC WeightquestionA WeightquestionB WeightquestionC
1  NLD         2         1         4         0.6             0.2             0.2
2  NLD         NA        4         NA        0.4             0.4             0.2
3  NLD         4         4         1         0.2             0.2             0.6
4  BLG         1         NA        1         0.1             0.5             0.4
5  BLG         5         3         5         0.2             0.2             0.6
The questions A, B and C relate to a similar topic and as a result I would like to create an average score for all questions, taking into account the importance of each question ( WeightquestionA

NA when trying to summarize a subset of data (R)

Submitted by 拟墨画扇 on 2019-12-13 07:26:30
Question: The whole vector is fine and has no NAs:
> summary(data$marks)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    6.00    6.00    6.02    7.00    7.00
> length(data$marks)
[1] 2528
However, when I try to summarize a subset selected by a criterion, I get lots of NAs:
> summary(data[data$student=="John",]$marks)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  1.000   6.000   6.000   6.169   7.000   7.000     464
> length(data[data$student=="John",]$marks)
[1] 523
Answer 1: I think the problem is that you have missing values for student . As a
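If the answer's diagnosis holds (missing values in student), the effect can be reproduced on a small made-up data frame: a comparison with NA yields NA, and an NA in a logical row index returns an all-NA row. Wrapping the condition in which(), or using %in%, keeps only the rows that are definitely TRUE. This is a sketch on invented data, not the asker's dataset.

```r
# Made-up data: marks is complete, student has two NAs.
data <- data.frame(student = c("John", NA, "Mary", "John", NA),
                   marks   = c(6, 7, 5, 7, 6))

# NA in the logical index produces NA rows in the subset.
with_na <- data[data$student == "John", ]$marks

# which() drops NA (and FALSE) positions before indexing.
clean_which <- data[which(data$student == "John"), ]$marks

# %in% never returns NA, so it is safe as a row index.
clean_in <- data[data$student %in% "John", ]$marks
```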

Populate the NA values in a variable with values from a different variables in R

Submitted by 爱⌒轻易说出口 on 2019-12-13 05:07:22
Question: I have data which looks like this:
Linking <- data.frame(
  ID = c(round((runif(20, min = 10000, max = 99999)), digits = 0), NA, NA, NA, NA),
  PSU = c(paste("A", round((runif(20, min = 10000, max = 99999)), digits = 0), sep = ''), NA, NA, NA, NA),
  qtr = c(rep(1:10, 2), NA, NA, NA, NA)
)
Linking$Key <- paste(Linking$ID, Linking$PSU, Linking$qtr, sep = "_")
Linking$Key[c(21:24)] <- c("87654_A15467_1", "45623_A23456_2", "67891_A12345_4", "65346_A23987_7")
What I want to do is populate the NA values

R lapply convert NA's to 0

Submitted by 风格不统一 on 2019-12-13 04:53:42
Question: I'm trying to convert the NAs in a subset of columns to 0s using the following code. Unfortunately it turns all the cells into 0s.
df1 <- data.frame(id = 1:20, col1 = runif(20), col2 = runif(20), col3 = runif(20))
df1[sample(1:20,5),'col1'] <- NA
df1[sample(1:20,5),'col2'] <- NA
df1[sample(1:20,5),'col3'] <- NA
subset1 <- c('col1','col2','col3')
df1[,subset1] <- as.data.frame(lapply(df1[,subset1], function(x) x[is.na(x)] <- 0))
Any suggestions? Answer 1: Try this simple approach df1[is.na(df1),] <-
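The reason every cell becomes 0 is that the anonymous function returns the value of its last expression, and the value of the assignment x[is.na(x)] <- 0 is the right-hand side, 0, not the modified vector. A minimal corrected sketch of the question's own lapply approach returns x explicitly; set.seed is added here only to make the example reproducible.

```r
set.seed(1)  # only for reproducibility of this sketch
df1 <- data.frame(id = 1:20, col1 = runif(20), col2 = runif(20), col3 = runif(20))
df1[sample(1:20, 5), 'col1'] <- NA
df1[sample(1:20, 5), 'col2'] <- NA
df1[sample(1:20, 5), 'col3'] <- NA
before <- df1  # copy kept to check that non-NA cells are untouched

subset1 <- c('col1', 'col2', 'col3')
df1[subset1] <- lapply(df1[subset1], function(x) {
  x[is.na(x)] <- 0  # replace the NAs in this column
  x                 # return the whole column, not the assigned value
})
```

After this, each of the three columns contains exactly five zeros and the remaining values are unchanged.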