na | 易学教程

R: filling up data gaps with NAs and applying cumsum function


Question: I was asked to break down my question asked here (R: Applying cumulative sum function and filling data gaps with NA for plotting) a little and post a smaller sample. Here it is, and my sample data can be found at https://dl.dropboxusercontent.com/u/16277659/inputdata.csv
NAME; ID; SURVEY_YEAR; REFERENCE_YEAR; VALUE
SAMPLE1; 253; 1883; 1883; 0
SAMPLE1; 253; 1884; 1883; NA
SAMPLE1; 253; 1885; 1884; 12
SAMPLE1; 253; 1890; 1889; 17
SAMPLE2; 261; 1991; 1991; 0
SAMPLE2; 261; 1992;
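
The excerpt is cut off before the desired output, so the following is only a minimal sketch of one plausible reading: insert a row with NA for every SURVEY_YEAR missing from the series, then compute a cumulative sum that skips NAs in the arithmetic but keeps them visible as gaps. The data frame holds the SAMPLE1 rows from the excerpt; the names `all_years`, `filled` and `CUMSUM` are made up for illustration.

```r
# SAMPLE1 rows copied from the excerpt
df <- data.frame(
  SURVEY_YEAR = c(1883, 1884, 1885, 1890),
  VALUE       = c(0, NA, 12, 17)
)

# Every year between the first and last survey; years absent from df get NA
all_years <- data.frame(SURVEY_YEAR = seq(min(df$SURVEY_YEAR), max(df$SURVEY_YEAR)))
filled <- merge(all_years, df, by = "SURVEY_YEAR", all.x = TRUE)

# Treat NA as 0 while summing, then restore NA so gaps stay gaps (assumed behaviour)
filled$CUMSUM <- cumsum(ifelse(is.na(filled$VALUE), 0, filled$VALUE))
filled$CUMSUM[is.na(filled$VALUE)] <- NA
```

For several samples, the same steps could be applied per NAME, for example with `split()` and `lapply()`.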

Does a function like dist/rdist exist which handles NAs?


Question: I'm using the rdist function from the fields package, but now I want to handle NAs in my matrix the way the dist function does. Does such a function exist? One solution would be to use dist directly, but my matrix has over 150K rows, so this is not an option. Edit: Note that removing rows or columns with complete.cases or na.omit is not the solution I'm looking for. The intended behaviour is described in the help for the dist function: Missing values are allowed, and are excluded from all computations
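
As a rough sketch of the behaviour quoted from the dist help page, an NA-aware Euclidean cross-distance can be written with plain matrix products. `na_rdist` is a hypothetical helper, not a function from fields; it skips columns where either row is missing and scales the sum up by the share of columns used, which is how ?dist describes the Euclidean case.

```r
# Hypothetical helper (not part of fields): NA-aware Euclidean cross-distance
na_rdist <- function(x1, x2 = x1) {
  m1 <- ifelse(is.na(x1), 0, 1)   # 1 where a value is observed
  m2 <- ifelse(is.na(x2), 0, 1)
  a <- x1; a[is.na(x1)] <- 0      # zero-fill so NAs drop out of the sums
  b <- x2; b[is.na(x2)] <- 0
  # Sum of squared differences over the columns observed in both rows
  d2 <- (a^2) %*% t(m2) - 2 * (a %*% t(b)) + m1 %*% t(b^2)
  n_used <- m1 %*% t(m2)          # number of shared columns per pair
  # Scale up proportionally; pairs with no shared column come out as NaN
  sqrt(pmax(d2, 0) * ncol(x1) / n_used)
}

x <- matrix(c(1, NA, 3, 4,  5, 6, 7, 8,  NA, 2, 1, 0), nrow = 4)
d <- na_rdist(x)
```

The result is still a full matrix with one row per row of `x1` and one column per row of `x2`, so for 150K rows it would have to be computed block by block.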

In R, can I make the table() function return the number of NA values in a named element?


Question: I am using R to summarize a large amount of data for a report. I want to use lapply() to generate a list of tables from the table() function, from which I can extract my desired statistics. There are a lot of these, so I've written a function to do it. My issue is that I am having difficulty returning the number of missing (NA) values, even though each table contains it, because I can't figure out how to tell R that I want the element from table() that holds the number of NA
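
One way this could be approached, shown here as a sketch with made-up data: the NA count in a table is stored under a name that is itself NA, so it can be located with `is.na(names(tab))`. Using `useNA = "always"` ensures that element exists even when nothing is missing. The list `vars` and the name `n_missing` are illustrative only.

```r
# Made-up input: one vector with NAs, one without
vars <- list(a = c("x", "y", NA, "x", NA), b = c(1, 2, 2, 3))

# useNA = "always" adds an NA entry to every table, even if its count is 0
tabs <- lapply(vars, table, useNA = "always")

# The NA entry is the one whose name is NA
n_missing <- sapply(tabs, function(tab) tab[[which(is.na(names(tab)))]])
```

If only the count is needed, `sapply(vars, function(v) sum(is.na(v)))` gives the same numbers without building the tables.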

How to find position of missing values in a vector


Question: What features does the R language have to find missing values in a dataframe, or at least to tell whether the dataframe has missing values?
Answer 1:
x = matrix(rep(c(NA, 1, NA), 3), ncol = 3, nrow = 3)
print(x)
     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]    1    1    1
[3,]   NA   NA   NA
Matrix of boolean values (is the value NA?):
is.na(x)
      [,1]  [,2]  [,3]
[1,]  TRUE  TRUE  TRUE
[2,] FALSE FALSE FALSE
[3,]  TRUE  TRUE  TRUE
Indices of NA values:
which(is.na(x), arr.ind = TRUE)
     row col
[1,]   1   1
[2,]   3   1
[3,]   1   2
[4,]   3   2
[5,]   1   3
[6,]   3   3
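
The answer above works on a matrix; for the plain vector named in the title and for a data frame, the same building blocks apply. A small sketch with made-up values (`v`, `df`, `na_pos`, `has_na` and `na_per_col` are illustrative names):

```r
v <- c(3, NA, 7, NA, 1)
na_pos <- which(is.na(v))   # positions of the missing values
has_na <- anyNA(v)          # TRUE if at least one value is missing

df <- data.frame(a = c(1, NA, 3), b = c("u", "v", NA))
na_per_col <- colSums(is.na(df))   # number of NAs in each column
```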

R sum consecutive duplicate rows and remove all but first


Question: I am stuck on a probably simple question: how to sum consecutive duplicate rows and remove all but the first row. If there is an NA between two duplicates (such as 2, NA, 2), those should also be summed, keeping only the first entry. Here is my sample data:
ia <- c(1, 1, 2, NA, 2, 1, 1, 1, 1, 2, 1, 2)
time <- c(4.5, 2.4, 3.6, 1.5, 1.2, 4.9, 6.4, 4.4, 4.7, 7.3, 2.3, 4.3)
a <- as.data.frame(cbind(ia, time))
Sample output:
a
   ia time
1   1  4.5
2   1  2.4
3   2  3.6
4  NA  1.5
5   2  1.2
6   1  4.9
7   1  6.4
8   1  4.4
9   1  4.7
10  2
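
The desired result is cut off in the excerpt, so this is only a sketch of one possible approach using base R's `rle()`. It assumes that a single NA is absorbed into a run only when the values directly before and after it are equal (the 2, NA, 2 case); the names `key`, `run_id` and `result` are illustrative.

```r
ia   <- c(1, 1, 2, NA, 2, 1, 1, 1, 1, 2, 1, 2)
time <- c(4.5, 2.4, 3.6, 1.5, 1.2, 4.9, 6.4, 4.4, 4.7, 7.3, 2.3, 4.3)

# Fill an NA only when both neighbours exist and are equal (assumption)
n    <- length(ia)
prev <- c(NA, ia[-n])
nxt  <- c(ia[-1], NA)
fill <- is.na(ia) & !is.na(prev) & !is.na(nxt) & prev == nxt
key  <- ia
key[fill] <- prev[fill]

# Label runs of consecutive equal keys, then sum time within each run
runs   <- rle(key)
run_id <- rep(seq_along(runs$lengths), runs$lengths)
result <- data.frame(ia = runs$values,
                     time = as.vector(tapply(time, run_id, sum)))
```

Each row of `result` stands for one run: the first `ia` value of the run and the total `time` across it.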

Why some functions do not ignore null values in R?


Question: I have a time series of daily returns. Observations for which no data were available have the value NaN. When I apply a function such as StdDev from the PerformanceAnalytics package, it performs the calculation correctly and returns the standard deviation of only the non-missing values. Functions such as mean, min, max ... instead return a wrong result, i.e. NaN. Is there something that needs to be specified in the mean function? Answer 1: From ?mean: na.rm a logical value
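
To illustrate the `na.rm` argument the answer points to, here is a short sketch with made-up returns. In base R, `is.na()` is TRUE for NaN as well as NA, so `na.rm = TRUE` drops both.

```r
returns <- c(0.01, NaN, -0.02, 0.03, NaN)   # made-up daily returns

default_mean <- mean(returns)                # NaN propagates by default
clean_mean   <- mean(returns, na.rm = TRUE)  # missing values dropped first
clean_min    <- min(returns, na.rm = TRUE)
clean_max    <- max(returns, na.rm = TRUE)
```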

Counting the NA's in a part of a row in data.table


Question: I have a dataset df whose structure looks similar to the example below:
nr  countrycode  questionA  questionB  questionC  WeightquestionA  WeightquestionB  WeightquestionC
1   NLD          2          1          4          0.6              0.2              0.2
2   NLD          NA         4          NA         0.4              0.4              0.2
3   NLD          4          4          1          0.2              0.2              0.6
4   BLG          1          NA         1          0.1              0.5              0.4
5   BLG          5          3          5          0.2              0.2              0.6
The questions A, B and C relate to a similar topic, and as a result I would like to create an average score for all questions, taking into account the importance of each question ( WeightquestionA
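
For the counting step named in the title, a minimal base R sketch: select only the question columns and count NAs row by row with `rowSums(is.na(...))`. The data frame repeats the question columns from the excerpt; `question_cols` and `n_na` are illustrative names.

```r
df <- data.frame(
  nr          = 1:5,
  countrycode = c("NLD", "NLD", "NLD", "BLG", "BLG"),
  questionA   = c(2, NA, 4, 1, 5),
  questionB   = c(1, 4, 4, NA, 3),
  questionC   = c(4, NA, 1, 1, 5)
)

# Count NAs per row, looking only at the question columns
question_cols <- c("questionA", "questionB", "questionC")
df$n_na <- rowSums(is.na(df[, question_cols]))
```

If df is a data.table, the same count can be expressed as `df[, n_na := rowSums(is.na(.SD)), .SDcols = question_cols]`.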

NA when trying to summarize a subset of data (R)


Question: The whole vector is fine and has no NAs:
> summary(data$marks)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    6.00    6.00    6.02    7.00    7.00
> length(data$marks)
[1] 2528
However, when I try to summarize a subset selected by a criterion, I get lots of NAs:
> summary(data[data$student=="John",]$marks)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  1.000   6.000   6.000   6.169   7.000   7.000     464
> length(data[data$student=="John",]$marks)
[1] 523
Answer 1: I think the problem is that you have missing values for student . As a
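
The effect the answer describes can be reproduced with a few made-up rows: comparing a column that contains NA with `==` yields NA, and indexing rows with NA returns all-NA rows. Wrapping the condition in `which()`, or using `%in%`, drops those rows. The names `with_na_rows`, `clean` and `clean2` are illustrative.

```r
data <- data.frame(
  student = c("John", NA, "Mary", "John", NA),
  marks   = c(6, 7, 5, 7, 6)
)

# NA in the logical index produces NA rows, hence NA marks
with_na_rows <- data[data$student == "John", ]$marks

# which() keeps only the positions where the comparison is TRUE
clean <- data[which(data$student == "John"), ]$marks

# %in% never returns NA, so it behaves the same way
clean2 <- data[data$student %in% "John", ]$marks
```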

Populate the NA values in a variable with values from a different variables in R


Question: I have data which looks like this:
Linking <- data.frame(
  ID = c(round((runif(20, min = 10000, max = 99999)), digits = 0), NA, NA, NA, NA),
  PSU = c(paste("A", round((runif(20, min = 10000, max = 99999)), digits = 0), sep = ''), NA, NA, NA, NA),
  qtr = c(rep(1:10, 2), NA, NA, NA, NA)
)
Linking$Key <- paste(Linking$ID, Linking$PSU, Linking$qtr, sep = "_")
Linking$Key[c(21:24)] <- c("87654_A15467_1", "45623_A23456_2", "67891_A12345_4", "65346_A23987_7")
What I want to do is populate the NA values
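
The question is truncated, so the goal is assumed here: recover the missing ID, PSU and qtr from the pieces of Key. Under that assumption, a sketch on a two-row stand-in for the data (fixed values instead of `runif()` so the result is reproducible; `parts` and `miss` are illustrative names):

```r
# Two-row stand-in: one complete row, one row where only Key is known
Linking <- data.frame(
  ID  = c(12345, NA),
  PSU = c("A54321", NA),
  qtr = c(3, NA),
  Key = c("12345_A54321_3", "87654_A15467_1"),
  stringsAsFactors = FALSE
)

# Split every Key into its three underscore-separated parts
parts <- do.call(rbind, strsplit(Linking$Key, "_", fixed = TRUE))

# Fill only the rows where ID is missing
miss <- is.na(Linking$ID)
Linking$ID[miss]  <- as.numeric(parts[miss, 1])
Linking$PSU[miss] <- parts[miss, 2]
Linking$qtr[miss] <- as.numeric(parts[miss, 3])
```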

R lapply convert NA's to 0


Question: I'm trying to convert the NAs in a subset of columns to 0s using the following code. Unfortunately it turns all the cells into 0s.
df1 <- data.frame(id = 1:20, col1 = runif(20), col2 = runif(20), col3 = runif(20))
df1[sample(1:20,5),'col1'] <- NA
df1[sample(1:20,5),'col2'] <- NA
df1[sample(1:20,5),'col3'] <- NA
subset1 <- c('col1','col2','col3')
df1[,subset1] <- as.data.frame(lapply(df1[,subset1], function(x) x[is.na(x)] <- 0))
Any suggestions? Answer 1: Try this simple approach df1[is.na(df1),] <-
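
The likely reason every cell becomes 0 is that the anonymous function ends with an assignment, and the value of `x[is.na(x)] <- 0` is the assigned 0 rather than the modified vector. A sketch of a variant that returns `x` explicitly, on a small made-up data frame:

```r
df1 <- data.frame(
  id   = 1:4,
  col1 = c(0.5, NA, 0.2, NA),
  col2 = c(NA, 0.1, 0.9, 0.3)
)
subset1 <- c("col1", "col2")

# Replace NAs inside the function, then return the whole column
df1[subset1] <- lapply(df1[subset1], function(x) {
  x[is.na(x)] <- 0
  x
})
```

Columns outside `subset1`, such as `id`, are left untouched.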