na

Replace NA with last non-NA in data.table by using only data.table

邮差的信 posted on 2019-12-04 03:54:33
Question: I want to replace NA values with the last non-NA value in a data.table, using only data.table. I have one solution, but it is considerably slower than na.locf:

    library(data.table)
    library(zoo)
    library(microbenchmark)
    f1 <- function(x) {
      x[, X := na.locf(X, na.rm = F)]
      x
    }
    f2 <- function(x) {
      cond <- !is.na(x[, X])
      x[, X := .SD[, X][1L], by = cumsum(cond)]
      x
    }
    m1 <- data.table(X = rep(c(NA,NA,1,2,NA,NA,NA,6,7,8), 100))
    m2 <- data.table(X = rep(c(NA,NA,1,2,NA,NA,NA,6,7,8), 100))
    microbenchmark(f1(m1),
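
One possible data.table-only approach to the last-observation-carried-forward question above is sketched here. It assumes a data.table version that provides `nafill()` (1.12.4 or later, as far as I know) and a numeric column. The helper `locf_idx` is a hypothetical name introduced for illustration; it is an index-based alternative, not the asker's code, and no timings are claimed for either variant.

```r
library(data.table)

# Hypothetical helper: carry the last non-NA value forward by index.
locf_idx <- function(x) {
  # Position of the most recent non-NA element seen so far (0 if none yet).
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  idx[idx == 0L] <- NA  # leading NAs have nothing to carry forward
  x[idx]
}

m <- data.table(X = rep(c(NA, NA, 1, 2, NA, NA, NA, 6, 7, 8), 100))

# Variant 1: index trick, no extra packages beyond data.table.
m[, X_idx := locf_idx(X)]

# Variant 2: data.table's built-in fill (assumes nafill() is available).
m[, X_nafill := nafill(X, type = "locf")]
```

Both variants leave the leading NAs untouched, because there is no earlier value to copy.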

NA in data.table

狂风中的少年 posted on 2019-12-04 03:23:25
I have a data.table that contains some groups. I operate on each group; some groups return numbers, others return NA. For some reason data.table has trouble putting everything back together. Is this a bug or am I misunderstanding? Here is an example:

    dtb <- data.table(a=1:10)
    f <- function(x) {if (x==9) {return(NA)} else { return(x)}}
    dtb[,f(a),by=a]

    Error in `[.data.table`(dtb, , f(a), by = a) : columns of j don't evaluate to consistent types for each group: result for group 9 has column 1 type 'logical' but expecting type 'integer'

My understanding was that NA is compatible with numbers
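
The error message above points at a type mismatch: a bare `NA` is of type logical, while the other groups return integers. A minimal sketch of one way around it, assuming the goal is an integer column, is to return the typed constant `NA_integer_`. Whether a given data.table version coerces a bare `NA` automatically is not something this sketch relies on.

```r
library(data.table)

typeof(NA)           # "logical"
typeof(NA_integer_)  # "integer"

dtb <- data.table(a = 1:10)

# Same function as in the question, but returning a typed NA so that
# every group yields an integer result.
f <- function(x) if (x == 9) NA_integer_ else x

res <- dtb[, f(a), by = a]
```

For double columns the analogous constant is `NA_real_`, and for character columns `NA_character_`.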

R error in glmnet: NA/NaN/Inf in foreign function call

穿精又带淫゛_ posted on 2019-12-04 03:09:05
Question: I am trying to create a model using glmnet (currently using cross-validation to find the lambda value), and I am getting the error NA/NaN/Inf in foreign function call (arg 5). I believe this has something to do with the NA values in my data set, because the command runs successfully when I remove all data points with NAs. I was under the impression that glmnet can handle NA values. I'm not sure where the error is coming from:

    > res <- cv.glmnet(features.mat, as.factor(tmp[,"outcome"]), family="binomial")
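
Consistent with the asker's observation that the call succeeds once rows with NAs are removed, one workaround is to drop incomplete rows before fitting. The sketch below uses only base R; the data are synthetic stand-ins for `features.mat` and `tmp[, "outcome"]`, the names `x_clean` and `y_clean` are hypothetical, and the glmnet call is left commented out so the block does not depend on that package being installed.

```r
set.seed(1)

# Synthetic stand-ins for the asker's objects (assumed shapes, not real data).
features.mat <- matrix(rnorm(200), nrow = 50, ncol = 4)
outcome <- sample(c("no", "yes"), 50, replace = TRUE)
features.mat[c(3, 17), 2] <- NA  # inject missing predictors
outcome[8] <- NA                 # inject a missing response

# Keep only rows where every predictor and the response are present.
keep <- complete.cases(features.mat) & !is.na(outcome)
x_clean <- features.mat[keep, , drop = FALSE]
y_clean <- as.factor(outcome[keep])

# With glmnet installed, the fit would then be attempted on the clean data:
# res <- glmnet::cv.glmnet(x_clean, y_clean, family = "binomial")
```

Dropping rows discards information; imputing the missing predictors first is the usual alternative when too many rows would be lost.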

Add NA value to ggplot legend for continuous data map

女生的网名这么多〃 posted on 2019-12-04 02:56:06
I'm using ggplot to map data values to a (fortified) SpatialPolygonsDataFrame, but many of the polygons have NA values because there is no data available. I used na.value = "white" to display the missing data correctly, but I'd like to add a box with a white fill in the legend (or a separate legend) with the label "no data".

    library(ggplot2)
    india.df <- read.csv('india.df.csv') # (I don't know how to provide this file to make the code reproducible)
    ggplot() +
      geom_polygon(data=india.df, aes(x = long, y = lat, group = group, fill=Area_pct)) +
      scale_fill_gradient(low="orange2", high="darkblue",

R plotting a dataset with NA Values [duplicate]

半腔热情 posted on 2019-12-04 02:01:21
Question: This question already has answers here: How to connect dots where there are missing values? (4 answers) Closed 6 years ago.

I'm trying to plot a dataset consisting of numbers and some NA entries in R.

    V1,V2,V3
    2, 4, 3
    NA, 5, 4
    NA,NA,NA
    NA, 7, 3
    6, 6, 9

This should produce the same lines in the plot as if I had entered:

    V1,V2,V3
    2, 4, 3
    3, 5, 4
    4, 6, 3.5
    5, 7, 3
    6, 6, 9

What I need R to do is basically plot the dataset as points, and then connect these points by straight lines, which - due to
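
The second table above is what linear interpolation across the gaps would give, so one way to get there is to interpolate each column over its row index before plotting. This is a sketch using base R's `approx()`; the helper name `fill_linear` is hypothetical, and it assumes each column has at least two non-NA values.

```r
df <- data.frame(
  V1 = c(2, NA, NA, NA, 6),
  V2 = c(4, 5, NA, 7, 6),
  V3 = c(3, 4, NA, 3, 9)
)

# Hypothetical helper: linearly interpolate NAs over the row index.
fill_linear <- function(v) {
  idx <- seq_along(v)
  ok <- !is.na(v)
  approx(idx[ok], v[ok], xout = idx)$y
}

filled <- as.data.frame(lapply(df, fill_linear))

# Points joined by straight lines, one series per column.
matplot(as.matrix(filled), type = "b", pch = 1, lty = 1,
        xlab = "row", ylab = "value")
```

If the interpolated points should not be drawn as markers, the lines can be drawn from `filled` and the points added afterwards from the original `df`.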

NA values not being excluded in `cor`

穿精又带淫゛_ posted on 2019-12-03 23:58:08
To simplify, I have a data set which is as follows:

    b <- 1:6
    # > b
    # [1] 1 2 3 4 5 6
    jnk <- c(2, 4, 5, NA, 7, 9)
    # > jnk
    # [1] 2 4 5 NA 7 9

When I try:

    cor(b, jnk, na.rm=TRUE)

I get:

    > cor(b, jnk, na.rm=T)
    Error in cor(b, jnk, na.rm = T) : unused argument (na.rm = T)

I've also tried na.action = na.exclude, etc. None seem to work. It'd be really helpful to know what the issue is and how I can fix it. Thanks.

Spacedman
TL;DR: Use instead:

    cor(b, jnk, use="complete.obs")

Read ?cor:

    cor(x, y = NULL, use = "everything", method = c("pearson", "kendall", "spearman"))

It doesn't have na.rm, it has
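
As a quick check of the `use = "complete.obs"` suggestion above, the sketch below compares it with dropping the incomplete pair by hand. The variable names `r_complete` and `r_manual` are introduced only for this illustration.

```r
b <- 1:6
jnk <- c(2, 4, 5, NA, 7, 9)

# Let cor() discard pairs that contain a missing value.
r_complete <- cor(b, jnk, use = "complete.obs")

# Equivalent manual filtering, for comparison.
ok <- !is.na(b) & !is.na(jnk)
r_manual <- cor(b[ok], jnk[ok])

isTRUE(all.equal(r_complete, r_manual))
```

With more than two variables in a matrix, `use = "pairwise.complete.obs"` is the other commonly used setting; it drops missing values per pair of columns instead of per row.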

NA's are being plotted in boxplot ggplot2

青春壹個敷衍的年華 posted on 2019-12-03 23:24:09
I'm trying to plot a very simple boxplot in ggplot2. I have species richness vs. land-use class. However, I have 2 NAs in my data. For some strange reason, they're being plotted, even though R recognizes them as NAs. Any suggestions for removing them? The code I'm using is:

    ggplot(data, aes(x=luse, y=rich))+
      geom_boxplot(mapping = NULL, data = NULL, stat = "boxplot", position = "dodge", outlier.colour = "red", outlier.shape = 16, outlier.size = 2, notch = F, notchwidth = 0.5)+
      scale_x_discrete("luse", drop=T)+
      geom_smooth(method="loess",aes(group=1))

However, the graph includes 2 NA's for
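
The excerpt is cut off before it says where the NAs appear, so the following is only a sketch under the assumption that the missing values sit in the grouping column `luse` and show up as an extra "NA" box. Filtering those rows out before plotting is one way to avoid that. The data frame here is synthetic, and the ggplot2 call is guarded so the block still runs where ggplot2 is not installed.

```r
# Synthetic stand-in for the asker's data (assumed columns: luse, rich).
data <- data.frame(
  luse = c("forest", "crop", NA, "forest", "crop", NA, "forest", "crop"),
  rich = c(12, 7, 9, 15, 5, 11, 13, 6)
)

# Drop rows whose land-use class is missing before handing data to the plot.
clean <- subset(data, !is.na(luse))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(clean, ggplot2::aes(x = luse, y = rich)) +
    ggplot2::geom_boxplot()
}
```

If the NAs are in `rich` instead, the same pattern applies with `!is.na(rich)` in the filter.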

R fill in NA with previous row value with condition

梦想与她 posted on 2019-12-03 16:26:23
I need to fill in NA rows with the previous row's value, but only as long as a criterion does not change. As a simple example, for days of the week, meals and prices:

    Day = c("Mon", "Tues", "Wed", "Thus", "Fri", "Sat","Sun","Mon", "Tues", "Wed", "Thus", "Fri", "Sat","Sun")
    Meal = c("B","B","B","B","B","D","D","D","D","L","L", "L","L","L")
    Price = c(NA, 20, NA,NA,NA,NA,NA,15,NA,NA,10,10,NA,10)
    df = data.frame(Meal,Day ,Price )
    df

       Meal  Day Price
    1     B  Mon    NA
    2     B Tues    20
    3     B  Wed    NA
    4     B Thus    NA
    5     B  Fri    NA
    6     D  Sat    NA
    7     D  Sun    NA
    8     D  Mon    15
    9     D Tues    NA
    10    L  Wed    NA
    11    L Thus    10
    12    L  Fri    10
    13    L  Sat    NA
    14    L  Sun    10

I
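
Assuming the criterion that must not change is `Meal` (the excerpt is truncated before the desired output is shown), one base-R sketch is to carry the last price forward separately within each meal using `ave()`. The helper `locf` and the column `Price_filled` are hypothetical names introduced for illustration.

```r
Day <- c("Mon", "Tues", "Wed", "Thus", "Fri", "Sat", "Sun",
         "Mon", "Tues", "Wed", "Thus", "Fri", "Sat", "Sun")
Meal <- c("B", "B", "B", "B", "B", "D", "D", "D", "D", "L", "L", "L", "L", "L")
Price <- c(NA, 20, NA, NA, NA, NA, NA, 15, NA, NA, 10, 10, NA, 10)
df <- data.frame(Meal, Day, Price)

# Hypothetical helper: last observation carried forward within one vector.
locf <- function(x) {
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  idx[idx == 0L] <- NA  # nothing to carry before the first non-NA value
  x[idx]
}

# Apply the fill per Meal, so a value never leaks into the next meal.
df$Price_filled <- ave(df$Price, df$Meal, FUN = locf)
```

Rows that precede the first known price of their meal stay NA, since there is no earlier value in the same group to copy.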

dplyr idiom for summarize() a filtered-group-by, and also replace any NAs due to missing rows

柔情痞子 posted on 2019-12-03 16:25:11
I am computing a dplyr::summarize across a dataframe of sales data. I do a group-by (S,D,Y), then within each group, compute medians and means for weeks 5..43, then merge those back into the parent df. Variable X is sales. X is never NA (i.e. there are no explicit NAs anywhere in df), but if there is no data (as in, no sales) for that S,D,Y and set of weeks, there will simply be no row with those values in df (take it that means zero sales for that particular set of parameters). In other words, impute X=0 in any structurally missing rows (but I hope I don't need to melt/cast the original df,
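
The "impute X=0 in structurally missing rows" step described above can be sketched without melting or casting by building the full grid of key combinations and left-joining the data onto it. This uses base R, not the dplyr idiom the asker is looking for, and the tiny data frame is a hypothetical stand-in with only the S, D, Y and X columns named in the question.

```r
# Hypothetical miniature of the sales data: no row exists for (s2, d2, 2018).
df <- data.frame(
  S = c("s1", "s1", "s2"),
  D = c("d1", "d2", "d1"),
  Y = c(2018, 2018, 2018),
  X = c(10, 4, 7),
  stringsAsFactors = FALSE
)

# Every combination of the grouping keys that could have had sales.
grid <- expand.grid(
  S = unique(df$S), D = unique(df$D), Y = unique(df$Y),
  stringsAsFactors = FALSE
)

# Left join: combinations with no sales row come back with X = NA.
full <- merge(grid, df, by = c("S", "D", "Y"), all.x = TRUE)

# Treat structurally missing rows as zero sales.
full$X[is.na(full$X)] <- 0
```

Grouped medians and means computed on `full` would then include the zero-sales combinations; a week column would need to be added to the grid in the same way.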

R can't convert NaN to NA

牧云@^-^@ posted on 2019-12-03 14:39:53
I have a data frame with several factor columns containing NaNs that I would like to convert to NAs (the NaN seems to be a problem when using linear regression objects to predict on new data).

    > tester1 <- c("2", "2", "3", "4", "2", "3", NaN)
    > tester1
    [1] "2" "2" "3" "4" "2" "3" "NaN"
    > tester1[is.nan(tester1)] = NA
    > tester1
    [1] "2" "2" "3" "4" "2" "3" "NaN"
    > tester1[is.nan(tester1)] = "NA"
    > tester1
    [1] "2" "2" "3" "4" "2" "3" "NaN"

Here's the problem: Your vector is character in mode, so of course it's "not a number". That last element got interpreted as the string "NaN". Using is.nan
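
Since the last element is the string "NaN" and not a numeric NaN, one straightforward fix is to compare against that string. The sketch below covers the character vector from the question only; the variable `as_num` is a hypothetical name used to show the follow-on numeric conversion.

```r
tester1 <- c("2", "2", "3", "4", "2", "3", NaN)

# The vector is character, so match the literal string "NaN".
tester1[tester1 == "NaN"] <- NA

# Optional: convert to numeric once the placeholder is gone.
as_num <- as.numeric(tester1)
```

After the replacement, `is.na(tester1)` is TRUE only for the last element.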