na | 易学教程

remove columns with NAs from all dataframes in list

Question: I have a list made up of several data frames. I would like to remove all of the columns with NAs in each data frame. Note that the columns to be removed are not the same in each data frame. Sample data is provided below. Any suggestions much appreciated.

    WW1_Data <- structure(list(Alnön = structure(list(Site_Name = structure(1L, .Label = c("Alnön","Ammarnäs", "Anjan", "Bäcksand", "Fittjebodarna", "Flatruet", "Glen", "Idre", "Klångstavallen", "Kramfors", "Ljungdalen", "Ljungris", "Mårdsund", "Mörtsjön

Replacing Missing Value in R

Question: I have to replace each missing value with the maximum Value for its ID. How can I do this in R?

    ID  Value
    1   NA
    5   15
    8   16
    6   8
    7   65
    8   NA
    5   25
    1   62
    6   14
    7   NA
    9   11
    8   12
    9   36
    1   26
    4   13

Answer 1: I would first precompute the max values using a call to aggregate(), and also precompute which rows of the data.frame have an NA value. Then you can match the IDs into the aggregation table to extract the corresponding max value.

    maxes <- aggregate(Value~ID,df,max,na.rm=T);
    nas <- which(is.na(df$Value));
    df$Value[nas] <- maxes
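The three steps the answer describes (precompute the per-ID maxima, find the rows with missing values, then look up each missing row's ID in the table of maxima) can be sketched in plain Python. This is a translation of the idea under stated assumptions, not the answerer's R code: None stands in for R's NA, and the names ids, values, and fill_missing_with_group_max are made up for illustration.

```python
# Sample data from the question; None stands in for R's NA (an assumption of this sketch).
ids = [1, 5, 8, 6, 7, 8, 5, 1, 6, 7, 9, 8, 9, 1, 4]
values = [None, 15, 16, 8, 65, None, 25, 62, 14, None, 11, 12, 36, 26, 13]


def fill_missing_with_group_max(ids, values):
    """Return a copy of values where each None is replaced by the max for its ID."""
    # Step 1: precompute the maximum non-missing value per ID.
    maxes = {}
    for key, value in zip(ids, values):
        if value is not None:
            maxes[key] = max(value, maxes.get(key, value))

    # Step 2: find the positions of the missing values.
    missing = [i for i, value in enumerate(values) if value is None]

    # Step 3: look up each missing position's ID in the precomputed table.
    filled = list(values)
    for i in missing:
        # Stays None if the ID has no observed value at all.
        filled[i] = maxes.get(ids[i])
    return filled


filled = fill_missing_with_group_max(ids, values)
print(filled)
```

The function returns a new list and leaves the input untouched, so the original values can still be compared against the filled ones. An ID with no observed values at all keeps its missing entry, since there is no maximum to fall back on.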

specifying “skip NA” when calculating mean of the column in a data frame created by Pandas

Question: I am learning the Pandas package by replicating the output from some of the R vignettes. Now I am using the dplyr package from R as an example: http://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html

R script

    planes <- group_by(hflights_df, TailNum)
    delay <- summarise(planes, count = n(), dist = mean(Distance, na.rm = TRUE))
    delay <- filter(delay, count > 20, dist < 2000)

Python script

    planes = hflights.groupby('TailNum')
    planes['Distance'].agg({'count' : 'count', 'dist' : 'mean'})

scale_fill_manual define color for NA values

Question: I am trying to make a bar plot with ggplot2 and am facing some issues with defining the color for NA.

    ggh <- ggplot(data=dat, aes(x=var1, fill=var2))+
      geom_bar(position="dodge")+
      scale_fill_manual(
        values=c("s"="steelblue", "i"="darkgoldenrod2", "r"="firebrick4", na.value="black"))

In my var2 I have the values c("s", "i", "r", NA). For some reason my code above inside scale_fill_manual does not work for NA, even though it works fine for all the other values. Can someone help me figure out why? Thanks

How to remove NA values in vector in R [duplicate]

Question: This question already has answers here: Remove NA values from a vector (6 answers). Closed 5 years ago.

I have a vector which stores over 1000 values. The first 50 values are NAs; how can I get rid of them?

    c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1.5741, 1.583, 1.605, 1.633, 1.6465, 1.6475, 1.6329, 1.6413, 1.685, 1.692, 1.7087, 1.7055

Getting boolean pandas column that supports NA/ is nullable

Question: How can I create a pandas dataframe column with dtype bool (or int for that matter) with support for NaN/missing values? When I try like this:

    d = {'one' : np.ma.MaskedArray([True, False, True, True], mask = [0,0,1,0]),
         'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
    df = pd.DataFrame(d)
    print (df.dtypes)
    print (df)

column one is implicitly converted to object. Likewise for ints:

    d = {'one' : np.ma.MaskedArray([1,3,2,1], mask = [0,0,1,0]), 'two' : pd.Series([1., 2.,

dplyr join define NA values

Question: Can I define a "fill" value for NA in a dplyr join? For example, can the join define that all NA values should be 1?

    require(dplyr)
    lookup <- data.frame(cbind(c("USD","MYR"),c(0.9,1.1)))
    names(lookup) <- c("rate","value")
    fx <- data.frame(c("USD","MYR","USD","MYR","XXX","YYY"))
    names(fx)[1] <- "rate"
    left_join(x=fx,y=lookup,by=c("rate"))

The above code will create NA for the values "XXX" and "YYY". In my case I am joining a large number of columns and there will be a lot of non-matches. All non-matches

What's the best way to replace missing values with NA when reading in a .csv?

Question: I have a .csv dataset with many missing values, and I'd like R to recognize them all the same way (the "correct" way) when I read the table in. I've been using:

    import = read.csv("/Users/dataset.csv", header =T, na.strings=c(""))

This script fills all the empty cells with something, but it's not consistent. When I look at the data with head(import), some missing cells are filled with <NA> and some missing cells are filled with NA. I fear that R treats these two ways of identifying missing