na | 易学教程

Merging two data frames with different sizes and missing values

Question: I'm having a problem merging two data frames in R. The first one consists of 103731 obs. of 6 variables. The variable that I have to use to merge has 77111 unique values, and the rest are NAs with a value of 0. The second one contains the frequency of those values plus the frequency of the NAs, so a frame of 77112 obs. of 2 variables. The resulting frame I need to get is the first one joined with the frequency for the merging variable, so a df of 103731 obs with the frequency for each value
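A minimal sketch of one way to attach the frequencies while keeping every row of the first frame. The frames `df1` and `freq` and their columns `key`, `value`, and `n` are hypothetical stand-ins, not the asker's data. It relies on base R's `merge()` matching `NA` keys to each other by default, and uses `all.x = TRUE` so no row of the first frame is dropped.

```r
# Hypothetical stand-ins for the two frames described in the question
df1 <- data.frame(
  key   = c("a", "b", "a", NA, NA, "c"),
  value = 1:6,
  stringsAsFactors = FALSE
)
freq <- data.frame(
  key = c("a", "b", "c", NA),
  n   = c(2L, 1L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Left join: keep every row of df1; NA keys match the NA row in freq
merged <- merge(df1, freq, by = "key", all.x = TRUE)

# Alternative lookup that preserves the original row order of df1
df1$n <- freq$n[match(df1$key, freq$key)]
```

The `match()` variant avoids the row reordering that `merge()` performs, which may matter if the original order carries meaning.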

Unexpected NA after using as.POSIXct(strptime

Question: I have a variable date with timestamps: 27.03.2016 01:30, 27.03.2016 02:00, 27.03.2016 02:30. The format is character. After using as.POSIXct(strptime(data.frame$date, format = "%d.%m.%Y %R")), two of the 17568 observations are NA. The others are printed correctly. Why does it happen? Source: https://stackoverflow.com/questions/42504670/nexpected-na-after-using-as-posixctstrptime
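One plausible explanation, offered as an assumption since the question does not state the session time zone: in Central European time zones the clocks jumped from 02:00 to 03:00 on 27.03.2016, so 02:00 and 02:30 do not exist as local times there and may be parsed as NA. The sketch below parses the same strings with an explicit `tz = "UTC"`, which has no daylight saving gap. Whether UTC is the right zone for the data is also an assumption.

```r
stamps <- c("27.03.2016 01:30", "27.03.2016 02:00", "27.03.2016 02:30")

# Parse in UTC, which has no daylight saving transitions
parsed_utc <- as.POSIXct(
  strptime(stamps, format = "%d.%m.%Y %H:%M", tz = "UTC")
)

# Check whether any timestamp failed to parse
anyNA(parsed_utc)
```

If the timestamps really are local wall-clock times, the two missing values may instead point to a recording problem in the data rather than in the parsing.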

Conditional imputation with LOCF

Question: I have this example of longitudinal data. I need to impute 0, 999 or -1 values according to what occurs before.
ID = c(1,1,1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,5,6,6,6,6,6,6,6,6)
Oxy = c(0, 999, 1, 999, 999, 0, 0, 999, 999, 0, 0, -1, 0, 999, 1, 1, -1, 1, 999, -1, 0, -1, 1, 0, 999, 0)
Y = c(2010,2011,2012,2013,2014,2011,2012,2013,2010,2011,2012,2010,2011,2012,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020,2021)
Oxy2 = c(0, 999, 1, 1, 1, 0, 0, 999, 999, 0, 0, -1, 0, 999, 1, 1, 1, 1, 999, -1, 0

Selecting correct join with data.table

Question: A follow-up from this question. I have three data tables (the actual input one is way bigger and performance matters, so I have to use data.table as much as I can):
input <- fread("
ID | T1 | T2 | T3 | DATE
ACC001 | 1 | 0 | 0 | 31/12/2016
ACC001 | 1 | 0 | 1 | 30/06/2017
ACC002 | 0 | 1 | 1 | 31/12/2016", sep = "|")
mevs <- fread("
DATE | INDEX_NAME | INDEX_VALUE
31/12/2016 | GDP | 1.05
30/06/2017 | GDP | 1.06
31/12/2017 | GDP | 1.07
30/06/2018 | GDP | 1.08
31/12/2016 | CPI | 0.02
30/06/2017 |

R- date time variable loses format after ifelse

Question: I have a variable in the proper POSIXct format, converted with ymd_hms(DateTime) {lubridate}. However, after a transformation the variable loses its POSIXct format: daily$DateTime <- ifelse(daily$ID %in% "r1_1" | daily$ID %in% "r1_2", NA, daily$DateTime) I try to convert the variable again to POSIXct with lubridate, but it seems it doesn't like the NAs, and, in addition, the variable DateTime now has a num format that lubridate doesn't recognise as a date and time format (e.g. 1377419400). Please,
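A small base R sketch of why the class is lost and one way around it. The `daily` frame below is a hypothetical stand-in with made-up rows. `ifelse()` builds its result from the logical test vector, so the POSIXct class of the `no` argument is not carried over. Assigning `NA` through a logical index leaves the column's class untouched.

```r
# Hypothetical stand-in for the question's `daily` data frame
daily <- data.frame(
  ID = c("r1_1", "r1_2", "r2_1"),
  DateTime = as.POSIXct(
    c("2013-08-25 08:30:00", "2013-08-25 09:00:00", "2013-08-25 09:30:00"),
    tz = "UTC"
  )
)

drop_rows <- daily$ID %in% c("r1_1", "r1_2")

# ifelse() returns plain numbers (seconds since 1970-01-01), not POSIXct
stripped <- ifelse(drop_rows, NA, daily$DateTime)
class(stripped)

# Indexed assignment keeps the POSIXct class
daily$DateTime[drop_rows] <- NA
class(daily$DateTime)
```

If a numeric column has already been produced, `as.POSIXct(x, origin = "1970-01-01", tz = "UTC")` should convert it back, assuming the values are seconds since the epoch in UTC.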

how to use “NA” as string

Question: I have a CSV file in which one column is of character type. A few values of that variable are the string "NA". But when I read the CSV file in R using read.csv(), the "NA" strings are stored as NA. How can I fix it?
Answer 1: You can use the na.strings argument in read.csv: read.csv("myfile.csv", na.strings = "NNN")
Answer 2: Not sure what you are asking really, but a pseudo use could be something such as: if (null == column) { column = "NA"; }
Source: https://stackoverflow.com/questions/33126182/how-to-use-na
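To illustrate Answer 1: setting `na.strings` to a token that never occurs in the file stops `read.csv()` from treating the text "NA" as missing. The sketch below uses the `text` argument with made-up inline data (the columns `id` and `code` are hypothetical) instead of the asker's file.

```r
# Made-up CSV content with a literal "NA" string in the `code` column
csv_text <- "id,code\n1,NA\n2,US\n"

# Default behaviour: "NA" is read as a missing value
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)
is.na(default_read$code)

# With a placeholder token, "NA" survives as an ordinary string
kept <- read.csv(text = csv_text, na.strings = "NNN",
                 stringsAsFactors = FALSE)
kept$code
```

Any token works in place of "NNN" as long as it cannot appear as a real value in the data.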

Carry Last Observation Forward by ID in R

Question: I have daily observations with lots of missing values and am trying to propagate the first non-missing value through a vector for each individual. In the searching that I have done so far, I discovered the na.locf function in the zoo package; however, I now need to condition this function on the id variable in my data frame. Is ddply the right function for this? If so, can someone please help me figure out how to get the output to be included in a new variable called result in the same
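A dependency-free sketch of a grouped carry-forward, using base R's `ave()` rather than ddply. The helper `locf()` is a hypothetical stand-in written for this example, not the zoo implementation, and the frame `df` with columns `id` and `value` is made-up data. Leading NAs within a group are left as NA.

```r
# Hypothetical helper: carry the last observed value forward
locf <- function(x) {
  idx <- cumsum(!is.na(x))        # 0 until the first observed value
  c(NA, x[!is.na(x)])[idx + 1]    # position 1 holds NA for leading gaps
}

# Made-up data: two individuals with gaps
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  value = c(NA, 5, NA, 3, NA, NA)
)

# Apply the helper separately within each id
df$result <- ave(df$value, df$id, FUN = locf)
df
```

With zoo installed, passing `function(x) zoo::na.locf(x, na.rm = FALSE)` as `FUN` should behave the same way.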

Calculate column medians with NA's

Question: I am trying to calculate the median of individual columns in R and then subtract the median from every value in the column. The problem I face is that I have NAs in my columns that I don't want to remove; they should just be returned as they are, without subtracting the median. For example: ID <- c("A","B","C","D","E") Point_A <- c(1, NA, 3, NA, 5) Point_B <- c(NA, NA, 1, 3, 2) df <- data.frame(ID, Point_A, Point_B) Is it possible to calculate the median of a column having NAs? My resulting output would
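A short sketch using the question's own example data. `median()` accepts `na.rm = TRUE`, which ignores NAs when computing the median, and subtracting a number from NA yields NA, so the missing entries pass through unchanged. The name `num_cols` is introduced here only for illustration.

```r
ID <- c("A", "B", "C", "D", "E")
Point_A <- c(1, NA, 3, NA, 5)
Point_B <- c(NA, NA, 1, 3, 2)
df <- data.frame(ID, Point_A, Point_B)

# Columns to centre on their median (illustrative name)
num_cols <- c("Point_A", "Point_B")

# Subtract each column's median, computed without the NAs
df[num_cols] <- lapply(df[num_cols], function(x) x - median(x, na.rm = TRUE))
df
```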