Creating a function to replace NAs from one data.frame with values from another


鱼传尺愫 2020-12-30 02:56

I regularly have situations where I need to replace missing values from a data.frame with values from some other data.frame that is at a different level of aggregation. So,

3 Answers

天命终不由人 (original poster)

2020-12-30 03:57

Here's a slightly more concise/robust version of your approach. You could replace the for-loop with a call to lapply, but I find the loop easier to read.

This function assumes any columns not in mergeCols are fair game to have their NAs filled. I'm not really sure this helps, but I'll take my chances with the voters.

fillNaDf.ju <- function(naDf, fillDf, mergeCols) {
  # note: merge() defaults to an inner join, so rows of naDf without a
  # match in fillDf are dropped, and the result is sorted by mergeCols
  mergedDf <- merge(fillDf, naDf, by = mergeCols, suffixes = c(".fill", ""))
  dataCols <- setdiff(names(naDf), mergeCols)
  # loop over all columns we didn't merge by
  for (col in dataCols) {
    rows <- is.na(mergedDf[, col])
    # skip this column if it doesn't contain any NAs
    if (!any(rows)) next
    rows <- which(rows)
    # replace NAs with values from the matching ".fill" column of fillDf
    mergedDf[rows, col] <- mergedDf[rows, paste(col, "fill", sep = ".")]
  }
  # don't return ".fill" columns
  mergedDf[, names(naDf)]
}
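
For comparison, here is a rough sketch of the lapply variant mentioned above, followed by a small usage example. The name fillNaDfLapply and the example data (store-level sales filled from region-level values) are made up for illustration and are not from the original question. It is meant to behave like the loop version, including the default inner join in merge.

```r
# Hypothetical lapply-based variant of the loop version (name is made up).
fillNaDfLapply <- function(naDf, fillDf, mergeCols) {
  # same merge as the loop version: inner join, fillDf columns get ".fill"
  mergedDf <- merge(fillDf, naDf, by = mergeCols, suffixes = c(".fill", ""))
  dataCols <- setdiff(names(naDf), mergeCols)
  # build the filled version of every non-merge column in one pass
  mergedDf[dataCols] <- lapply(dataCols, function(col) {
    values <- mergedDf[[col]]
    rows <- is.na(values)
    # leave the column untouched if it doesn't contain any NAs
    if (!any(rows)) return(values)
    # replace NAs with values from the matching ".fill" column
    values[rows] <- mergedDf[[paste(col, "fill", sep = ".")]][rows]
    values
  })
  # don't return ".fill" columns
  mergedDf[, names(naDf)]
}

# Made-up example data: store-level sales with gaps, region-level fallbacks
naDf <- data.frame(region = c("east", "east", "west"),
                   store  = c("a", "b", "c"),
                   sales  = c(10, NA, NA))
fillDf <- data.frame(region = c("east", "west"),
                     sales  = c(12, 20))

filled <- fillNaDfLapply(naDf, fillDf, mergeCols = "region")
print(filled)
```

Store a keeps its own value, while stores b and c pick up the value for their region. Like the loop version, this assumes that any column containing NAs also exists in fillDf.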
