How to fill NAs with LOCF by factors in data frame, split by country

谎友^ 2020-12-08 05:01

I have the following (simplified) data frame, in which country is a factor and value contains missing values:

country value
AUT     NA
AUT            


        
8 Answers
  •  忘掉有多难
    2020-12-08 05:38

    A combination of the packages dplyr and imputeTS can do the job.

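    library(dplyr)
    library(imputeTS)

    # Fill NAs within each country with the last observed value.
    # Since imputeTS 3.0, na.locf()/na.remaining are named na_locf()/na_remaining.
    data %>%
      group_by(country) %>%
      mutate(value = na_locf(value, na_remaining = "keep"))

    The na_remaining parameter of imputeTS's na_locf function additionally lets you choose what to do with any NAs that are still left after the fill. With LOCF these are the leading NAs of each group, which have no earlier observation to carry forward.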

    These are the options:

    • "keep" - return the series with NAs
    • "rm" - remove remaining NAs
    • "mean" - replace remaining NAs by overall mean
    • "rev" - perform nocb / locf from the reverse direction

    For example, choosing "mean" would give 7 for every GER row in this specific example.
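
    To make the behaviour concrete without any packages, here is a minimal base-R sketch of group-wise LOCF. The toy data frame and the helper names locf and locf_rev are made up for illustration (the question's real table is truncated), and this is not how imputeTS implements it internally. locf leaves leading NAs in place, roughly like "keep", and locf_rev additionally fills them from the next observation, roughly like "rev".

```r
# Hypothetical toy data, invented for this sketch.
df <- data.frame(
  country = factor(c("AUT", "AUT", "AUT", "GER", "GER", "GER")),
  value   = c(NA, 5, NA, NA, 7, NA)
)

# Last observation carried forward.
# Leading NAs have nothing before them, so they stay NA.
locf <- function(x) {
  idx <- cumsum(!is.na(x))       # how many values have been observed so far
  c(NA, x[!is.na(x)])[idx + 1]   # idx 0 maps to the prepended NA
}

# LOCF first, then fill what is left by carrying values backward.
locf_rev <- function(x) {
  rev(locf(rev(locf(x))))
}

# ave() applies the function within each country and keeps the row order.
df$filled     <- ave(df$value, df$country, FUN = locf)
df$filled_rev <- ave(df$value, df$country, FUN = locf_rev)

print(df)
```

    In this toy data, filled keeps an NA in the first row of each country, while filled_rev has no NAs left.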
