Applying a function for multiple groups using dplyr

匆匆过客 提交于 2019-12-11 00:57:25

问题


I have some data for multiple location and year

big.data <- data.frame(loc.id = rep(1:3, each = 10*3), 
                   year = rep(rep(1981:1983, each = 10),times = 3), 
                   day = rep(1:10, times = 3*3),
                   CN = rep(c(50,55,58), each = 10*3),
                   top.FC = rep(c(72,76,80),each = 10*3),
                   DC = rep(c(0.02,0.5,0.8), each = 10*3),
                   WAT0 = rep(c(20,22,26), each = 10*3),
                   Precp = sample(1:100,90, replace = T),
                   ETo = sample(1:10,90, replace = T)) 

I have a function: water.model which uses a second function internally called water.update

water.model <- function(dat){

     top.FC  <- unique(dat$top.FC)    

     dat$WAT <- -9.9
     dat$RO <- -9.9
     dat$DR <- -9.9

     dat$WAT[1] <- top.FC/2 # WAT.i is a constant 
     dat$RO[1] <- NA 
     dat$DR[1] <- NA

     for(d in 1:(nrow(dat)-1)){

       dat[d + 1,10:12] <- water.update(WAT0 = dat$WAT[d], 
                                        RAIN.i = dat$Precp[d + 1], 
                                        ETo.i = dat$ETo[d + 1], 
                                        CN = unique(dat$CN), 
                                        DC = unique(dat$DC),
                                        top.FC = unique(dat$top.FC))
     }
     return(dat)
   }



water.update <- function(WAT0, RAIN.i, ETo.i, CN, DC, top.FC){ 

        S = 25400/CN - 254;  IA = 0.2*S

        if (RAIN.i > IA) { RO = (RAIN.i - 0.2 * S)^2/(RAIN.i + 0.8 * S)
          } else { 
            RO = 0 
            }

          if (WAT0 + RAIN.i - RO > top.FC) { 
              DR = DC * (WAT0 + RAIN.i - RO - top.FC) 
              } else { 
              DR = 0 
            }    
        dWAT = RAIN.i - RO - DR - ETo.i
        WAT1 = WAT0 + dWAT
        WAT1 <- ifelse(WAT1 < 0, 0, WAT1) 
        return(list(WAT1,RO,DR))
    } 

If I run the above function for a single location X year

big.data.sub <- big.data[big.data$loc.id == 1 & big.data$year == 1981,]
water.model(big.data.sub)

   loc.id year day CN top.FC   DC WAT0 Precp ETo      WAT        RO       DR
   1       1 1981   1 50     72 0.02   20    52   5  36.0000        NA       NA
   2       1 1981   2 50     72 0.02   20    12   9  39.0000 0.0000000 0.000000
   3       1 1981   3 50     72 0.02   20     3   2  40.0000 0.0000000 0.000000
   4       1 1981   4 50     72 0.02   20    81   9 107.8750 3.2091485 0.915817
   5       1 1981   5 50     72 0.02   20    37  10 133.4175 0.0000000 1.457501
   6       1 1981   6 50     72 0.02   20    61   7 184.5833 0.3937926 2.440475
   7       1 1981   7 50     72 0.02   20    14  10 186.0516 0.0000000 2.531665
   8       1 1981   8 50     72 0.02   20     9   6 186.5906 0.0000000 2.461032
   9       1 1981   9 50     72 0.02   20    77   9 248.3579 2.4498216 3.782815
   10      1 1981  10 50     72 0.02   20    18   6 256.4708 0.0000000 3.887159

How do I run this for all location and year?

big.data %>% group_by(loc.id, year) %>% # apply my function here.

My final data should look like the above with three new columns called WAT, RO and DR which are generated when the function is run.


回答1:


We can split the data and apply the water.model by looping over the list with map

library(tidyverse)
split(big.data, big.data[c('loc.id', 'year')], drop = TRUE) %>% 
           map_df(water.model)

Or apply the function within do after group_by

big.data %>%
   group_by(loc.id, year) %>%
   do(data.frame(water.model(.)))


来源:https://stackoverflow.com/questions/50393990/applying-a-function-for-multiple-groups-using-dplyr

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!