I have started using data.table for a large population model. So far, I have been impressed because using the data.table structure decreases my simulation run t
Struggling with column names is a strong indicator that the wide format is probably not the best choice for the given problem. Therefore, I suggest doing the computations in long format and reshaping the result from long to wide format at the end.
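library(data.table)

nYears = 10
params = data.table(Site = paste("Site", 1:3),
                    growthRate = c(1.1, 1.2, 1.3),
                    pop = c(10, 12, 13))

# expand to one row per Site and Year, then compute the cumulative growth
# per Site and scale the initial population by it
long <- params[CJ(Site = Site, Year = 0:nYears), on = "Site"][
  , growth := cumprod(shift(growthRate, fill = 1)), by = Site][
  , pop := pop * growth][]

# reshape from long to wide format, one column per year
dcast(long, Site + growthRate ~ sprintf("popYears%02i", Year), value.var = "pop")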
     Site growthRate popYears00 popYears01 popYears02 popYears03 popYears04 popYears05 popYears06 popYears07 popYears08 popYears09 popYears10
1: Site 1        1.1         10       11.0      12.10     13.310    14.6410   16.10510   17.71561   19.48717   21.43589   23.57948   25.93742
2: Site 2        1.2         12       14.4      17.28     20.736    24.8832   29.85984   35.83181   42.99817   51.59780   61.91736   74.30084
3: Site 3        1.3         13       16.9      21.97     28.561    37.1293   48.26809   62.74852   81.57307  106.04499  137.85849  179.21604
First, the parameters are expanded to cover 11 years (including year 0) using the cross join function CJ() and a subsequent right join on Site:
params[CJ(Site = Site, Year = 0:nYears), on = "Site"]
      Site growthRate pop Year
 1: Site 1        1.1  10    0
 2: Site 1        1.1  10    1
 3: Site 1        1.1  10    2
 4: Site 1        1.1  10    3
 5: Site 1        1.1  10    4
 6: Site 1        1.1  10    5
 7: Site 1        1.1  10    6
 8: Site 1        1.1  10    7
 9: Site 1        1.1  10    8
10: Site 1        1.1  10    9
11: Site 1        1.1  10   10
12: Site 2        1.2  12    0
13: Site 2        1.2  12    1
14: Site 2        1.2  12    2
15: Site 2        1.2  12    3
16: Site 2        1.2  12    4
17: Site 2        1.2  12    5
18: Site 2        1.2  12    6
19: Site 2        1.2  12    7
20: Site 2        1.2  12    8
21: Site 2        1.2  12    9
22: Site 2        1.2  12   10
23: Site 3        1.3  13    0
24: Site 3        1.3  13    1
25: Site 3        1.3  13    2
26: Site 3        1.3  13    3
27: Site 3        1.3  13    4
28: Site 3        1.3  13    5
29: Site 3        1.3  13    6
30: Site 3        1.3  13    7
31: Site 3        1.3  13    8
32: Site 3        1.3  13    9
33: Site 3        1.3  13   10
      Site growthRate pop Year
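To see what the cross join contributes on its own, here is a minimal sketch with a smaller grid. The two site names and the three years are illustrative values, not the full example from above.

```r
library(data.table)

# CJ() builds every combination of its arguments, sorted and keyed
grid <- CJ(Site = c("Site 1", "Site 2"), Year = 0:2)

nrow(grid)          # 2 sites x 3 years = 6 rows
grid[, .N, by = Site]   # 3 rows per site
```

Joining params to such a grid with on = "Site" then copies growthRate and pop onto every Site/Year row.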
Then the growth is computed from the shifted growth rates using the cumulative product function cumprod(), separately for each Site. The shift is required so that no growth is applied in the initial year of each Site: the first growth factor is filled with 1. Then the population is computed by multiplying the growth by the initial population.
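The effect of the shift is easiest to see on a single site. The following sketch uses plain vectors for Site 1 over years 0 to 3 only. The variable names shifted and growth are just for illustration.

```r
library(data.table)

growthRate <- rep(1.1, 4)                  # Site 1, years 0 to 3
shifted <- shift(growthRate, fill = 1)     # 1.0 1.1 1.1 1.1 (year 0 gets factor 1)
growth  <- cumprod(shifted)                # 1.000 1.100 1.210 1.331
pop     <- 10 * growth                     # 10.00 11.00 12.10 13.31
```

Without the shift, cumprod(growthRate) would already start at 1.1, so the year 0 population would come out as 11 instead of 10.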
Finally, the data.table is reshaped from long to wide format using dcast(). The column headers are created on the fly using sprintf() with zero-padded year numbers, which ensures that the columns appear in the correct order.
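The reason the padding matters is that the new column names are character strings and are ordered as such. A small sketch, using a handful of illustrative years, shows the difference between an unpadded format and the "%02i" format used above:

```r
years <- c(0L, 1L, 2L, 10L)

# without padding, year 10 sorts before year 2
unpadded <- sort(sprintf("popYears%i", years))

# with zero padding, alphabetical order matches numerical order
padded <- sort(sprintf("popYears%02i", years))

print(unpadded)
print(padded)
```

Note that two digits are only enough for up to 100 years (0 to 99). For longer simulations the width in the format string would have to be increased accordingly, e.g. "%03i".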