I am new to R and having trouble figuring out how to go about this. I have data on tree growth rates from dead trees, organized by year. So, my first column is year and the columns to the right are growth rates for individual trees, ending in the year each tree died. After the tree died, the values are "NA" for the remaining years in the dataset. I need to take the mean growth for the 10 years preceding each tree's death, but each tree died in a different year. Does anyone have an idea for how to do this? Here is an example of what a dataset might look like:
Year Tree1 Tree2 Tree3
1989 53.00 84.58 102.52
1990 63.68 133.16 146.07
1991 90.37 103.10 233.58
1992 149.24 127.61 245.69
1993 96.20 54.78 417.96
1994 230.64 60.92 125.31
1995 150.81 60.98 100.43
1996 124.25 42.73 75.43
1997 173.42 67.20 50.34
1998 119.60 73.40 32.43
1999 179.97 61.24 NA
2000 114.88 67.43 NA
2001 82.23 55.23 NA
2002 49.40 NA NA
2003 93.46 NA NA
2004 104.67 NA NA
2005 44.14 NA NA
2006 88.40 NA NA
So, the averages I need to calculate are:
Tree1: mean(1997-2006) = 105.01
Tree2: mean(1992-2001) = 67.15
Tree3: mean(1989-1998) = 152.98
Since I need to do this for a large number of trees, it would be helpful to have a method of automating the calculation. Thank you very much for any help! Katie
You can use sapply and tail together with na.omit as follows:
sapply(mydf[-1], function(x) mean(tail(na.omit(x), 10)))
# Tree1 Tree2 Tree3
# 105.017 67.152 152.976
mydf[-1] says to drop the first column (Year). tail has an argument, n, that lets you specify how many values you want from the end (tail) of your data. Here, we've set it to 10 since you want the last 10 values. Then, assuming that there are no NA values in your actual data for the years while the trees were alive, you can safely use na.omit on your data.
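If your real data might have an occasional NA while a tree was still alive, na.omit would drop it and tail would then reach back more than 10 calendar years. Here is a minimal sketch of one alternative, assuming you want the 10 rows ending at the last recorded value, with any missing values inside that window ignored. The helper name mean_before_death is made up for illustration, and the data frame is just the example from the question rebuilt as mydf.

```r
# Rebuild the example data from the question (mydf is the name used above).
mydf <- data.frame(
  Year  = 1989:2006,
  Tree1 = c(53.00, 63.68, 90.37, 149.24, 96.20, 230.64, 150.81, 124.25, 173.42,
            119.60, 179.97, 114.88, 82.23, 49.40, 93.46, 104.67, 44.14, 88.40),
  Tree2 = c(84.58, 133.16, 103.10, 127.61, 54.78, 60.92, 60.98, 42.73, 67.20,
            73.40, 61.24, 67.43, 55.23, NA, NA, NA, NA, NA),
  Tree3 = c(102.52, 146.07, 233.58, 245.69, 417.96, 125.31, 100.43, 75.43,
            50.34, 32.43, rep(NA, 8))
)

# Hypothetical helper: mean over the n rows ending at the last non-NA value.
mean_before_death <- function(x, n = 10) {
  alive <- which(!is.na(x))             # row positions with a recorded value
  if (length(alive) == 0) return(NA_real_)
  last  <- max(alive)                   # last year with data (year of death)
  first <- max(1, last - n + 1)         # start of the window, clipped at row 1
  mean(x[first:last], na.rm = TRUE)     # ignore any NA inside the window
}

res <- sapply(mydf[-1], mean_before_death)
round(res, 3)
#   Tree1   Tree2   Tree3
# 105.017  67.152 152.976
```

On the example data this gives the same result as the one-liner above, because there are no missing values before each tree's death. Whether ignoring an interior NA is the right treatment depends on your analysis, so treat this as a starting point.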
Source: https://stackoverflow.com/questions/13956820/how-to-take-the-mean-of-last-10-values-in-a-column-before-a-missing-value-using