Phylogenetic model using multiple entries for each species

痞子三分冷 提交于 2019-12-11 07:27:41

问题


I am relatively new to phylogenetic regression models. In the past I used PGLS when I had only 1 entry for each species in my tree. Now I have a dataset with thousands of records for a total of 9 species and I would like to run a phylogenetic model. I read the tutorial of the most common packages (e.g. caper) but I am unsure how to build the model.

When I try to create the object for caper, i.e. using:

obj <- comparative.data(phy = Tree, data = Data, names.col = species, vcv = TRUE, na.omit = FALSE, warn.dropped = TRUE)

I get the message:

Error in row.names<-.data.frame(*tmp*, value = value) : duplicate 'row.names' are not allowed In addition: Warning message: non-unique values when setting 'row.names': ‘Species1’, ‘Species2’, ‘Species3’, ‘Species4’, ‘Species5’, ‘Species6’, ‘Species7’, ‘Species8’, ‘Species9’

I understood that I may solve this by applying a MCMCglmm model but I am unfamiliar with Bayesian models.

Thanks in advance for your help.


回答1:


This is indeed not going to work with a simple PGLS from caper because it cannot deal with individuals as a random effect. I suggest you use MCMCglmm that is not much more complex to understand and will allow you to have individuals as a random effect. You can find excellent documentation from the package's author here or here or an alternative documentation that's more dealing with some specific aspects of the package (namely tree uncertainty) here.

Really briefly to get you going:

## Your comparative data
comp_data <- comparative.data(phy = my_tree, data =my_data,
      names.col = species, vcv = TRUE)

Note that you can have a specimen column that can look like this:

   taxa        var1 var2 specimen
1     A  0.08730689    a    spec1
2     B  0.47092692    a    spec1
3     C -0.26302706    b    spec1
4     D  0.95807782    b    spec1
5     E  2.71590217    b    spec1
6     A -0.40752058    a    spec2
7     B -1.37192856    a    spec2
8     C  0.30634567    b    spec2
9     D -0.49828379    b    spec2
10    E  1.42722363    b    spec2

You can then set up your formula (similar to a simple lm formula):

## Your formula
my_formula <- variable1 ~ variable2

And your MCMC settings:

## Setting the prior list (see the MCMCglmm course notes for details)
prior <- list(R = list(V=1, nu=0.002),
              G = list(G1 = list(V=1, nu=0.002)))

## Setting the MCMC parameters
## Number of interations
nitt <- 12000

## Length of burnin
burnin <- 2000

## Amount of thinning
thin <- 5

And you should then be able to run a default MCMCglmm:

## Extracting the comparative data
mcmc_data <- comp_data$data

## As MCMCglmm requires a colume named animal for it to identify it as a phylo
## model we include an extra colume with the species names in it.
mcmc_data <- cbind(animal = rownames(mcmc_data), mcmc_data)
mcmc_tree <- comp_data$phy

## The MCMCglmmm
mod_mcmc <- MCMCglmm(fixed = my_formual, 
                     random = ~ animal + specimen, 
                     family = "gaussian",
                     pedigree = mcmc_tree, 
                     data = mcmc_data,
                     nitt = nitt,
                     burnin = burnin,
                     thin = thin,
                     prior = prior)


来源:https://stackoverflow.com/questions/51132259/phylogenetic-model-using-multiple-entries-for-each-species

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!