Appendix B. Detail of the construction of the matrices of phylogenetic distances used for the species in the two experiments.
Phylogeny construction of the species:
In February–March 2009, we searched GenBank (Benson et al. 2005) for sequences of four genes commonly used in published angiosperm phylogenies for each species: matK, rbcL, ITS1, and 5.8S. Of the 21 species in the two experiments (including the two species omitted from the Biodepth data), 18 had at least one gene represented in GenBank; for the remaining 3 species, we used gene sequences from a congeneric relative not included in these experiments (Table B1). We also included two representatives of early-diverging angiosperm lineages as outgroup species, Amborella trichopoda and Magnolia grandiflora. For these 23 species we aligned sequences using MUSCLE (Edgar 2004). We then selected the best-fit maximum likelihood model of nucleotide substitution for each gene using the Akaike Information Criterion, as implemented in Modeltest (Posada and Crandall 1998, 2001).
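As an illustration of the model-selection step, the following minimal sketch ranks candidate substitution models by the Akaike Information Criterion, AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood. It is not the Modeltest implementation; the model names, log-likelihoods, and parameter counts are hypothetical values chosen only to show the calculation.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_fit_model(candidates):
    """Return (name, AIC) for the candidate model with the lowest AIC.

    `candidates` maps a model name to a tuple of
    (maximized log-likelihood, number of free parameters).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical log-likelihoods and parameter counts for one gene alignment;
# these numbers are illustrative and do not come from the study.
example = {
    "JC69": (-5230.4, 0),
    "HKY85": (-5101.7, 4),
    "GTR+G": (-5060.2, 9),
}

best, score = best_fit_model(example)
print(best, round(score, 1))
```

With these illustrative inputs, the most parameter-rich model is retained because its gain in likelihood outweighs the penalty of 2 per additional parameter.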
Using the aligned sequences and the best-fit models of nucleotide substitution, we estimated a maximum likelihood phylogeny using the PHYML algorithm with a BIONJ starting tree (Guindon and Gascuel 2003, Anisimova and Gascuel 2006). To assess nodal support on the maximum likelihood phylogenies, we report Approximate Likelihood Ratio Test (aLRT) scores, which have been shown to correlate with ML bootstrap scores but are computationally more efficient (Guindon and Gascuel 2003). Maximum likelihood trees are shown in Fig. B1 and matrices of phylogenetic distances in Table B2.
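To make the link between a tree and a distance matrix concrete, the sketch below derives pairwise distances from a tree with branch lengths. It assumes that phylogenetic distance means patristic distance (the sum of branch lengths along the path connecting two tips), which the text above does not state explicitly. The toy tree, tip names, and branch lengths are hypothetical and are not taken from Fig. B1 or Table B2.

```python
from itertools import combinations

# Hypothetical rooted tree: node -> (parent, branch length to parent).
# Tip names A, B, C and all branch lengths are placeholders.
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 0.02),
    "A": ("n1", 0.05),
    "B": ("n1", 0.03),
    "C": ("root", 0.10),
}


def path_to_root(tree, node):
    """Map each ancestor of `node` (including itself) to its distance from `node`."""
    dist = 0.0
    path = {}
    while node is not None:
        path[node] = dist
        parent, length = tree[node]
        dist += length
        node = parent
    return path


def patristic_distance(tree, a, b):
    """Sum of branch lengths on the path joining tips `a` and `b`."""
    path_a = path_to_root(tree, a)
    path_b = path_to_root(tree, b)
    # With non-negative branch lengths, the most recent common ancestor
    # is the shared ancestor that minimizes the summed distance.
    return min(path_a[n] + path_b[n] for n in path_a if n in path_b)


def distance_matrix(tree, tips):
    """Symmetric matrix (nested dict) of patristic distances among `tips`."""
    matrix = {t: {t: 0.0} for t in tips}
    for a, b in combinations(tips, 2):
        d = patristic_distance(tree, a, b)
        matrix[a][b] = matrix[b][a] = d
    return matrix


matrix = distance_matrix(TREE, ["A", "B", "C"])
for tip, row in matrix.items():
    print(tip, {other: round(d, 3) for other, d in row.items()})
```

The parent-pointer representation keeps the example free of external dependencies; in practice a tree would normally be read from a Newick file with a phylogenetics library.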
TABLE B1. Gene sequences used for the species from both the Ireland BIODEPTH and Jena experiments. GenBank accession codes are given. The two species omitted from the Irish Biodepth data are included.
FIG. B1. The phylogeny for the species used in the two experiments. The scale bar indicates distance in nucleotide substitutions per 1000 nucleotides.
TABLE B2. Phylogenetic distances among species from the (a) Jena and (b) Biodepth, Ireland experiments.
Anisimova, M., and O. Gascuel. 2006. Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst. Biol. 55:539–552.
Benson, D. A., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. 2005. GenBank. Nucleic Acids Res. 33:D34–D38.
Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.
Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696–704.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.
Posada, D., and K. A. Crandall. 2001. Selecting the best-fit model of nucleotide substitution. Syst. Biol. 50:580–601.