First published online August 4, 2005
Journal of Experimental Biology 208, 3015-3035 (2005)
Published by The Company of Biologists 2005
doi: 10.1242/jeb.01745
Commentary
Phylogenetic approaches in comparative physiology
1 Department of Biology, University of California, Riverside, CA 92521,
USA
2 Department of Ecology and Evolutionary Biology, University of California,
Irvine, CA 92697, USA
* Author for correspondence (e-mail: abennett{at}uci.edu)
Accepted 13 June 2005
| Summary |
|---|
Key words: allometry, comparative method, evolutionary physiology, model of evolution, phylogeny, statistical analysis
| Introduction |
|---|
Comparative methods have been radically restructured over the past two
decades, and now routinely incorporate both phylogenetic information and
explicit models of character evolution. Indeed, Sanford et al. (2002) suggest
that this new emphasis be termed the `comparative phylogenetic method'. As
outlined in Blomberg and Garland (2002), this revolution in
comparative phylogenetic methodology followed from several conceptual
advances: (1) adaptation should not be casually inferred from comparative
data; (2) the incorporation of phylogenetic information increases both the
quality and even the type of inference from comparative data alone; (3)
because all organisms are differentially related to each other, taxa cannot be
assumed to be independent of each other for statistical purposes; (4)
statistical analyses of comparative data must assume some model of character
evolution for effective inference; (5) taxa used in comparative analyses
should be chosen in regard to their phylogenetic affinities as well as the
area of functional investigation; and (6) even phylogenetically based
comparisons are purely correlational and inferences of causation drawn from
them can be enhanced by other approaches, including experimental
manipulations.
To expand on some of these points, `quality' in point 2 includes the simple
fact that adding an independent estimate of phylogenetic relationships to a
comparative analysis increases, often greatly, the amount of
basic data that is brought to bear on a given question, whereas `type' refers
to analyses that are simply impossible without a phylogenetic perspective,
such as reconstructing ancestral values or comparing rates of evolution among
lineages. Although phylogenetic information and a suitable analytical method
may allow any comparative data set to be `rescued' from phylogenetic
nonindependence (e.g. avoid inflated Type I error rates; point 3),
phylogenetically informed choice of species (point 5) can accomplish more,
such as actually increasing statistical power to detect relationships among
traits (Garland et al., 1993; Garland, 2001). Finally, we
note that point 6 was recognized long ago, but has been re-emphasized as
phylogenetically explicit methods of statistical inference have been developed
(e.g. see Lauder, 1990; Garland and Adolph, 1994; Leroi et al., 1994;
Autumn et al., 2002).
The intent of this commentary is to provide a review of some advances that
have occurred in the comparative method, with an emphasis on their place in
comparative physiology. We examine the underlying reasons for the
incorporation of phylogenetic information into comparative studies. In an
Appendix, we give a brief overview of the three most commonly used and best
understood phylogenetically based statistical methods: independent contrasts
(IC; worked example in Fig. 5), generalized least-squares (GLS) models, and
Monte Carlo computer simulations. These methods apply mainly to analysis of
continuously varying (or at least quantitative) traits, which is the nature of
most physiological traits (e.g. blood pressure, metabolic rate, enzyme
activity). However, they can also easily incorporate independent variables
that are treated as discrete categories, such as diet (e.g. insectivore,
frugivore, sanguivore) or habitat (e.g. fresh or salt water). Discussions of
methods for categorical traits and computer programs to implement them are
available from Mark Pagel (e.g. see Pagel, 1999), in MacClade
(Maddison and Maddison, 2000), and in Mesquite
(http://mesquiteproject.org/mesquite/mesquite.html; see also Paradis and
Claude, 2002). For a general listing of phylogeny-related programs, see
the website maintained by Joe Felsenstein
(http://evolution.genetics.washington.edu/phylip/software.html).
We discuss when phylogenetically based statistical methods should be used
and give some practical examples of where a phylogenetic perspective has
improved our understanding of comparative data and evolutionary processes. We
also discuss some of the practical and theoretical limitations of such
methods. Throughout, we try to emphasize that the incorporation of phylogeny
can greatly enhance comparative studies, deliver new insights, and open new
areas for research. This is of necessity only a brief summary and readers are
directed to more extensive discussions of the topics and issues raised here
(e.g. Ridley, 1983; Lauder, 1981, 1982, 1990; Harvey and Pagel, 1991;
Garland et al., 1992, 1999; Garland and Adolph, 1994; Harvey, 1996;
Ricklefs and Nealen, 1998; Ackerly, 1999, 2000, 2004; Pagel, 1999;
Purvis and Webster, 1999; Diniz-Filho, 2000; Feder et al., 2000;
Garland and Ives, 2000; Maddison and Maddison, 2000; Garland, 2001;
Rohlf, 2001; Autumn et al., 2002; Blomberg and Garland, 2002;
Brooks and McLennan, 2002; Blomberg et al., 2002, 2003;
Rezende and Garland, 2003; Housworth et al., 2004). We
have intentionally not cited some `forum' and `perspective' type papers
because we felt that their rhetoric was misleading, and in some cases they
contain outright errors.
The empirical examples cited here are idiosyncratic, reflecting mainly our
own research interests. Thus, we emphasize examples that involve physiological
phenotypes, but include others when they are lacking. Our enthusiasm for
phylogenetic approaches in comparative physiology should not be taken to
imply, however, that we think they are more important than other approaches,
such as measurement of selection acting in natural populations, experimental
evolution (e.g. see Garland and Carter, 1994; Bennett and Lenski, 1999;
Ackerly et al., 2000; Feder et al., 2000; Garland, 2001, 2003; Bennett, 2003;
Swallow and Garland, 2005), or more purely mechanistic investigations (e.g.
Mangum and Hochachka, 1998; Hochachka and Somero, 2002).
We are concerned that some of our discussion of assumptions and intricacies
of phylogenetically based statistical methods may be off-putting to those who
simply want to analyze their data (see also Felsenstein, 1985). However,
it must be acknowledged that statistical analyses in general are not always
simple and have underlying assumptions that cannot be ignored. Most of the
tools that we use in everyday research (e.g. correlation, regression, analysis
of variance, analysis of covariance) have been around for 50 years or even a
century. Nonetheless, the field of statistics (both theoretical and applied)
continues to refine these methods. Such questions as what type of line is best
for describing functional relationships (e.g. Rayner, 1985; chapter 6 in
Harvey and Pagel, 1991; Riska, 1991; McGuire, 2003; Garland et al., 2004),
how to deal with non-linear relationships (Quader et al., 2004) or
random effects in ANOVA models, when to include or exclude interaction terms,
how best to transform data, or when to employ nonparametric methods, still do
not have simple, general answers. Moreover, new statistical methods continue
to be developed, including computer-intensive approaches that were not
possible 50 years ago (e.g. see Lapointe and Garland, 2001; Roff, in press).
For many statistical parameters, including comparative
methodologies, several different approaches (and attendant algorithms) may be
used for estimation, none of which performs `best' in all situations. We
believe that it is important that a comparative biologist understand the
assumptions and approaches underlying these methodologies, rather than just
resorting to their rote application, and that is the basis for our more
detailed presentation.
| Phylogeny and modern (statistical) comparative methods |
|---|
Concern about the possible influence of phylogeny in comparative and
ecological physiology antedated Felsenstein's (1985) publication. For
example, explicit comparisons of marsupial with placental mammals
(MacMillen and Nelson, 1969; Dawson and Hulbert, 1970) and
of passerine with non-passerine birds (Lasiewski and Dawson, 1967)
were motivated by cognizance of phylogeny, and some workers tried to partition
the effects of phylogeny on physiological relationships (e.g.
Andrews and Pough, 1985).
Moreover, some workers voiced concerns about specific adaptive interpretations
of characters shared more widely in their clades (e.g.
Dawson and Schmidt-Nielsen, 1964; Dawson et al., 1977). What those earlier
studies lacked was not necessarily a
general perspective on the importance of phylogeny, but rather a formal
logical and statistical methodology for incorporating detailed phylogenetic
information. Analytical techniques have been greatly expanded and modified
since 1985 (see below and Appendix), but Felsenstein's IC method is still the
most widely used and his insights were pivotal to modernization of the
comparative method. Moreover, the realization that IC is a special case of
generalized least-squares (GLS) methods (see Appendix) means that the former
can always serve as a useful entry point for the latter, and one that retains
the major heuristic of `tree thinking' (sensu Maddison and Maddison, 2000).
Traditional interspecific comparative analyses applied conventional
statistical methods to test for associations between traits (e.g. metabolic
rate and body size), or between a trait and an environmental variable (e.g.
blood oxygen carrying capacity and altitude). This approach treats all data
points (e.g. mean values for a series of species) as statistically independent
of each other. Unfortunately, mean phenotypes of biological taxa usually will
not be statistically independent because they are all related through their
hierarchical phylogenetic history. Empirically, more closely related species
do indeed tend to resemble one another; put simply, hummingbirds look like
hummingbirds, and turtles look like turtles, and the same is true for
physiological traits (Blomberg et al., 2003; see below). This general
tendency exists for several good biological reasons (Harvey and Pagel, 1991),
including time lags for change to occur after speciation,
occupation of similar niches by close relatives, and conservative
phenotype-dependent responses to selection. Thus, the extent of these
phylogenetic relationships, and hence the expected degree of
resemblance, must also be figured into comparative analyses. Analytical
techniques that do not incorporate phylogenetic information make the tacit
statistical assumption that all the species studied are equally distantly
related to each other, that is, that they descended along a `star phylogeny'
(Fig. 3A), when in fact their ancestral associations are hierarchical
(Fig. 3C).
Second, it is important to consider what is meant by the `branch lengths'
of a phylogenetic tree that is used for analysis. In general, proponents of
phylogenetically based comparative methods assume that analyses of
physiological and other traits will involve use of a phylogenetic tree that
was inferred from other data, such as variation in DNA sequences, which is
presumed to be independent of the data being analyzed. Otherwise, it seems
intuitively obvious that analyses may involve some circularity. However, this
is actually a complicated subject and beyond the scope of the present paper
(Felsenstein, 1985; de Queiroz, 2000). Leaving
aside the general issue of having available a phylogeny that is independent of
the characters under study, the branch lengths of the working phylogenetic
tree are confounded with the model and rates of character evolution that will
be assumed for statistical analyses of most real data sets (see Figs 1, 2).
In other words, we usually
do not have independent information on, for instance, divergence times
and selective regimes that may have prevailed along various branches
of the tree. In any case, all of the main phylogenetically based statistical
methods require branch lengths in units proportional to expected variance of
evolution for the character(s) under study (see Felsenstein, 1985, 1988;
Garland et al., 1992, 1993, 1999; Garland and Ives, 2000; Rohlf, 2001;
Blomberg et al., 2003; Housworth et al., 2004).
Branch lengths essentially indicate our a priori expectations for how
likely a given trait was to change (increase or decrease in value) from one
node to another along a phylogenetic tree, and thus become an integral
component of our statistical null model. Under a simple Brownian motion model,
those branch lengths would necessarily be proportional to divergence times.
Under any other model, such as the Ornstein-Uhlenbeck (OU) process,
which is like Brownian motion while tethered to an elastic band and is used to
model stabilizing selection or constraints on trait space
(Felsenstein, 1988; Garland et al., 1993; Diaz-Uriarte and Garland, 1996;
Martins and Hansen, 1997; Blomberg et al., 2003; Freckleton et al., 2003;
Butler and King, 2004; Housworth et al., 2004), they would be more-or-less
different from divergence times.
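The difference between the two models can be illustrated with a minimal simulation sketch. It is an illustration only and is not taken from any of the programs cited here: it assumes a discrete-time approximation, and the names `alpha` (strength of the elastic-band pull), `theta` (the optimum) and `sigma` (per-step standard deviation) are hypothetical. Setting `alpha = 0` gives Brownian motion.

```python
import random
import statistics

def simulate_lineage(start, steps, sigma, alpha=0.0, theta=0.0, rng=None):
    """Evolve one trait value along a single lineage for `steps` time steps.

    alpha = 0 gives Brownian motion; alpha > 0 gives a discrete-time,
    Ornstein-Uhlenbeck-like process pulled toward the optimum theta.
    """
    rng = rng or random.Random()
    x = start
    for _ in range(steps):
        x += alpha * (theta - x) + rng.gauss(0.0, sigma)
    return x

def variance_among_lineages(steps, alpha, n_lineages=2000, seed=1):
    """Variance of tip values among independent lineages of equal duration."""
    rng = random.Random(seed)
    tips = [simulate_lineage(0.0, steps, 1.0, alpha=alpha, rng=rng)
            for _ in range(n_lineages)]
    return statistics.variance(tips)

# Brownian motion: variance keeps growing in proportion to elapsed time.
bm_short = variance_among_lineages(steps=25, alpha=0.0)
bm_long = variance_among_lineages(steps=100, alpha=0.0)

# OU-like process: variance levels off once the pull balances the noise.
ou_short = variance_among_lineages(steps=25, alpha=0.2)
ou_long = variance_among_lineages(steps=100, alpha=0.2)
```

In this sketch the Brownian variance is roughly four times larger after four times as many steps, whereas the OU-like variance is about the same at both durations. This is why branch lengths expressed as expected variance of change coincide with divergence times under Brownian motion but not under a constrained model.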
A simple hypothetical example can illustrate this distinction. Many traits
evolve within limits set by physical or biological properties. Some of these
are trivial. For example, body mass cannot evolve to be as small as 0 g.
Others are more interesting. Apparently, for example, activity body
temperatures (Tb) of squamate reptiles (lizards and
snakes) cannot evolve to be more than about 42°C. We do not know the
ancestral activity Tb of squamates, but it was probably
substantially lower than 42°C. Thus, during their initial radiation and
diversification, Tb would have been free to evolve,
perhaps in a fairly Brownian motion-like fashion, with an increase or decrease
about equally likely to occur along any branch of the phylogeny. However,
lineages that `explored' the climate space towards higher
Tb would eventually be constrained by the reduction in
Darwinian fitness that can be caused by exceedingly high temperatures (e.g.
via failure of spermatogenesis or outright death). Thus, if we were
to depict a phylogenetic tree of squamates with branch lengths proportional to
expected variance of Tb evolution, then we would need to
know the Tb at the start of each branch segment and also
have the branches be, in effect, different if the lineage was near a thermal
limit, either upper or lower. That is, a lineage near an upper limit would
have a low probability of evolving a higher Tb, but a
`typical' probability of evolving a lower Tb, and vice
versa. It should be obvious that our ability to specify such detailed
branch-length information for any trait in any group of wild organisms is
severely limited. Thus, for simplicity and/or analytical tractability,
phylogenetically based statistical methods usually begin with an assumption of
Brownian motion evolution along whatever branch lengths are specified in a
working phylogeny (e.g. Fig. 1). And in many cases (e.g. see reviews of
published studies in Blomberg et al., 2003; Ashton, 2004a), these will be
arbitrary values, such as setting all segments equal to unity in length or by
some other simple rule (e.g. Fig. 4B). In such cases, it is often prudent to
perform computations
with more than one set of branches as a sensitivity analysis for the
conclusions (e.g. see Ashton, 2004b; Hutcheon and Garland, 2004;
Laurin, 2004). Similarly, some studies use multiple phylogenies
(topologies) (e.g. Bauwens et al., 1995; Symonds and Elgar, 2002;
Hodges, 2004).
These points have suggested to some that phylogenetically based analyses
are so fraught with pitfalls that we should stick with non-phylogenetic ones.
But a conventional statistical analysis actually has as many assumptions as a
phylogenetic one. For example, it assumes that the species under analysis have
not been interacting (e.g. through character displacement;
Hansen et al., 2000). It
assumes that each species should be equally weighted, which is equivalent to
saying that the heights of each branch from the root of the tree (assumed to
be a star) are equal. And so forth.
In any case, it has become increasingly clear that, because we never know
the true branch lengths and/or model of character evolution, we should pay
careful attention to the branch lengths used, employing methods that can
consider options ranging between a star and our working hierarchical
phylogeny, and possibly something even more hierarchical. Thus, recent methods
emphasize estimation of optimal branch length transformations as an essential
part of phylogenetic analyses of comparative data (e.g. see
Grafen, 1989; Diaz-Uriarte and Garland, 1996, 1998; Pagel, 1999;
Harvey and Rambaut, 2000; Freckleton et al., 2002; Martins et al., 2002;
Blomberg et al., 2003; Housworth et al., 2004).
Although some researchers may be uneasy with such transformations of branch
lengths, they are analogous to use of a Box-Cox procedure to find the
optimal transformation of data (e.g. best approximation of normality) in
conventional statistical procedures (for instance, use of a Box-Cox
procedure to transform branch lengths; Reynolds and Lee, 1996).
Moreover, aside from its benefits with computer-simulated data, such careful
attention to branch lengths can sometimes improve statistical power to an
important extent with real data (see below).
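One simple way to picture a continuum between a star and a working hierarchical phylogeny is to rescale the shared-history (off-diagonal) entries of the phylogenetic variance-covariance matrix by a single parameter. The sketch below uses a Pagel-style lambda scaling purely for illustration; it is an assumption and does not reproduce the OU or ACDC transformations of the cited papers. The four-species matrix and the function name `lambda_transform` are hypothetical, and estimation of an optimal value (e.g. by maximum likelihood) is not shown.

```python
def lambda_transform(vcv, lam):
    """Scale the off-diagonal (shared-history) entries of a phylogenetic
    variance-covariance matrix by lam, leaving the diagonal unchanged.

    lam = 1 keeps the working phylogeny; lam = 0 gives a star phylogeny.
    """
    n = len(vcv)
    return [[vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical tree ((A,B),(C,D)) with total height 1.0, in which each pair
# of sister species shares 0.6 units of branch length. Under Brownian motion
# the expected covariance of two tips equals their shared branch length.
vcv = [
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
]

star = lambda_transform(vcv, 0.0)   # all species treated as independent
half = lambda_transform(vcv, 0.5)   # intermediate degree of hierarchy
full = lambda_transform(vcv, 1.0)   # the working phylogeny, unchanged
```

Values between 0 and 1 correspond to trees in which internal branches are shortened and terminal branches lengthened. Trees that are more hierarchical than the working phylogeny are outside the range of this particular sketch.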
| An example of how phylogeny can affect statistical analyses |
|---|
The answer depends on what is assumed about the phylogenetic relationships
of the 12 species. If we assume that species are unrelated, then we can refer
to conventional tables of critical values for correlation coefficients. For a
one-tailed test with 12 data points (and hence 10 degrees of freedom for
testing a correlation), the critical value at the 5% level is +0.497, so a value of +0.585
would be considered significant at P<0.025. If, instead, we want
to assume that the species are related in a hierarchical fashion, then we
cannot use the conventional tables. Fortunately, however, we can incorporate
phylogenetic information as follows (Martins and Garland, 1991;
Garland et al., 1993). We can
construct different working phylogenies, model the uncorrelated evolution of
these traits by a Monte Carlo computer simulation that assumes random,
Brownian motion-like trait change, and calculate a correlation coefficient for
each simulated data set. We can then determine the critical 5% level for the
correlation coefficient for each distribution. If we assume that all species
are completely independent or related by a star phylogeny
(Fig. 4A), then the one-tailed
probability for obtaining a correlation as large as +0.585 is 0.023 (based on
this particular set of 1000 simulated data sets), so the relationship is
statistically significant at P<0.05. In fact, if we do a very
large number of simulations, then we will obtain exactly the same
results as when referring to conventional tables.
If, however, we simulate data along our best estimate of the phylogeny of these lizards (Fig. 4C), then a correlation of +0.585 would be observed much more frequently than 5% of the time and would not be considered very unusual, hence not statistically significant (P>0.15). If a hypothetical phylogeny with different branch lengths, involving fewer deep roots, were assumed (Fig. 4B), then a value of +0.585 would have a lower probability of being observed, but in this case would still be non-significant. Thus, the assumed pattern of the relationships among the species crucially affects the statistical significance of the observations: the more the phylogeny departs from a star, the lower is the number of effectively independent observations and the more likely we are to observe an extremely large (or small) correlation just by chance. If the working topology, branch lengths, and simulation model are somewhat realistic, then we will claim significance too often if we ignore phylogeny.
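The simulation logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the program used for Fig. 4: the two trees are hypothetical (a 12-species star, and two clades of six species that each share a long stem branch), they are not the lizard phylogeny, and all function names are invented for the example.

```python
import random

def simulate_tips(tree, rng, value=0.0):
    """Brownian motion down a tree. A node is (branch_length, children);
    children is a list of nodes, or an empty list for a tip."""
    length, children = tree
    value += rng.gauss(0.0, length ** 0.5)  # variance proportional to length
    if not children:
        return [value]
    tips = []
    for child in children:
        tips.extend(simulate_tips(child, rng, value))
    return tips

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def null_correlations(tree, n_sims=2000, seed=0):
    """Sorted correlations between two traits that evolved independently."""
    rng = random.Random(seed)
    return sorted(pearson_r(simulate_tips(tree, rng), simulate_tips(tree, rng))
                  for _ in range(n_sims))

def critical_value(null_rs, level=0.05):
    """One-tailed critical value: the r exceeded in a fraction `level` of runs."""
    k = max(1, round(level * len(null_rs)))
    return null_rs[-k]

def one_tailed_p(null_rs, observed):
    return sum(r >= observed for r in null_rs) / len(null_rs)

def tip(length):
    return (length, [])

def clade(stem, twig, n):
    return (stem, [tip(twig) for _ in range(n)])

star = (0.0, [tip(1.0) for _ in range(12)])
two_clades = (0.0, [clade(0.8, 0.2, 6), clade(0.8, 0.2, 6)])

null_star = null_correlations(star)
null_tree = null_correlations(two_clades)
crit_star, crit_tree = critical_value(null_star), critical_value(null_tree)
p_star = one_tailed_p(null_star, 0.585)
p_tree = one_tailed_p(null_tree, 0.585)
```

Under the star tree the simulated critical value should fall close to the tabulated +0.497, whereas under the strongly hierarchical tree it is much larger, so an observed correlation of +0.585 is no longer unusual.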
Another important point is that if the simulated data of Fig. 4B or C are analyzed with IC, using the corresponding phylogenies, then the resulting distribution of correlation coefficients will be the same as in Fig. 4A (results not shown). Thus, as discussed in the Appendix, the IC method uses the specified phylogenetic information to transform the data to make them independent and identically distributed. This then allows one to refer to conventional tables of critical values for hypothesis testing.
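The transformation performed by IC can be sketched as follows. This is a minimal version of Felsenstein's (1985) algorithm as it is usually described, assuming a fully bifurcating tree; the tree encoding, the function names and the four-species data are hypothetical and are not the worked example of Fig. 5.

```python
def contrasts(node, data):
    """Return (node_value, extra_length, standardized_contrasts).

    A tip is a species name (str); an internal node is a pair
    ((child, branch_length), (child, branch_length)).
    """
    if isinstance(node, str):
        return data[node], 0.0, []
    (c1, v1), (c2, v2) = node
    x1, e1, ics1 = contrasts(c1, data)
    x2, e2, ics2 = contrasts(c2, data)
    # Lengthen branches below estimated nodes to reflect estimation error.
    v1 += e1
    v2 += e2
    ic = (x1 - x2) / (v1 + v2) ** 0.5                     # standardized contrast
    x_node = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)  # weighted average
    extra = v1 * v2 / (v1 + v2)
    return x_node, extra, ics1 + ics2 + [ic]

def correlation_through_origin(u, v):
    """Contrasts have an expected value of zero, so no mean is subtracted."""
    suv = sum(a * b for a, b in zip(u, v))
    return suv / (sum(a * a for a in u) * sum(b * b for b in v)) ** 0.5

# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1) and two hypothetical traits.
tree = (((("A", 1.0), ("B", 1.0)), 1.0), ((("C", 1.0), ("D", 1.0)), 1.0))
x = {"A": 1.0, "B": 3.0, "C": 6.0, "D": 10.0}
y = {"A": 2.0, "B": 6.0, "C": 12.0, "D": 20.0}

root_x, _, ic_x = contrasts(tree, x)
_, _, ic_y = contrasts(tree, y)
r = correlation_through_origin(ic_x, ic_y)
```

Four species yield three contrasts (one per internal node). Because the contrasts are, under the assumed model, independent and identically distributed, their correlation can be compared with conventional critical values.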
All of the simulations shown in Fig. 4 were done under a simple Brownian
motion model (Fig. 2), but the model used
can have a large effect on the resulting distributions of statistics (e.g. see
Garland et al., 1993; Diaz-Uriarte and Garland, 1996; Price, 1997;
Harvey and Rambaut, 2000; Martins et al., 2002; Freckleton et al., 2003).
Brownian motion is a very simple model of character
evolution, and its analytical tractability was exploited by Felsenstein
(1985) to develop IC. It is a
good model for traits that evolve solely by random genetic drift, and may also
be adequate for some types of `fluctuating' selection (i.e. when the direction
of selection changes from generation to generation). As a basis for
statistical methods to estimate and test character correlations, it may also
be an adequate model for traits that are subject to certain types of selection
(Felsenstein, 1985, 1988; Grafen, 1989). But most traits
probably evolve in ways that are too complicated and idiosyncratic to be
modeled realistically by Brownian motion
(Felsenstein, 1988; Hansen et al., 2000).
Fortunately, simulations can use arbitrarily complex models of character
evolution, limited only by one's imagination and ability to write computer
programs (e.g. Garland et al., 1993; Diaz-Uriarte and Garland, 1996, 1998;
Price, 1997; Harvey and Rambaut, 2000; Freckleton et al., 2003). Of
course, whether more complicated models lead to more accurate analyses depends
on whether they are actually a better descriptor of past evolution by the
characters under study, and that is something very difficult to know.
Moreover, finding a model that fits a set of data reasonably well does not
necessarily mean that it is the correct model, and other models can probably
be found that would provide equally good fit (see also
Blomberg et al., 2003). (As
always, it is risky to attempt to infer process from pattern.) Given the near
impossibility of knowing how traits actually evolved in the distant past,
simulation studies are also used to gauge how robust analytical methods such
as IC are to violation of their assumptions (e.g. Brownian motion, accurate
knowledge of branch lengths), how diagnostic tests can alert one to such
violations, and how well remedial measures (e.g. transformations of tip data
or branch lengths) can rescue statistical performance when assumptions are
violated (e.g. Martins and Garland, 1991; Purvis et al., 1994;
Diaz-Uriarte and Garland, 1996, 1998; Harvey and Rambaut, 2000;
Diniz-Filho and Torres, 2002; Freckleton et al., 2003).
Still, physiologists may sometimes be able to improve the accuracy of assumed
models by their knowledge of how organisms work (or could have worked), as in
the case of limits to body temperature evolution discussed above. Similarly,
paleontological information can be used to improve the realism of simulations
(Garland et al., 1993).
We close this section by emphasizing that the use of computer simulations
to obtain `phylogenetically correct' (PC) null distributions for testing
hypotheses about comparative data is a very general tool that can be used for
virtually any analysis (Martins and Garland, 1991; Garland et al., 1993),
including bivariate (e.g. Ricklefs and Nealen, 1998) or
multivariate analyses of evolutionary diversification. For example, the
PHYLOGR program (available at http://cran.r-project.org/)
allows one to test hypotheses about canonical correlation or principal
components analysis (PCA) in relation to computer-simulated data (R.
Diaz-Uriarte and T. Garland, manuscript in preparation).
| When and why to use phylogenetic information in comparative studies |
|---|
What kinds of characters demand a phylogenetic analysis? Although we may
generally expect that most characters will tend to `follow phylogeny', this is
an empirical question. The simplest general test for whether related organisms
actually do tend to resemble each other more than they resemble those that
might be chosen randomly with respect to phylogenetic position uses
randomization procedures (see also Abouheif, 1994; Ackerly, 2004;
Laurin, 2004; Rheindt et al., 2004).
Specifically, once phylogenetically independent contrasts (IC) have been
computed, it is possible to
calculate the variance of those contrasts. The lower the variance of the
contrasts, the better the fit of the phylogeny (topology and branch lengths)
to the character in question. To determine whether a given variance indicates
the presence of statistically significant `phylogenetic signal' (i.e. more
closely related species tend to resemble each other more than they resemble
randomly chosen species), one can compare it with the distribution of
variances for a large number of data sets that have been randomized (shuffled)
across the tips of the phylogeny (Blomberg and Garland, 2002;
Blomberg et al., 2003). For studies with 20 or more species (for which
statistical power should be reasonably high), more than 90% of the traits
examined to date (including behavioral, physiological, morphological, life
history and ecological/environmental traits) do exhibit a significant
phylogenetic signal (P<0.05: Blomberg et al., 2003; Al-kahtani et al., 2004;
Ashton, 2004a,b; Rezende et al., 2004; Ross et al., 2004;
Muñoz-Garcia and Williams, in press; see also Freckleton et al., 2002).
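The randomization test described above can be sketched as follows. It is a minimal illustration and not the code of any cited program. It assumes a fully bifurcating tree, takes the `variance' of the contrasts to be their mean square about zero (an assumption made here because the sign of a contrast is arbitrary), and uses a hypothetical eight-species tree with invented data and function names.

```python
import random

def contrasts(node, data):
    """Felsenstein-style standardized contrasts for a bifurcating tree.
    A tip is a name (str); an internal node is ((child, len), (child, len)).
    Returns (node_value, extra_length, list_of_contrasts)."""
    if isinstance(node, str):
        return data[node], 0.0, []
    (c1, v1), (c2, v2) = node
    x1, e1, ics1 = contrasts(c1, data)
    x2, e2, ics2 = contrasts(c2, data)
    v1, v2 = v1 + e1, v2 + e2
    ic = (x1 - x2) / (v1 + v2) ** 0.5
    x_node = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    return x_node, v1 * v2 / (v1 + v2), ics1 + ics2 + [ic]

def contrast_variance(tree, data):
    ics = contrasts(tree, data)[2]
    return sum(c * c for c in ics) / len(ics)

def signal_p_value(tree, data, n_perm=999, seed=0):
    """Fraction of tip-shuffled data sets whose contrast variance is as low
    as, or lower than, that of the real data (low variance = good fit)."""
    rng = random.Random(seed)
    names = sorted(data)
    values = [data[name] for name in names]
    observed = contrast_variance(tree, data)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        if contrast_variance(tree, dict(zip(names, values))) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def pair(a, b, length=1.0):
    return ((a, length), (b, length))

# Hypothetical balanced tree of eight species, all branch lengths 1.0.
tree = pair(pair(pair("A", "B"), pair("C", "D")),
            pair(pair("E", "F"), pair("G", "H")))

# Close relatives resemble each other: strong phylogenetic signal expected.
clumped = {"A": 1.0, "B": 1.1, "C": 2.0, "D": 2.1,
           "E": 5.0, "F": 5.1, "G": 6.0, "H": 6.1}
# The same values arranged so that sister species differ greatly.
scattered = {"A": 1.0, "B": 6.1, "C": 2.0, "D": 5.1,
             "E": 1.1, "F": 6.0, "G": 2.1, "H": 5.0}

p_clumped = signal_p_value(tree, clumped)
p_scattered = signal_p_value(tree, scattered)
```

With the clumped data very few shuffled data sets fit the tree as well as the real arrangement, so the test indicates significant signal; with the scattered arrangement it does not.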
The empirical finding of pervasive phylogenetic signal implies that
hierarchical phylogenies, as presented and used in numerous
publications, provide a better fit to the data under analysis than does
a star phylogeny (Figs 3A, 4A). This sends a strong
message that we should routinely consider phylogenetic information in
statistical analyses of comparative data. However, this does not necessarily
mean that, for any given set of data, we should simply obtain a phylogenetic
tree, perform an IC, GLS or Monte Carlo simulation analysis, and automatically
presume that the results will be more reliable than the comparable
conventional statistical analysis. As we and others have emphasized for more
than a decade, analyses using a given topology and branch lengths can perform
relatively poorly if their assumptions are severely violated (e.g. see
Grafen, 1989; Martins and Garland, 1991; Diaz-Uriarte and Garland, 1996, 1998;
Price, 1997; Garland and Diaz-Uriarte, 1999; Harvey and Rambaut, 2000;
Diniz-Filho and Torres, 2002; Martins et al., 2002; Freckleton et al., 2003;
Housworth et al., 2004). Thus, we urge practitioners to apply robust tests for
phylogenetic signal, diagnostic checks, and branch length transformations as
warranted (for recent discussions and methods, see
Freckleton et al., 2002; Blomberg et al., 2003), and
sensitivity analyses by varying branch lengths and/or model of evolution (e.g.
see Garland et al., 1993; Ashton, 2004b; Hutcheon and Garland, 2004;
Laurin, 2004; Muñoz-Garcia and Williams, in press).
What sorts of branch lengths should be used? Given the uncertainties
regarding branch lengths (see above), many workers have reported results with
multiple branch lengths to explore consistency or the lack thereof (e.g.
Ross et al., 2004). Although
it is often the case that conclusions are relatively robust (insensitive) to
the branch lengths used, this is not always true, and the importance of
attempting to use `optimal' branch length transformations can be illustrated
with two empirical examples. Garland et al. (1993) analyzed home range
areas in relation to body size for 49 species of carnivores and ungulates.
Conventional analysis of covariance (ANCOVA) indicated a highly significant
(P<0.001) difference in size-adjusted home range areas of the two
groups. Analysis via IC (or Monte Carlo simulations), however,
revealed no statistically significant difference (two-tailed P=0.126
for IC). The branch lengths used for the phylogenetic analyses were estimates
of divergence times, derived from various sources. They passed the diagnostic
`lack of fit' tests as described in Garland et al. (1992). However, power to
detect a difference is apparently improved by applying the transformations of
branch lengths as proposed by Blomberg et al. (2003) to mimic particular
models of character evolution. Using the branches transformed under the OU
model for log body mass, the P value is reduced to 0.099, and using
their Accelerating-Decelerating (ACDC) model the P value is
reduced to 0.044, thus crossing the typical threshold of <0.05 to be
considered statistically significant (degrees of freedom were reduced by one
in both cases to reflect the additional parameter estimated in these models;
see also Diaz-Uriarte and Garland, 1996, 1998;
Garland and Diaz-Uriarte, 1999). Similarly, in a recent comparison of the
generic average
body sizes of `megabats' and `microbats,' Hutcheon and Garland
(2004) found statistical
significance in the IC analysis only when using branch lengths transformed
under the OU or ACDC models. We suspect that such increases in power may be
more likely to occur in comparisons between groups that are fairly highly
phylogenetically confounded (i.e. the independent variable of interest, such
as diet, is highly clumped with respect to phylogeny, as in comparisons of
clades; e.g. see Garland et al., 1993; Vanhooydonck and Van Damme, 1999;
Perry and Garland, 2002; Rezende et al., 2004) than in tests of correlations
between two traits.
How do we choose species for study? Traditionally, animals were chosen for
comparative studies for any number of reasons, including convenience (e.g.
local availability or an existing literature data base), possession of an
interesting biological trait (e.g. the long neck of the giraffe), occupation
of an extreme environment (e.g. a hot dry desert, the Arctic), or
characteristics that make it well suited to study a particular physiological
process (Garland and Adolph, 1994; Garland and Carter, 1994; Bennett, 2003).
Frequently, a particular species or group living in a
particular environment has been the key that originally sparked interest in
the project. It is now clear that phylogenetic information should also be
considered when choosing species for study (for a simulation study on the
effects of taxon sampling when testing for correlated evolution, see also
Purvis and Webster, 1999; Ackerly, 2000).
To increase analytical power, it is a good idea to include other species
that experience a very broad range of the environmental (`independent')
variable. You might then randomly sample species from a broad taxon (e.g.
mammals) or focus exclusively on a particular lineage, such as bats or
rodents. From a design perspective, the latter strategy is preferable because
the broader comparison will involve distant relatives that vary in many
traits, potentially complicating the analysis of particular traits of
interest. From a phylogenetic point of view, comparisons of distant relatives
are like an experiment with multiple uncontrolled variables
(Garland and Adolph, 1994; Garland, 2001). To quote
Felsenstein (1985, p. 465),
`Comparative biologists tend to suspect comparisons of distantly related
species; they hope to base their comparisons on recent evolutionary events
that have not been overlaid by much subsequent change'. In principle, it
might be possible to control for confounding traits that differ in distant
relatives by including additional independent variables in the analysis, but
it is often difficult to know a priori what those traits might be,
let alone actually obtain quantitative data for them. In any case, an example
in which casting too broad a net seems to reduce statistical power is provided
by a study of body mass evolution in birds
(Garland and Ives, 2000, p. 354). A comparison of passerines with their sister
clade indicates that the
former have significantly smaller log body masses, on average, whereas a
comparison of passerines with all birds (including their sister clade) does
not. (It should be noted that the identity of the sister clade of passerines
is controversial, and the foregoing example may well change as improved
phylogenetic information becomes available.) A related topic is whether one
might a priori exclude certain subclades from a comparative analysis
because they are `unusual' as compared with the larger clade in general. For
example, many studies of lizards (e.g.
Perry and Garland, 2002) exclude snakes. A recent study by Bininda-Emonds and
Gittleman (2000) suggests that this sort
of a priori data exclusion may be less warranted than is often
presumed.
A particularly powerful comparative design is one that has several
different pairs of closely related species that differ in the variable of
interest (e.g. high and low temperature) and has these species pairs
relatively distantly related to each other (i.e. in different branches of the
phylogeny). As noted by Garland (2001), a particularly
favorable distribution of this sort has the power to detect significant
associations even when conventional statistical methods fail to do so.
Some workers choose species in this way but then analyze only the pairs of
tip species rather than performing a full analysis of the entire phylogeny
(e.g. Lavergne et al., 2004). If that is done, Type I error rates should be
correct, and the analysis should be robust with respect to errors in branch
lengths and/or model of evolution, but statistical power will likely be lost
(see Ackerly, 2000). A more extreme analytical variant is to perform a sign
test on the tip pairs (Felsenstein, 1985), thus not using any information on
branch lengths, but this comes at an extreme loss of statistical power
(Ackerly, 2000).
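The cost of the sign test can be made concrete with a small sketch. The code below is an illustration only, not an analysis from any of the studies cited: the trait values and the warm/cold pairing are hypothetical, and the function name `sign_test_p` is our own. It computes an exact two-sided sign test on the within-pair differences, using only the direction of each difference and ignoring both its magnitude and any branch lengths.

```python
from math import comb


def sign_test_p(diffs):
    """Exact two-sided sign test on paired differences.

    Only the sign of each difference is used; ties (zeros) are dropped.
    """
    signs = [d for d in diffs if d != 0]
    n = len(signs)
    if n == 0:
        return 1.0
    n_positive = sum(1 for d in signs if d > 0)
    tail = min(n_positive, n - n_positive)
    # P(X <= tail) under Binomial(n, 0.5), doubled for a two-sided test.
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Hypothetical data: (trait in warm-habitat species, trait in its
# cold-habitat sister species) for six independent species pairs.
pairs = [(5.1, 4.2), (3.9, 3.1), (6.0, 5.5), (4.4, 4.0), (5.7, 5.0), (4.9, 4.1)]
diffs = [warm - cold for warm, cold in pairs]
p_value = sign_test_p(diffs)

# With only five pairs, even perfectly concordant differences cannot
# reach the conventional 0.05 threshold in a two-sided test.
p_five_pairs = sign_test_p(diffs[:5])
```

Under these assumptions, six concordant pairs are the minimum needed for a two-sided P below 0.05, which illustrates why discarding magnitudes and branch lengths is so costly when only a handful of pairs is available.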
The worst, that is, the least powerful, comparative design is one in which
all species on one side of the root of the tree share, say, high values for an
independent variable of interest (e.g. high temperature) and those on the
other side of the root share low values (e.g. low temperature) (e.g.
Garland et al., 1993; Garland, 2001). Although some methods can enhance
inferential power in such situations (e.g. Schondube et al., 2001), it
is not an attractive comparative scenario.
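One rough way to see why such a design is weak is to count how many times the independent variable must have changed state on the tree. The sketch below is not a method used in the studies cited above. It applies Fitch parsimony to a hypothetical eight-species tree (the tip names A to H, the nested-tuple tree format and the `high`/`low` coding are all assumptions made for illustration) and compares a fully clumped design with one built from four contrasting sister pairs.

```python
def fitch_changes(tree, states):
    """Fitch parsimony on a binary tree given as nested 2-tuples of tip names.

    Returns (possible_states_at_node, minimum_number_of_state_changes).
    """
    if isinstance(tree, str):  # a tip: its state is observed
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_changes(left, states)
    right_set, right_changes = fitch_changes(right, states)
    shared = left_set & right_set
    if shared:
        # Descendants can agree on a state: no extra change needed here.
        return shared, left_changes + right_changes
    # Descendants disagree: at least one change occurred below this node.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical tree: two clades of four species on either side of the root.
tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))

# Worst case: the variable is perfectly clumped on either side of the root.
clumped = dict.fromkeys("ABCD", "high") | dict.fromkeys("EFGH", "low")

# Favorable case: each pair of sister species differs in the variable.
dispersed = {"A": "high", "B": "low", "C": "high", "D": "low",
             "E": "high", "F": "low", "G": "high", "H": "low"}

_, clumped_changes = fitch_changes(tree, clumped)
_, dispersed_changes = fitch_changes(tree, dispersed)
```

In this toy example the clumped design implies a single evolutionary transition, so all eight species together offer only one independent comparison, whereas the paired design implies four.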
How many species or other taxa need to be included in a comparative study?
In general, the statistical power of phylogenetically based analyses, when
applied with an accurate phylogeny and model of character evolution, is the
same as for conventional statistical methods, so standard power calculations
can be employed (e.g. see fig. 5 in
Garland and Adolph, 1994).
However, it is also true that phylogenetic analyses sometimes uncover
relationships that were not apparent in conventional analyses (see below).
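As an illustration of a standard power calculation of the kind referred to above, the following sketch approximates the power to detect a correlation of a given strength as a function of the number of species. It uses the textbook Fisher z approximation and is not the calculation behind the figure cited. The function name `correlation_power`, the default alpha and the example effect size are our assumptions, and the approximation presumes that the phylogeny and model of evolution are accurate, so that the phylogenetic test behaves like a conventional one.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate power of a two-sided test of zero correlation.

    r     : assumed true correlation (effect size), with -1 < r < 1
    n     : number of species (must exceed 3 for this approximation)
    alpha : two-sided significance level
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Fisher z-transformed correlation, scaled by its standard error.
    shift = abs(atanh(r)) * sqrt(n - 3)
    # Probability that the test statistic falls in either rejection region.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


# Hypothetical planning question: how does power to detect r = 0.5
# change as more species are added to the study?
power_by_n = {n: correlation_power(0.5, n) for n in (10, 15, 25, 50)}
```

A table such as `power_by_n` can guide how many species to measure before data collection begins, bearing in mind that it will overstate power if the phylogeny or the assumed model of evolution is inaccurate.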
| Examples of the utility of incorporating a phylogenetic perspective |
|---|
|
|
|---|
We will now review just a few examples from the literature in which phylogenetic information has added to our understanding and interpretation of comparative data. The first two of these deal with the evolution of lower metabolic rate in endotherms as part of their adaptation to desert environments.
It has long been recognized that low metabolic rates, low body
temperatures, and an ability to become torpid would be beneficial to
endotherms in hot, arid environments, in order to minimize heat load and
energy (and water) demands in environments of low productivity (e.g.
Dawson and Bartholomew, 1968; Dawson and Hudson, 1970; Williams, 1996;
Tieleman et al., 2003; Rezende et al., 2004). When
these traits were first discovered in desert caprimulgid birds (e.g. poorwills
and nighthawks), they were initially interpreted as adaptations to desert
conditions (e.g. Bartholomew et al.,
1962