Pathways to understanding the extended phenotype of parasites in their hosts
David Hughes


The study of the adaptive manipulation of animal behavior by parasites is entering very exciting times. Collectively the field has moved from its important and instructional natural history phase into proximate-level studies aiming to elucidate the mechanisms by which one organism controls another. Because many case studies involve cross-kingdom control of behavior, the findings are sure to be exciting. In this review I examine what possible pathways we can take to understanding the controlling behavior of parasites and how host behavior has become an extended phenotype of the parasites that is often hidden from view.


The study of animal behavior has embraced the use of new technologies to understand gene expression and how behavioral phenotypes are produced from the underlying genome (Stapley et al., 2010; Bell and Robinson, 2011). Such approaches will be especially useful for understanding how novel behaviors arise, which has, of course, been a perennial question in evolutionary biology (Darwin, 1859; Mayr, 1982). In some cases the behavior that we observe in animals is not due to the expression of their genes but rather to the genes of parasites infecting them. In such cases the behavior is an extended phenotype of the parasite (Dawkins, 1982; Dawkins, 1990; Dawkins, 2004; Hughes, 2008; Hughes et al., 2012). Beyond the obvious importance of explaining how such complex parasite adaptations evolve by natural selection, the study of behavioral manipulation is important because it represents a parallel experiment over evolutionary time. That is, natural selection has acted on the genome of both the parasite and the host to control a single phenotype (behavior in the host). Understanding diverse pathways from genes to phenotypes will help us tackle the important question in evolutionary biology: what is the mechanistic basis of animal behavior (Duckworth, 2009)? In this Review I explore some of the pathways that can lead us to a proximate level understanding of extended phenotypes.

A crucial detail of extended phenotypes is the distance over which they are extended. This distance can be phylogenetic, as occurs when the parasite and host are distantly related, commonly in different kingdoms [e.g. rabies virus changing mammal behavior (Moore, 2002)]. It is also sometimes a physical distance depending on where in the host’s body the parasite lives [e.g. the abdomen-dwelling hairworms of crickets causing changes in brain expression (Thomas et al., 2002)]. And finally, the distance can be temporal, as expression of parasite genes may precede the resultant altered phenotype [parasitoids produce chemicals that manipulate insects to act as bodyguards after the wasp has emerged (e.g. Grosman et al., 2008; Maure et al., 2013)]. In spite of these complexities, the task of understanding the genetic basis of an extended phenotype is possible with the correct model system where the biological details are well known (Biron et al., 2005a; Lefèvre et al., 2009; Poulin, 2011; Adamo, 2012).

Most studies on parasite manipulation of host behavior have been descriptions of the phenomenon. Unusual behaviors observed in infected individuals are recorded, and if their complexity suggests that they benefit the transmission of parasite genes then the behavior is said to be an example of adaptive manipulation (Barnard and Behnke, 1990; Beckage, 1997; Moore, 2002). This approach is valuable but prone to criticism as adaptationist story-telling (Gould and Lewontin, 1979). Such charges prompted one of the major researchers in the field of parasite ecology to question the utility of the extended phenotype paradigm (Poulin, 2000). In response, over 30 authors argued in a special issue of Behavioural Processes (2005, Vol. 68, Issue 3) that, despite problems with adaptationist reasoning, encouragement could be taken from the new studies looking at the mechanisms by which parasites control behavior (Hughes, 2005; Moore et al., 2005; Thomas et al., 2005; Adamo, 2012). This mechanistic approach has already been successful, demonstrating neurological and hormonal changes in infected hosts (Lefèvre et al., 2009; Adamo, 2012). One of the most promising approaches has been an inference of the genetic basis of parasite control via proteomics (Biron et al., 2005a; Biron et al., 2005b; Biron and Loxdale, 2013). Here, proteome profiles of hairworms (Nematomorpha, also called Gordian worms) causing crickets to jump into water (so that the worms can exit for mating) revealed a molecular cross-talk between the parasite inside the cricket’s abdomen and the brain of the cricket. Specifically, the worms caused an upregulation of cricket Wnt proteins in the brain (Biron et al., 2005b). These advances led to an invited multi-authored review on the mechanistic advances and a call for more detailed studies, including whole-genome analysis (Lefèvre et al., 2009).
The field of parasite manipulation has therefore moved beyond its important natural history phase towards a more empirical approach: a recent edited volume for Oxford University Press records these interesting developments and the history leading to this point (Hughes et al., 2012).

Although the proteomic basis is important (and the metabolomic basis too), the ultimate goal in studying the extended phenotypes of parasites is to determine the genetic basis. Recently, Hoover et al. (Hoover et al., 2011) were able to demonstrate that a single baculovirus gene (egt) is responsible for an altered behavior, in that case the well-known summit disease observable in virus-infected caterpillars (discussed below). A previous study highlighted the role of a single gene (Cory et al., 2004), but the Hoover et al. study is important for the broader field of parasite manipulation as it points to ways in which experimental studies (gene knockouts and restoration) will increasingly become part of our toolkit (Fig. 1). Such approaches raise a number of questions for researchers interested in the genetic basis of parasite extended phenotypes that I will discuss in this Review. My aim is to ask what different approaches can be taken to better elucidate the genetic evidence for behavioral change. First I will recap the concept of the extended phenotype.

The extended phenotype

The paradigm of the gene as the unit of selection emerged during a period of much debate between advocates of individual- and group-level selection and through the work of Bill Hamilton (Hamilton, 1963; Hamilton, 1964). This debate is still on-going and occasionally rancorous (Hughes, 2011). Hamilton’s concepts were subsequently made more transparent by Richard Dawkins in his selfish gene approach (Dawkins, 1976) and became the foundation for sociobiological theory (Wilson, 1975). What this paradigm states is that it is genes alone that are transferred between generations; the organisms in which genes reside and their phenotypes are the means by which transmission is secured. Organisms are vehicles and genes are replicators. Natural selection chooses among variation in phenotypes, but the information encoding these phenotypes and, ultimately, the unit that is selected is the gene [see discussion by Mayr (Mayr, 1997)].

The phenotype has principally been considered as a trait of the individual organism. Examples are eye or flower color, antler length, butterfly wing spots, behavior or chemical signals released into the air, to name just a few. But such foci only reflect the convenience with which we could study those easily visible attributes of organisms (Dawkins, 1990). Rapid and continued technological advances allow us to look at phenotypes all the way from the surface of the organism down to the levels of transcription and protein folding. Dawkins (Dawkins, 1982) also advocated an additional level of the phenotype, but what was and still remains novel is that this additional level of phenotype is not physically attached to the organisms whose genes are encoding it; this is the extended phenotype.

Fig. 1.

Schema showing the different pathways that can be used to understand parasite manipulation of host behavior. The banner shows the category into which each approach fits and the dotted lines show current boundaries between fields employing such approaches.

The first of the three extended phenotypes to be considered was animal architecture. The classic example is the beaver dam, which is a physical representation of beaver behavior that increases the fitness of the genes encoding the building behavior. The second extended phenotype is parasite manipulation of host behavior. That is the topic of this special issue and has been reviewed comprehensively by Janice Moore (Moore, 2002) and recently by Hughes et al. (Hughes et al., 2012). As mentioned above, a classic exemplar of this field is the suicidal behavior of crickets infected by hairworms, whereby they jump into water so the adult worm can impressively exit from the thrashing body of its drowning host (Thomas et al., 2002). This behavior is controlled by parasite, and not host, genes (Biron et al., 2006). The third and final extended phenotype is action at a distance. An example is the manipulation of host behavior by cuckoo chicks. In this case the chick is not physically associated with the host, as in the case of hairworms, but nonetheless influences the expression of its behavioral phenotype.

The choice of systems

Before discussing the different approaches to examining the genetic basis of extended phenotypes it is necessary to discuss the merits of one system over another. There is no ideal system with which to work but rather a diversity of systems that come with both advantages and disadvantages. I have worked on strepsipteran parasites manipulating wasp and ant behavior (Hughes et al., 2003; Hughes et al., 2004) and had the good fortune to work collaboratively on the baculovirus manipulation of caterpillars (Hoover et al., 2011) and hairworm manipulation of crickets (Ponton et al., 2006), and in my laboratory we are beginning to work on nematode manipulation of ants (D.H., unpublished). A few years ago I began working on fungal manipulation of ant behavior due to the inherent appeal of the system as I became interested in elucidating the genetic basis of manipulation, specifically the tractable, but nonetheless highly complex, manipulation of ant behavior by a fungal parasite, Ophiocordyceps unilateralis s.l. In the manipulative event, worker ants are fully controlled by the fungal cells within them: they leave the nest and, depending on whether the nest is on the ground or in the forest canopy, infected ants ascend or descend to understory plants where they bite onto vegetation before dying. This ‘death grip’ functions to hold the ant in place while the fungus kills it with chemicals (Hughes et al., 2011). The grip is sufficiently tight to prevent ants falling from the post-mortem perch. Between 2 and 3 days following host death the fungus grows mycelia from the ant’s feet that stitch the ant to the plant (Andersen et al., 2009). A large stalk (clava) then grows from the head. An ascoma grows on one side of this stalk (hence the species epithet, unilateralis) and ascospores are shot out to infect new ants. Spores attach to foraging workers and grow through the cuticle using enzymes, and the cycle continues (Roy et al., 2006).

The behavioral manipulation is adaptive to the fungus, which cannot grow or transmit inside the colony. It is not an altruistic act by the dying ant that serves to reduce infection to kin, because the death of the ant outside the colony leads to spore transmission to kin when they forage (Andersen et al., 2009). Our ecological work in rainforests in Thailand, Brazil, China and Australia and temperate woods in South Carolina has shown that ant cadavers can occur in high-density ‘graveyards’ with up to 26 in a single square meter (Pontoppidan et al., 2009) (D.H. et al., unpublished data).

The manipulative event differs geographically: in tropical rainforests it occurs mostly on the veins of major leaves while in temperate systems dying ants bite onto twigs of plants (Mains, 1958; Evans, 1982). The height, orientation and timing of the death grip can all be precisely controlled by the fungus. Recently we showed that worker ants in tropical forests were manipulated to bite the sub-axial veins of leaves that had a mean (±s.e.m.) height of 25±2 cm from the ground, facing N–NW (Andersen et al., 2009). Infected ants also bite synchronously at solar noon (Hughes et al., 2011). We were able to partially elucidate the mechanisms that involved targeted atrophy of ant mandible muscles.

At least in my research it has become apparent that some systems are better suited than others when the goal is to understand the proximate level. I chose a fungal parasite explicitly because fungi have small genomes relative to other eukaryotes and many genomes are available for a comparative approach; genome assemblies for over 100 different fungal species are available in GenBank, and the community is working towards a thousand fungal genomes (http://1000.fungalgenomes.org). Further, the medical, societal and agricultural importance of fungi ensures that techniques for their isolation, culturing and laboratory study are very far advanced compared with all other taxa of parasites that are known behavioral manipulators. Also, useful extensive phylogenetic work has been performed on fungi parasitic on arthropods (Spatafora et al., 2007; Sung et al., 2007b; Sung et al., 2007a). Finally, fungi are amenable to transformation, which will eventually allow for forward genetic approaches.

Comparing -omes

Here I contrast approaches that have become possible due to the revolution in three key areas. The first is next-generation sequencing, which has led to developments in genomics, transcriptomics and our understanding of non-coding RNA. The second is mass sorting approaches to study small molecules, which have led to a greater understanding of the proteome, metabolome, peptidome and lipidome. And the final revolution has been the development of better conceptual frameworks and computational tools leading to an increased understanding of the epigenome, as well as the ability to carry out the developments mentioned above. In this section I will review some of these approaches and highlight their attraction as well as pitfalls. Naturally, with such a fast-paced field of research this cannot be an exhaustive review, nor can it be a how-to manual. There are continually new and improved statistical approaches being developed that integrate diverse data sets (Ament et al., 2012). The aim is simply to highlight the promise of discovery with the important caveat that many technical and computational issues remain to be resolved.

Comparative genomics

An obvious first approach for a researcher interested in the genetic basis of manipulated behaviors would be to sequence the genomes of parasites that manipulate and do not manipulate their hosts and ask what is different between the two. This has not yet been done for any parasite known to affect behavior but it has intrinsic appeal because such an approach is commonly used to understand the evolution of parasite adaptations and virulence. For example, a platform of 36 whole genomes was successfully used to identify pathogenicity factors in fungi infecting plants (Soanes et al., 2008). In that case the organisms were often very widely dispersed across the fungal kingdom. One can also compare more widely, as in the case of parasitic fungi and the pseudo-fungi Phytophthora, leading to insights that virulence factors in one were horizontally inherited across kingdoms (Richards et al., 2011). But by far the greatest power comes when closely related species are compared, which reduces the differences due to separate evolutionary histories. Again, staying with fungal parasites as examples, the comparative genomic approach has been employed very successfully. A comparative survey of eight Candida genomes revealed a suite of virulence factors that would not have been obvious comparing across higher taxonomic levels (Butler et al., 2009). The approach is not limited to fungi of course, and comparative genomics has been very useful in other groups such as bacteria, where it has been used to identify key changes such as antibiotic resistance mechanisms (Palmer and Gilmore, 2010; Palmer et al., 2010).

The next step for our community then is generating whole genomes of organisms known to manipulate behavior and comparing these against parasites that do not manipulate. Currently, my laboratory is sequencing two genomes of fungi in the genus Ophiocordyceps that manipulate behavior, with the view to compare these against other parasite genomes already released (Gao et al., 2011; Zheng et al., 2011). A cautionary note is needed here: although sequencing is extremely cheap nowadays, it remains a major challenge to successfully annotate genomes and then build a platform for comparative analysis. Crucial to our success in the comparative genomics of adaptive manipulation will be an integrative approach.
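As a rough illustration of the comparison described above, the following sketch assumes that each annotated genome has already been reduced to a set of gene-family identifiers. The species labels, family names and the function candidate_families are all hypothetical; a real analysis would need orthology inference and careful annotation first.

```python
def candidate_families(manipulators, non_manipulators):
    """Return gene families present in every manipulator and absent from every non-manipulator."""
    # Families shared by all manipulating parasites
    shared = set.intersection(*manipulators.values())
    # Families seen in at least one non-manipulating relative
    background = set.union(*non_manipulators.values()) if non_manipulators else set()
    return shared - background


# Hypothetical toy data: genome label -> set of gene-family identifiers
manipulators = {
    "manipulator_A": {"fam1", "fam2", "fam3", "fam7"},
    "manipulator_B": {"fam1", "fam3", "fam7", "fam9"},
}
non_manipulators = {
    "relative_C": {"fam1", "fam2", "fam4"},
    "relative_D": {"fam1", "fam5", "fam9"},
}

candidates = candidate_families(manipulators, non_manipulators)
print(sorted(candidates))
```

Such a presence/absence filter only yields candidates. It says nothing about expression or function, which is why the integrative approach mentioned above matters.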

Transcriptomics

Direct sequencing of RNA to generate expressed sequence tags aids in accurate genome annotation, by allowing fine-scale definition of gene structure and estimation of gene expression levels under different conditions. Even in species with well-characterized genes, gene prediction tools are imprecise, so evidence provided by expressed sequences is useful. One can generate cDNA libraries and strand-specific libraries if sufficient RNA can be prepared (Parkhomchuk et al., 2009; Yassour et al., 2009). RNA sequencing also provides a direct measure of the level of gene expression, as read counts can be used to estimate transcript abundance. Several methods exist to use RNA-seq data to identify differentially expressed genes. Two methods, edgeR (Robinson et al., 2010) and DESeq (Anders and Huber, 2010), assume read counts follow a negative binomial distribution; the estimation of variance differs between the two methods, where DESeq may better handle noise. A Cufflinks accessory program, cuffdiff, can also measure differential expression. All these methods use actual read counts; other measures such as the common RPKM (reads per kilobase of exon model per million mapped reads) are normalized based on mapped read numbers and the resulting ratios are less useful for statistical analysis.
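To make the distinction between raw counts and normalized ratios concrete, here is a minimal sketch of the RPKM calculation. The gene names, read counts, gene lengths and library size are invented for illustration, and the function rpkm is not taken from any of the packages named above.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    # Divide by gene length in kb and library size in millions: 1e3 * 1e6 = 1e9
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical library of 10 million mapped reads
TOTAL_MAPPED = 10_000_000

# Hypothetical genes: raw read count and exon length in base pairs
genes = {
    "long_gene": {"count": 500, "length_bp": 2000},
    "short_gene": {"count": 250, "length_bp": 1000},
}

rpkm_values = {
    name: rpkm(g["count"], g["length_bp"], TOTAL_MAPPED) for name, g in genes.items()
}
for name, value in rpkm_values.items():
    print(name, "raw count:", genes[name]["count"], "RPKM:", value)
```

Both toy genes come out at an RPKM of 25, yet one is supported by twice as many reads as the other. The ratio discards that information, which is one reason count-based tests work from the raw read counts.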

There are significant issues with transcriptomics concerning statistical analysis, bias and the correct use of technical versus biological replication (Subramaniam and Hsiao, 2012). Variation in transcript length between different genes can bias estimates of differential expression (Oshlack and Wakefield, 2009); for two genes that show the same fold change, the longer gene will be more statistically significant as it is supported by more reads. Normalizing read counts between samples can reduce the gene length bias, for example (Robinson and Oshlack, 2010). Currently, significant attention is being focused on resolving these issues, which will be of use to those of us interested in the genetic basis of manipulation. Ultimately, however, the greatest advances will come where transcriptomics is employed for systems where host and parasite genomes are available.

Proteomics

The proteome (and metabolome) are downstream from both the genome and the transcriptome (Fig. 1). All four offer different viewpoints of the interaction between host and parasite. The proteome is an immediate view to the function of the genome – i.e. what biologically active proteins (e.g. enzymes) are being produced. Because of the proteome’s utility it was this level that was first examined by researchers wanting to understand the chemicals parasites produce when controlling behavior. Leading this research was the expert proteomic analyst, David Biron, who successfully and effectively opened some very impressive avenues of research (Biron et al., 2005a; Biron and Loxdale, 2013). A fuller account of the approach to proteomics, the limitations and the promises can be gained from David Biron’s paper in this issue.

Metabolomics

Metabolomics is largely a variant of proteomics, though not all metabolites are proteins (see Biron and Loxdale, 2012). Global metabolomics and targeted metabolite profiling of known compounds will be important for understanding how parasites control behavior, and the approach is complementary to proteomics. At the required time points, extracts are profiled using time-of-flight and mass spectrometry. Accurate mass measurements and structural elucidation are then possible using tandem mass spectrometry fragmentation patterns.

Additionally, targeted searches are possible. Consider again the manipulation of ant behavior by fungi. Behavioral manipulation and atrophy of ant muscles by fungi result in significantly reduced mitochondria and sarcoplasmic reticula inside muscle cells (Hughes et al., 2011). This distinct atrophy of ant mandible muscles suggests that partial denervation occurs whereby motor neuron firing is affected. Metabolomic studies of the group of fungi to which O. unilateralis s.l. belongs have already revealed the production of NMDA agonists (ketamine) that are known neuromodulators. Furthermore, the larger groups to which these fungi belong are the original source of the compound that led to the production of LSD (Isaka et al., 2005; Molnár et al., 2010). In this instance, one could profile known secreted metabolites, including secondary metabolites with polyketide/fatty acyl and amino acid-derived moieties and polyketide–nonribosomal peptide hybrids, using a triple-quadrupole platform (e.g. Waters XEVO TQS). For the global metabolomics study, important, discriminating metabolites can be identified using a suite of data visualization techniques (e.g. principal components analysis) that are useful for high-dimensional data sets (Patterson et al., 2009; Patterson et al., 2010).
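As a toy illustration of how principal components analysis can separate samples in a metabolite table, the sketch below finds the first principal component by power iteration in plain Python. The intensity values and the infected/uninfected grouping are invented, and the function first_principal_component is a hypothetical helper; real data sets would normally be scaled first and analysed with a dedicated statistics library.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 for a samples-by-metabolites table."""
    n, p = len(rows), len(rows[0])
    # Centre each metabolite (column) on its mean
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix between metabolites (p x p)
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # provided the starting vector is not orthogonal to it
    vec = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Project each centred sample onto the first component
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centered]
    return vec, scores


# Hypothetical intensities for three metabolites; rows 0-2 uninfected, rows 3-5 infected
samples = [
    [1.0, 2.0, 5.0],
    [1.2, 2.1, 4.8],
    [0.9, 1.9, 5.1],
    [3.0, 4.0, 5.0],
    [3.1, 4.2, 4.9],
    [2.9, 3.9, 5.2],
]

loadings, scores = first_principal_component(samples)
print("loadings:", [round(x, 3) for x in loadings])
print("scores:", [round(s, 3) for s in scores])
```

In this invented example the first two metabolites carry most of the loading and the two groups of samples fall on opposite sides of zero along the first component. Metabolites with large loadings are the discriminating candidates one would follow up.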


Character trait mapping onto parasite phylogenies

There has been a dramatic increase in the number of well-supported molecular phylogenies available. When one says ‘well supported’, what that implies is a phylogeny constructed with multiple genes (five to eight being a good rule of thumb), with very wide taxon sampling and with the requisite tree-building testing using complementary approaches such as neighbor joining and maximum parsimony. Arriving at a well-supported, robust tree is again an example of challenging bioinformatics when the sampling is broad. This is another example, like comparative genomics, where collaborations with others capable of handling large data sets are required. In all cases, a molecular phylogeny is a reconstruction of what the hypothesized evolutionary relationships are among different organisms. More recently there has been a move to increase the available information by phylogenomics (e.g. expressed sequence tags) and using other markers such as microRNAs (Campbell et al., 2011). All of these approaches are useful in that they provide a foundation for character trait mapping.

Once a robust phylogeny has been produced, then researchers interested in parasite traits such as the mode of exploitation and, in particular, whether exploitation involves manipulation can begin mapping the traits onto the phylogeny. This has been a very useful approach in reconstructing the evolutionary pathways to a wide range of traits in non-parasitic organisms (Avise, 2006). In parasites, the use of phylogenies to infer the pathways to behavioral changes is less common (Poulin, 1994b; Poulin, 2011).
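One simple way to map a discrete trait onto a tree is Fitch parsimony, sketched below for a rooted binary tree written as nested tuples. This is only one of several possible methods (likelihood-based reconstructions are an alternative) and is not prescribed by the studies cited here. The species labels, tree shape and trait states are hypothetical.

```python
def fitch(tree, tip_states):
    """Return (candidate states at this node, minimum number of state changes below it)."""
    if isinstance(tree, str):  # a tip: its observed state, no changes
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, tip_states)
    right_set, right_cost = fitch(right, tip_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here
        return common, left_cost + right_cost
    # Children disagree: keep both options and count one change
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical parasite phylogeny and trait states
tree = ((("sp_A", "sp_B"), "sp_C"), ("sp_D", "sp_E"))
tip_states = {
    "sp_A": "manipulator",
    "sp_B": "manipulator",
    "sp_C": "non-manipulator",
    "sp_D": "non-manipulator",
    "sp_E": "non-manipulator",
}

root_states, changes = fitch(tree, tip_states)
print(root_states, changes)
```

For this toy tree the most parsimonious reading is a non-manipulating ancestor with a single origin of manipulation on the branch leading to sp_A and sp_B.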

Character trait mapping onto host phylogenies

Behavioral changes are often adaptive to parasites and the product of natural selection. Poulin (Poulin, 1994b; Poulin, 1994a) also noted that a phylogeny is a useful tool (see Comparing -omes, Proteomics). Poulin argued that phylogenies can be derived from host or parasite genes and the trait (behavioral change) mapped on (Poulin, 1994b). As more phylogenetic data on both host and parasite taxa become available, we will begin to see character trait mapping onto host phylogenies compared with character trait mapping onto parasite phylogenies. In the former case the phylogeny would contain non-parasitic organisms that are either uninfected by the parasite group under study, infected but non-manipulated or infected and manipulated. Such phylogenies can be rightly considered extended phenotype phylogenies, which allows us to understand what aspects of the parasite’s environment (namely the physiology and ecology of the host) permit the evolution of manipulative strategies. Such an approach can even be conducted experimentally, as Moore and Gotelli did when they infected seven species of cockroach with the acanthocephalan Moniliformis moniliformis (Moore and Gotelli, 1996).

Experimental approaches

In the last section I mentioned some nice experiments by Moore and Gotelli (Moore and Gotelli, 1996) where intermediate hosts were infected by an acanthocephalan capable of controlling behavior in one species to determine whether it could control behavior in another. When genomes are available, then other experimental approaches are possible, such as reverse and forward genetics.

Reverse and forward genetics

We understand evolution by natural selection by examining variation. To understand the genetic basis of phenotypes, we can discover phenotypes of interest and then query the underlying genome to ask what genes cause the phenotypes. This is forward genetics and involves performing screens of the genome to identify variations in sequences that correlate with variations in the phenotype. Classical forward genetics involves breeding experiments, and the process of screening can also be accelerated by using mutagenic compounds that affect the genome and the phenotype. A more directed method is reverse genetics, which targets a gene of unknown function using a range of available tools such as insertional/random mutagenesis, RNAi silencing or interference using transgenes that overexpress their product.

A promising example of the experimental approach is the recent work by Hoover et al. (Hoover et al., 2011), who knocked out a gene in a baculovirus (Lymantria dispar nucleopolyhedrovirus) thought to play a role in the summit disease behavior of infected gypsy moth caterpillars (also called tree top disease). The behavioral control places caterpillars high on vegetation where, upon being killed by the virus, they can shed millions of virions onto leaves and continue transmission. The researchers hypothesized that expression of the baculovirus gene ecdysteroid uridine 5′-diphosphate (UDP)glucosyltransferase (egt) was responsible. The egt gene was silenced in two recombinants, using either the β-galactosidase gene (LacZ) or the human transferrin gene (htf). Elimination of egt led to the loss of the behavioral manipulation and, cleverly, a rescue of manipulation when the gene was reinserted (Hoover et al., 2011). Obviously not all examples of manipulation will turn out to be controlled by a single gene, but this experimental approach offers great potential for a wide range of other studies.

Conclusions

We have a fabulous opportunity to not only advance our own understanding of the co-evolutionary dynamics between animals and the parasites that manipulate them, but also contribute to evolutionary biology more generally. Researchers studying parasites and behavior are often experts in life histories, diversity and the very particular circumstances in which we see manipulative behaviors being expressed. We know what normal behavior is because our focus is on the abnormal that comes about as a consequence of infection. An understanding of behavior and the natural history of host–parasite interactions might not seem an advantage given the enormous advances in certain areas of biology driven by technological and computational advances. But I would argue, and strongly, that the massive focus on proximate-level analysis since the discovery of the double helix has affected our collective appreciation of complexity, such as that involving behavioral changes at the host–parasite boundary.

Those of us skilled in studying parasite-induced changes are therefore well placed to bore down into the proximate mechanisms now that reduced prices have democratized the recent technological advances. However, our partners need to be well chosen. As I entered the arena I discovered a great deal of hyperbole and it is no exaggeration to say that studies on transcriptomics (for example) are very much in their development phase, with many artifactual and un-repeatable results being published (see Subramaniam and Hsiao, 2012). It is getting better, but it remains a truth despite the progress. Parasite biologists should be part of the field aiming to understand the genetic basis of phenotypes because the biology involved is intriguing, but we should drive sampling, experimentation and analysis rather than becoming overwhelmed by the formidable bioinformatics. Approaching the problem with such a synergetic willingness is likely to yield the greatest results.

Acknowledgements

I am grateful to Joanne Webster and Shelly Adamo for inviting me to attend the special meeting that led to this publication and the editors at JEB for their willingness and engagement and in particular Michaela Handel for her deft handling of the logistics. I am grateful to David Biron, Charissa de Bekker and Frederic Thomas for stimulating chats on this topic.


  • Funding

    I am supported by start-up funds from Penn State University.

