The natural environment is inherently noisy with acoustic interferences. It is, therefore, beneficial for a species to modify its vocal production to effectively communicate in the presence of interfering noises. Non-human primates have been traditionally considered to possess limited voluntary vocal control, but little is known about their ability to modify vocal behavior when encountering interfering noises. Here we tested the ability of the common marmoset (Callithrix jacchus) to control the initiation of vocalizations and maintain vocal interactions between pairs in an acoustic environment in which the length and predictability (periodic or random aperiodic occurrences) of interfering noise bursts were varied. Despite the presence of interfering noise, the marmosets continued to engage in antiphonal calling behavior. Results showed that the overwhelming majority of calls were initiated during silence gaps even when the length of the silence gap following each noise burst was unpredictable. During the periodic noise conditions, as the length of the silence gap decreased, the latency between the end of noise burst and call onset decreased significantly. In contrast, when presented with aperiodic noise bursts, the marmosets chose to call predominantly during long (4 and 8 s) over short (2 s) silence gaps. In the 8 s periodic noise conditions, a marmoset pair either initiated both calls of an antiphonal exchange within the same silence gap or exchanged calls in two consecutive silence gaps. Our findings provide compelling evidence that common marmosets are capable of modifying their vocal production according to the dynamics of their acoustic environment during vocal communication.
The natural acoustic environment is cluttered with numerous biotic and abiotic sounds. Many taxa use vocal communication to maintain social contact, navigate, select mates and advertise threats. The ability to communicate with conspecifics in a noisy environment is an important social behavior in many animal species. Successful communication, however, is ultimately dependent upon a balance between an individual's impetus to communicate and the variety of constraints that limit its occurrence. The external environment, for example, may impede both the timing and structure of a signal (Marten et al., 1977; Waser and Brown, 1986; Egnor et al., 2007) whereas the animal itself may be restricted by limitations on its ability to control signal production (Cynx, 1990; Miller et al., 2003; Miller et al., 2009a). In vocal communication systems, a key limitation is the acoustic environment itself. Both biotic and abiotic sounds can interfere with the efficacy of the signal's structure (Waser and Brown, 1986). Effective communication, therefore, requires that animals modify their vocal behavior in order to maximize the integrity of vocal interactions. The exact nature of vocal modification likely varies across the taxonomic groups depending on its specific internal and external constraints (Cody and Brown, 1969; Littlejohn and Martin, 1969; Ficken et al., 1974; Wasserman, 1977; Wells, 1977; Gochfeld, 1978; Cade and Otte, 1982; Zelick and Narins, 1983; Grafe, 1996). For non-human primates, a crucial constraint may be the extent of their voluntary vocal control, a component of vocal behavior frequently considered to be impoverished across this taxonomic group (Egnor and Hauser, 2004).
Different species have evolved an array of mechanisms to counter acoustic noise (Brumm and Slabbekoorn, 2005). These mechanisms include timing calls to avoid noise, increasing call amplitude (the Lombard effect), increasing call length, increasing the repetition rate of calling and changing the call's acoustic frequency. The strategy of waiting for noise to end and timing the calls in the silence intervals has been well characterized in insects (Cade and Otte, 1982), frogs (Zelick and Narins, 1983; Zelick and Narins, 1985) and, to some extent, birds (Brumm, 2006). In non-human primates it has been reported in cotton-top tamarins, Saguinus oedipus (Egnor et al., 2007), but this study was limited to a single silence gap length and recordings from one animal at a time. Other types of vocal modifications have also been studied. For example, if the silent intervals in the acoustic background are short, or if the noise is continuous, animals might increase the call amplitude to counter the masking by noise (Lombard effect); this has been reported in birds (Cynx et al., 1998) as well as in New World primates (Brumm et al., 2004; Egnor and Hauser, 2006). Additionally, animals can increase the redundancy of their calls by increasing the call repetition rate or producing longer calls (Miller et al., 2000; Brumm et al., 2004). In terms of call duration, for instance, New World primates have been shown to be more flexible than birds (Brumm et al., 2009). Finally, bats have been reported to adjust their echolocation call frequency in order to avoid interference with neighboring conspecifics (Ulanovsky et al., 2004).
Mechanisms for modifying communicative behaviors occur along several time scales. For stable long-term effects, such as a species' acoustic habitat (Waser and Brown, 1986), selective forces may act on the structure of a signal to avoid acoustic interference. In forest-dwelling primates, for example, many species have evolved long-distance calls with characteristics that reduce the effects of degradation inherent to the habitat, such as low fundamental frequency and a redundant multi-pulsed structure (Waser, 1977; Waser and Brown, 1986). For unpredictable environmental perturbations, a different suite of controls is necessary, one that provides animals with sufficient plasticity to adjust their vocal behaviors in response to these events. Consider the following example. A group of monkeys traveling through the canopy comes into contact with a large group of highly vocal birds. Because the monkeys need to continue their vocal interactions to maintain group cohesion, they must develop a strategy for communicating effectively with each other in the current context. One strategy might be to only produce their own vocalizations during the intermittent periods of silence between the birds' calls. This would require an ability to monitor their environment and control parameters of their vocal production, such as the timing of call onset relative to feedback about events in the external environment.
Voluntary vocal control in non-human primates is widely believed to be limited, but this dogma may not be entirely accurate (Egnor and Hauser, 2004). Much of the evidence citing the lack of control over vocal production comes from ontogenetic vocal learning of signal structure, an area in which primates are not particularly adept (Egnor and Hauser, 2004). Control over the behaviors surrounding vocal production, however, appears to be far more sophisticated (Miller et al., 2009b). A few studies have suggested that non-human primates can modify certain elements of vocal behavior in response to change in social context or environmental perturbation (Mitani and Gros-Louis, 1998; Miller et al., 2003; Brumm et al., 2004; Egnor and Hauser, 2006; Egnor et al., 2007). However, the dynamics and limitations of the particular vocal parameter require further analysis via additional experiments.
The aim of the present study was to test whether a highly vocal non-human primate species – the common marmoset, Callithrix jacchus (Linnaeus 1758) – could modify its vocal behavior in acoustic environments with different temporal patterns of noise interference. Common marmosets have been shown to exhibit a rich vocal repertoire in both captivity (Epple, 1968; Pistorio et al., 2006) and their natural habitats (Bezerra and Souto, 2008). By systematically varying the predictability and periodicity of the noise patterns, we tested the marmosets' ability to control vocal production. Simultaneously recording from pairs of marmosets also enabled us to study their vocal behavior modifications with respect to the antiphonal calling latency. This would provide further insight into their ability to cooperatively deal with dynamic acoustic interference. Previous studies have focused on single subjects vocalizing in isolation whereas in this study two marmosets communicated using phee calls in the presence of interfering noise. Phee calls are single or multi-phrase contact calls. These are long and tonal calls with a single-phrase duration ranging from 0.5 to 2 s and a fundamental frequency varying between 6 and 10 kHz. Effectively navigating these experimental scenarios requires that marmosets balance their own intrinsic impetus to communicate with the constraints of the environment. Accomplishing this task would ultimately rest on their ability to ascertain the specific pattern of the interference signal and adjust the timing of their calls accordingly. The resultant vocal behaviors, therefore, provide crucial insight into the parametric range of the vocal control and sensory feedback mechanisms of a non-human primate.
MATERIALS AND METHODS
The subjects used in this study included eight adult common marmosets (four male–female pairs from four different social groups) housed in a captive colony at the Johns Hopkins University School of Medicine. All subjects were housed in social groups with their pair-bonded mates and up to two generations of offspring. The subjects were maintained on a diet consisting of a combination of monkey chow, fruit and yogurt, and had ad libitum access to water. Experiments were conducted over a period of two months between the hours of 08:00 and 18:00 h. All experimental procedures were approved by the Johns Hopkins University Animal Care and Use Committee and were in compliance with the guidelines of the National Institutes of Health.
General experiment setup
Two subjects were used in each experiment. Each subject was transported from the colony to the recording room in an opaque transport cage and placed in an experiment cage. The experimental (wire mesh) cages measured 60×30×30 cm. The experimental cages were separated by 3 m with opaque curtains positioned equidistant between them. The acoustic recordings were made within a sound-attenuating recording room. One loudspeaker (Cambridge Soundworks, M80, North Andover, MA, USA) was placed 0.5 m behind each cage and one directional microphone (Sennheiser, ME66, Old Lyme, CT, USA) was placed 0.5 m in front of each cage. A more detailed description of the experimental setup is provided in a previous publication (Miller and Wang, 2006). To control for variance in antiphonal calling due to the social relationship of the callers (Miller and Wang, 2006), the same two subjects comprising each pair were used for each experimental condition throughout the study. Specifically, we tested the pair-bonded cagemates (one male/one female) housed in the same colony cage in each experiment condition. Each experiment session lasted 30 min during which white noise pulse trains were played at 75 dB SPL (measured 1 m from each speaker). The same white noise pulses were played to the two subjects from two loudspeakers. Noise levels were measured using a Brüel & Kjær (Type 2250, Nærum, Denmark) sound level meter with a ½ inch prepolarized free field microphone (Type 4189). The levels were measured using the LAeq setting (equivalent continuous level).
We examined the individual vocalizations of the marmosets under two different controlled noise environments: periodic and aperiodic. The two experimental environments differed in the extent of the predictability of the noise–silence sequence. Prior to the subject's participation in the controlled noise environments, two 30 min baseline sessions were recorded with no noise broadcast. In each of the controlled noise environments, we presented subjects with a combination of white noise bursts and silent gaps of different duration: 2, 4 or 8 s. The durations were selected based on previous studies of marmoset communication and vocal behavior (Miller and Wang, 2006; Miller et al., 2009b; Miller et al., 2010). The periodic noises consisted of a periodic sequence of white noise bursts. Each noise burst was followed by a silence gap of the same duration (Fig. 1A). Three noise durations were tested: 2, 4 and 8 s. When visually isolated from conspecifics, the predominant vocalization produced by marmosets is their species-typical contact call known as a phee (Miller and Wang, 2006). The typical phee call consists of two pulses with a mean duration of 3.4 s (Miller et al., 2010). The 2 s silence gap is shorter than the mean phee call length (combined multi-phrase length), making it difficult to produce this call type within this time window. Although a normal phee could be produced during a 4 s silence gap, an antiphonal call (reply to an initiator's call) interaction between a pair of marmosets would not be possible. Antiphonal calling is a natural vocal behavior involving the reciprocal exchange of long-distance contact calls between conspecifics (Miller and Wang, 2006). Marmosets perceive phee calls produced within 1–9 s of their own phees to be antiphonal calls (Miller and Wang, 2006). 
Therefore, in order to complete an antiphonal call sequence in the 4 s of silence, the prospective antiphonal caller would have to produce that call in the silence gap immediately after the 4 s noise subsequent to the initiating call. Overlapping calls in an antiphonal sequence are extremely rare (Miller and Wang, 2006) and were not observed in the baseline sessions of this study. The 8 s silent gap presents an interesting problem for the marmosets, as an antiphonal call sequence could be accomplished using one of two strategies: either the two callers complete the sequence within the 8 s silent gap, or the antiphonal call would have to be produced immediately after the 8 s noise.
The aperiodic noises consisted of two test conditions: predictable and unpredictable. The predictable noises consisted of a random sequence of noise bursts with duration of 2, 4 or 8 s, such that each noise burst was followed by a silence gap of the same duration (Fig. 1B). In other words, the duration of the noise burst predicted the duration of the subsequent silence gap. The unpredictable noises consisted of a random sequence of noise bursts with duration of 2, 4 or 8 s, but the silent gap was never equal to the preceding noise pulse length (Fig. 1C). For instance, an 8 s noise pulse could be followed by a 2 or 4 s silent gap but not an 8 s silent gap. Additionally, in the aperiodic unpredictable case, the total duration of silence and noise was fixed in all the sessions to 778 s and 1012 s, respectively. In contrast to the predictable condition, the duration of the noise did not predict the duration of the subsequent silence gap. Each pair of subjects was tested with two sessions of each of the noise conditions (periodic 2, 4 and 8 s, aperiodic predictable and unpredictable). The order of the sessions and test conditions was randomized and counterbalanced across the pairs.
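The construction of the two aperiodic conditions can be illustrated with a short sketch. It is not the authors' MATLAB program: the function name `aperiodic_sequence`, the `seed` argument and the stopping rule are assumptions made for illustration, and the sketch does not enforce the fixed totals of 778 s of silence and 1012 s of noise used in the unpredictable sessions.

```python
import random

DURATIONS = (2, 4, 8)  # burst and gap durations in seconds, as stated in the text


def aperiodic_sequence(predictable, session_s=1800, seed=0):
    """Return a list of (noise_s, silence_s) pairs covering one session.

    Hypothetical helper: the stopping rule and seeding are illustrative
    assumptions, and the last pair may run past the end of the session.
    """
    rng = random.Random(seed)
    pairs, elapsed = [], 0
    while elapsed < session_s:
        noise = rng.choice(DURATIONS)
        if predictable:
            # Predictable: the gap has the same duration as the preceding burst.
            gap = noise
        else:
            # Unpredictable: the gap never equals the preceding burst duration.
            gap = rng.choice([d for d in DURATIONS if d != noise])
        pairs.append((noise, gap))
        elapsed += noise + gap
    return pairs
```

Calling `aperiodic_sequence(predictable=False)` yields pairs such as an 8 s burst followed by a 2 s or 4 s gap, but never an 8 s gap.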
A custom MATLAB program (MathWorks, Natick, MA, USA) controlled the broadcast of the noise pulse trains and simultaneously recorded the vocalizations produced by the pair of subjects. The program continuously recorded the audio data using a PC-based sound card sampling each of the two channels at 44,100 Hz. This MATLAB program also enabled online monitoring of the vocalizations. The recorded vocalizations were analyzed offline using a combination of custom MATLAB programs, Adobe Audition 3.0 (Adobe Systems, San Jose, CA, USA) and Raven 1.3 software (Cornell Lab of Ornithology, Ithaca, NY, USA). The audio recordings from the individual experiment sessions were de-noised using the ‘noise-reduction’ tool of Adobe Audition 3.0. To aid de-noising, the frozen white noise pulses were recorded before the start of every recording session. First, the frequency profile of the noise pulses was created; that profile was then used to de-noise the entire recorded session. We used a noise reduction factor of 60 dB and a 4096 point fast Fourier transform (FFT). To test the accuracy of the de-noising method, we replicated the experimental setup with 75 dB SPL white noise broadcast from one speaker and overlapping baseline calls played from another speaker. We ensured that the relative spacing of the two speakers and the call playback amplitude matched those of the experiment. After de-noising we were able to detect 99.07% of the phee call phrases (535/540). Only ∼2.4% of the phee call phrases had a call onset or offset error of more than 200 ms. Fig. 1D shows the amplitude and spectrogram of a soft phee call (mean power of 30 dB) before and after de-noising using Adobe Audition's noise reduction process. The three-phrase phee call is clearly detectable after de-noising the noise-overlapped waveform. Supplementary material Fig. S1A–C shows the spectrogram of six phee calls covering the observed average power range and the accuracy of the de-noising method. The vast majority (>92%) of the calls produced by eight marmosets during baseline sessions had average power greater than 30 dB SPL (supplementary material Fig. S1D). For the purpose of this experiment we focused on the onset time, offset time and duration of the marmoset vocalizations. The individual phee calls were first detected using a band-limited energy detector (Raven 1.3). This process detected most calls within the recording session. The recording session was then manually scanned for missed calls, and existing detections were corrected if errors in the selections were noticed. Spectrograms were viewed with a 1024 point FFT, 10 ms Hann window with 50% overlap and a frequency range of 22,050 Hz.
Statistical analyses of latencies and of the percentage of calls initiated under different noise conditions were initially conducted with repeated-measures ANOVA (IBM SPSS Statistics 19, Armonk, NY, USA). Repeated-measures ANOVA results were reported only if the data met the conditions of normality (Shapiro–Wilk test), equal variance (Levene's test) and sphericity (Mauchly's test). If the assumptions of normality and equal variance were not met, then non-parametric methods (Wilcoxon rank sum) were used. If the ANOVA showed significant differences in means, then the particular experimental parameters were further tested either with a paired t-test or repeated-measures ANOVA between all possible combinations of condition pairs. The significance level (α) is stated for each t-test and ANOVA. In the case of multiple comparisons, we used the Bonferroni correction where each individual paired comparison was tested with a Bonferroni-adjusted significance threshold of (α/n), where n is the number of comparisons.
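The Bonferroni-adjusted threshold (α/n) can be written out as a minimal sketch. The function names and the example p-values are hypothetical; the analyses reported here were run in SPSS, not with this code.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold alpha / n."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that falls below the adjusted threshold.

    Assumes every p-value in the list belongs to the same family of
    comparisons, so n is simply the length of the list.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]
```

For example, with three comparisons at α=0.05 the per-comparison threshold is roughly 0.0167, so made-up p-values of 0.001, 0.02 and 0.04 would give one significant result and two non-significant ones.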
Marmosets initiated most calls during silence gaps between interfering noise bursts
We recorded 3880 marmoset phee calls from eight marmosets under the baseline and different experimental conditions. One of the main observations was that the marmosets showed a tendency to vocalize during silent gaps between noise bursts when they engaged in antiphonal calling behavior. This indicated an ability to control the timing of vocal initiation. Table 1 shows the percentage of calls initiated during silence gaps between noise bursts for both periodic and aperiodic noise conditions. Overall, approximately 90% (mean ± s.d.=91.7±6.3%) of the phee calls were initiated during silence gaps between noise bursts. Repeated-measures ANOVA showed significant differences in the percentage of calls initiated in silence across the different noise conditions (F4,28=20.993, P<0.001; Mauchly's W=0.091, P=0.182). Under the periodic conditions, the percentage of calls initiated during silence gaps increased from 79% for the 2 s noise condition to 90% for the 4 s noise condition and finally to 97% for the 8 s noise condition (Table 1). This difference in the percentage of calls initiated in silence was statistically significant between the 2 s vs 8 s and 4 s vs 8 s cases as shown in Table 2. Under the aperiodic conditions, the marmosets initiated approximately 95% (predictable) and 96% (unpredictable) of their calls in silence, which is significantly higher than the 79% initiated during the periodic 2 s case (2 s vs predictable: t7=5.126, P<0.001; 2 s vs unpredictable: t7=5.429, P<0.001; Table 2). The higher percentage of calls initiated in silence during the aperiodic conditions compared with the periodic 2 s condition could be because, in the aperiodic case, the marmosets could avoid the 2 s silent gaps and choose to vocalize in the longer silence gaps instead. In the periodic 2 s case, however, this approach would not succeed.
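The percentage of calls initiated in silence can be computed as in the following sketch. It assumes that call onsets and noise bursts are available as times in seconds on a common clock; the function names and the example values are hypothetical and are not data from this study.

```python
def onset_in_noise(onset_s, noise_intervals):
    """True if the call onset falls inside any [start, end) noise burst."""
    return any(start <= onset_s < end for start, end in noise_intervals)


def percent_initiated_in_silence(call_onsets, noise_intervals):
    """Percentage of call onsets that fall outside every noise burst."""
    if not call_onsets:
        raise ValueError("no calls to analyze")
    n_silent = sum(not onset_in_noise(t, noise_intervals) for t in call_onsets)
    return 100.0 * n_silent / len(call_onsets)


# Illustrative periodic 2 s pattern: noise during 0-2 s, 4-6 s and 8-10 s.
example_noise = [(0, 2), (4, 6), (8, 10)]
example_onsets = [2.5, 3.0, 5.0, 6.5]  # made-up onsets; only 5.0 s is in noise
example_percent = percent_initiated_in_silence(example_onsets, example_noise)
```

Here three of the four made-up onsets fall in silence gaps, so `example_percent` is 75.0.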
In the subsequent analysis, we compared the percentage of calls initiated within a 2, 4 or 8 s silent gap across the periodic, predictable and unpredictable conditions. Fig. 2A shows that in the periodic case the marmosets were equally likely to call within any of the three silent gaps. Comparison of the percentage of calls made across the different periodic test conditions showed no significant difference between the 2, 4 and 8 s cases (F2,14=2.347, P=0.132; Mauchly's W=0.52, P=0.141). However, this was not the case in the aperiodic noise conditions. Repeated-measures ANOVA of the mean percentage of calls per session showed a significant difference between the 2, 4, and 8 s silent gaps (predictable: F2,14=55.501, P<0.001; Mauchly's W=0.948, P=0.851; unpredictable: F2,14=12.935, P=0.001; Mauchly's W=0.873, P=0.666) for the predictable and unpredictable cases as shown in Fig. 2B,C, respectively. Approximately 75% of the calls initiated in the aperiodic conditions across all subjects were initiated in either an 8 s or a 4 s silent gap. Furthermore, in the predictable condition the marmosets vocalized in the 8 s silent gap significantly more than in the unpredictable condition (t7=3.176, P<0.05). Hence, in a dynamic environment like the aperiodic case, they avoided the 2 s silent gap and used the 4 and 8 s silent gaps more frequently to initiate their calls. Furthermore, the marmosets used the longest silence gap (8 s) when the pattern of noise–silence was predictable. For this analysis we only considered the first call within any given silence gap.
Latency between noise offset and call onset varies depending on noise conditions
We analyzed the latency between the noise offset and the call onset (hereafter referred to as latency; see Fig. 1B for an illustration) for the phee calls made during the different noise conditions. This was a crucial parameter to gain insight into potential vocal behavior modifications that the marmosets used to avoid noise bursts when vocalizing. Fig. 3A shows the latency data from the five different noise conditions. The repeated-measures ANOVA showed significant differences in the latencies during the different noise conditions (F4,28=39.44, P<0.001; Mauchly's W=0.133, P=0.303). In the periodic conditions, the marmosets initiated their calls significantly sooner when noise gaps were shorter. The mean latencies for the periodic 2, 4 and 8 s conditions were 1.54, 2.6 and 4.78 s, respectively. The mean latencies in the two aperiodic conditions (predictable: 3.4 s; unpredictable: 3.18 s), however, were longer than the periodic 2 s condition, but shorter than the periodic 8 s condition (Fig. 3A). Interestingly, the mean latency in the aperiodic predictable case was between that of the periodic 4 s and periodic 8 s cases. Table 3 shows the results of an individual paired t-test between the latencies for the different noise conditions.
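The latency measure defined above can be sketched as follows, assuming a sorted list of noise offset times and a call that begins in the silence gap after the most recent offset. The function name and the example values are hypothetical.

```python
import bisect


def noise_offset_to_call_onset_latency(call_onset_s, noise_offsets_s):
    """Time from the most recent noise offset to the call onset.

    noise_offsets_s must be sorted in ascending order. Returns None when
    no noise burst has ended before the call. The sketch assumes the call
    starts in a silence gap; a call that starts during a later burst would
    be measured from the previous offset.
    """
    i = bisect.bisect_right(noise_offsets_s, call_onset_s)
    if i == 0:
        return None
    return call_onset_s - noise_offsets_s[i - 1]
```

With noise offsets at 2, 6 and 10 s, a made-up call onset at 7.5 s gives a latency of 1.5 s.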
In addition, we quantified the variation of the latency under the different noise conditions by computing the coefficient of variation [CV=(s.d. of latency)/(mean latency)] as shown in Fig. 3B. We tested the CV between different noise conditions using a paired t-test (Table 4). There were two basic trends in the data. First, the mean CV increased from 0.27 to 0.31 to 0.43 in the periodic 2, 4 and 8 s cases, respectively. Second, there was no significant difference in the CV between the periodic 8 s, predictable and unpredictable conditions. Fig. 3C,D presents further analysis of the latency data within the aperiodic predictable and aperiodic unpredictable conditions, respectively. Latency progressively increased in the aperiodic predictable conditions when the silent gap (same as preceding noise length) increased from 2 to 8 s (2 s: 1.34 s, 4 s: 2.57 s, 8 s: 3.98 s; Fig. 3C). The aperiodic unpredictable case presented the marmosets with the most challenging noise environment. To test whether the marmosets were in any way using the noise pulse length to determine the call onset we looked at the latencies with respect to the previous noise pulse length and found significant differences between them (F2,14=13.623, P<0.001; Mauchly's W=0.417, P=0.072). As shown in Fig. 3D, the mean latency did not significantly vary for calls initiated after a 2 s and 4 s long noise pulse (mean latency=4 s) but was significantly lower following an 8 s noise pulse (mean latency=2.3 s).
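The coefficient of variation defined above is a short computation. The sketch below assumes the sample standard deviation (n−1 denominator); the text does not state which estimator was used, and the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(latencies_s):
    """CV = (s.d. of latency) / (mean latency).

    Assumption: sample standard deviation (statistics.stdev), which needs
    at least two latencies.
    """
    mean_latency = statistics.mean(latencies_s)
    if mean_latency == 0:
        raise ValueError("mean latency is zero; CV is undefined")
    return statistics.stdev(latencies_s) / mean_latency
```

For made-up latencies of 1, 2 and 3 s, the mean is 2 s and the sample s.d. is 1 s, giving a CV of 0.5.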
These observations suggest that during the aperiodic noise conditions, the marmosets were trying to predict the duration of the silent gap from the duration of the preceding noise pulse. Recall that in the aperiodic unpredictable case the marmosets initiated over 96% (Table 1) of their calls in the silent gaps. This percentage of calls initiated in silence is much higher than would be expected if the calls were to occur in silence by chance (43.2% given that total silent period is 778 s within an 1800 s session). As plotted in Fig. 3D, inhibiting vocal production for approximately 4 s before initiating a call gives the subjects a lower probability of being interrupted by a noise pulse immediately following a 2 s and a 4 s noise. However, the latency reduced to 2.3 s following an 8 s noise, which was never followed by an 8 s silent gap.
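The chance level quoted above follows from the fraction of the session that is silent. The sketch assumes that chance means call onsets fall uniformly at random across the session; the function name is hypothetical.

```python
def chance_percent_in_silence(total_silence_s, session_s):
    """Expected percentage of call onsets in silence if onset times were
    uniformly distributed over the session (an illustrative assumption)."""
    if session_s <= 0:
        raise ValueError("session duration must be positive")
    return 100.0 * total_silence_s / session_s


# Values from the text: 778 s of silence within an 1800 s session.
chance_unpredictable = chance_percent_in_silence(778, 1800)
```

Rounded to one decimal place, `chance_unpredictable` is 43.2, which is well below the roughly 96% of calls observed in silence.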
Effects of interfering noises on antiphonal calling latency
We analyzed the effect of the noise conditions on the antiphonal calling interactions of the four marmoset pairs, focusing our analyses on the antiphonal call latency, i.e. the time delay between the end of caller 1's call and the beginning of caller 2's call. Fig. 4A shows antiphonal interactions between caller 1 and caller 2 within the same 8 s silence gap whereas Fig. 4B shows interactions across consecutive silent gaps. The former case (Fig. 4A) required greater accuracy in call timing in order for the marmosets to initiate antiphonal calls within the same silent gap and avoid the noise at the same time.
In Fig. 4C we plot the antiphonal call delays for all eight subjects during the periodic 8 s and the baseline conditions divided into non-overlapping 2 s bins. The periodic 8 s condition significantly altered the antiphonal call delay distribution. There was a significant increase in the percentage of antiphonal call exchange with delays less than 2 s and a reduction in the percentage of calls with delays in the range of 4–8 s (Table 5). Additionally, there was also a significant increase in the percent calls with delays in the range of 10–14 s. This implied that in the presence of noise the marmosets are either shortening the antiphonal delay to fit their calls within the same silent interval or exchanging calls with one noise pulse in between, effectively extending the latency period beyond what is considered to be typical for the species (Miller et al., 2009b; Miller and Wang, 2006).
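The binning of antiphonal delays into non-overlapping 2 s bins can be sketched as follows. The function name and the example delays are hypothetical, and the sketch assumes non-negative delays, so overlapping calls would need separate handling.

```python
from collections import Counter


def delay_bin_percentages(delays_s, bin_s=2.0):
    """Percentage of antiphonal delays in each non-overlapping bin.

    Returns a dict mapping (bin_start, bin_end) to a percentage. Empty
    bins are omitted. Assumes all delays are >= 0.
    """
    counts = Counter(int(d // bin_s) for d in delays_s)
    n = len(delays_s)
    return {
        (k * bin_s, (k + 1) * bin_s): 100.0 * c / n
        for k, c in sorted(counts.items())
    }
```

For made-up delays of 0.5, 1.5, 3.0 and 11.0 s, half fall in the 0–2 s bin and a quarter each in the 2–4 s and 10–12 s bins.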
Fig. 5 shows additional antiphonal calling group data from all eight marmosets. The mean ± s.e.m. of the antiphonal call delays in the noise conditions and baseline are plotted in Fig. 5A. In general, the antiphonal call delays were longer in the noise conditions than in the baseline condition. This increase was statistically significant in the periodic 4 s and 8 s, aperiodic predictable and unpredictable cases (F5,35=12.923, P<0.001; Mauchly's W=0.059, P=0.469). We also observed that the calling rate of the marmosets was significantly higher (F5,35=38.79, P<0.001; Mauchly's W=0.013, P=0.107) in the periodic 2 s, periodic 4 s, periodic 8 s and predictable conditions (>2.5 calls min–1) compared with the baseline condition (1.3 calls min–1; Fig. 5B). This suggested that, although the white noise broadcast is an acoustic interference, it did not result in an overall drop in the calling rate of the marmosets. Another consequence of the noise on the antiphonal delay was that approximately 10% of interactive calls by caller 1 and caller 2 overlapped in time (t7=3, P<0.05; Fig. 5C). It is important to note that overlapping calls were not observed in the baseline conditions in this study and were rarely observed in a previous study of antiphonal calling by marmosets (Miller and Wang, 2006). Fig. 5C also shows the percentage of antiphonal calls with latencies less than 10 s as well as the percentage of call exchanges with latencies greater than 10 s in the noise and baseline. It is important to note that the overall percent of antiphonal calls with latencies less than 10 s decreased in the noise condition compared with the baseline (t7=4.86, P<0.05), as one would expect given the lengths of noise pulses. Finally, a higher percentage of overlapped calls were primarily seen in the periodic 4 s and aperiodic noise conditions (Fig. 5D). 
Together, these effects suggest that introducing patterns of noise caused the animals to significantly alter their species-typical vocal behavior. Table 6 shows the normality and equal variance test results for data plotted in Fig. 5A–C.
Many animal species have been known to change their call acoustics in the presence of noise. Alternatively, they could change their vocal behavior to minimize the interruption by acoustic interference. Individuals able to modify elements of their vocal behavior in order to maintain the efficacy of communication would be at a significant advantage. Studies in frogs and birds have shown the ability of these animals to avoid overlapping their calls and song, respectively, with environmental sounds. Zelick and Narins (Zelick and Narins, 1985) showed that treefrogs Eleutherodactylus coqui were able to adjust the inter-call interval of advertisement calls relative to the playback periodic tone bursts as well as the pseudorandom tone sequence. Similarly, work in nightingales Luscinia megarhynchos (Brumm, 2006) demonstrated their ability to modify the song onset time, for example to avoid overlap with songs from other bird species. Previous work in non-human primates showed that single cotton-top tamarins avoided initiating vocalizations in the presence of noise (Egnor et al., 2007). Building on this earlier study, in the present study we systematically varied the periodicity and predictability of the noise interference patterns while a pair of marmosets engaged in active bouts of antiphonal calling. As shown in earlier work (Egnor et al., 2007), the marmosets in our study avoided initiating a vocalization while noise was being broadcast. Furthermore, we observed that they significantly shortened the latency of call onset as the silent gap shortened from 8 to 2 s in the periodic sessions.
There are, however, some interesting differences between the latencies observed in our study and those reported in the treefrog (Zelick and Narins, 1985) and the nightingale (Brumm, 2006). The chorusing treefrogs E. coqui produced their advertisement calls within 750 ms of silent gaps and the nightingales initiated their song approximately 880 ms after the end of heterospecific song. There are several possible reasons for this difference in the observed latency in the treefrog and nightingale as compared with the marmoset. First, the experimental settings in which the calls were recorded were vastly different, from the field recordings of the treefrogs (Zelick and Narins, 1985) to the heterospecific song playback in the nightingale (Brumm, 2006) to the antiphonal call exchange between two marmosets within a controlled acoustic environment (present study). As noted by Zelick and Narins (Zelick and Narins, 1985), they could not have accounted for the influence of the neighboring animals on the studied individual calling behavior. In the case of Brumm (Brumm, 2006), the calling behavior of nightingales (one at a time) was analyzed in response to song playback of other bird species. Second, the advertisement call of the treefrog and bird song are both closely linked to mate selection and establishing dominance within a group. The successful communication of these calls has a direct implication on the fitness of the individual. For instance, in the presence of hundreds of chorusing treefrogs it would be advantageous to initiate the advertisement call as soon as silence is detected. In contrast, pairs of marmosets exchange contact calls (phees) that are used to maintain group cohesion within a controlled acoustic environment. Third, the latency in our case is probably influenced by the particular strategy that the marmosets employed to deal with the ongoing noise patterns. 
In general, if the marmosets waited 2 s before initiating a call, then the chance of overlap with the noise would be significantly reduced. Fourth, the latency may reflect the time required to plan a phee call structure, which, unlike the stereotypic advertisement call and bird song, can be single- or multi-phrased with variable-length phrases (Miller et al., 2009b). Zelick and Narins (Zelick and Narins, 1985) also hypothesize that treefrog calling behavior is driven by an internal call oscillator with typical inter-call intervals of 2 to 3 s, whereas the inter-call intervals of marmoset phee calls are variable and not periodic in nature. The longer latency of marmoset phee call initiation could suggest a more elaborate neural control mechanism of vocal production to account for the greater variability in their call structure.
More complex behaviors were evident in the marmosets as well, particularly as the acoustic environment became less predictable. At least two general approaches for communicating in the presence of interfering noise were observed. During the periodic conditions, subjects were able to ascertain the reliability of the noise and silence gaps. Hence, marmosets decreased the latency of call onset as the silent gap was reduced from 8 s to 4 s to 2 s. For the less reliable aperiodic conditions, such an approach would be difficult. Here the marmosets essentially avoided the 2 s intervals, calling almost exclusively during the 4 s and 8 s silent intervals. Not surprisingly, the preference for the 8 s silent interval was notably higher in the aperiodic predictable case than in the unpredictable case. The aperiodic unpredictable case also presented the most challenging noise environment, where an 8 s noise pulse (N) is followed by either a 2 s or a 4 s silent gap (S) but not an 8 s silent gap, and so on (example sequence: 2N4S4N8S8N2S2N8S8N4S). The data shown in Fig. 3D suggest that the marmosets employed a clever strategy to deal with the pseudorandom sequence of noise and silence. Calling with a significantly shorter latency following an 8 s noise pulse (2.3±0.15 s) than following a 4 s or 2 s noise pulse gave the marmosets a high probability of avoiding overlap with the succeeding noise pulse. In contrast, a 4 s noise pulse is followed by either a 2 s or an 8 s silent gap. In this case, the latency (3.9±0.4 s) ensured that they avoided the short 2 s gaps and called in the longer 8 s gaps. Finally, a 2 s noise pulse is followed by either a 4 s or an 8 s silent gap and the latency (4±0.25 s) ensured avoidance of noise overlap at the call onset. There was no significant difference between the latencies following a 2 s and a 4 s noise pulse (Fig. 3D).
This could result from the fact that in both of these cases, waiting for 4 s before initiating a call would ensure no overlap with the noise. The results from the unpredictable condition support the interpretation that the marmosets in this study predicted the silent gap length from the preceding noise pulse. Their predictions were, however, less reliable in this condition. Nevertheless, their call latency data show that they did not call randomly even in the ‘partially’ unpredictable condition.
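The latency argument for the aperiodic unpredictable condition can be checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the study's analysis. It assumes the rule described above (a noise pulse of 2, 4 or 8 s is followed by a silent gap of one of the other two durations) and uses the mean latencies quoted in the text. The function names are hypothetical, and only call onset is considered, not call duration or the spread around the mean.

```python
# Illustrative sketch: the gap rule and mean latencies come from the text;
# all names are hypothetical and chosen for this example only.
DURATIONS_S = (2, 4, 8)

# Mean call-onset latency (s) after a noise pulse of the given length (s).
MEAN_LATENCY_S = {8: 2.3, 4: 3.9, 2: 4.0}


def possible_gaps(noise_s):
    """Silent-gap lengths that may follow a noise pulse of length noise_s."""
    return tuple(d for d in DURATIONS_S if d != noise_s)


def onset_in_silence(latency_s, gap_s):
    """True if a call starting latency_s after noise offset begins before the gap ends."""
    return latency_s < gap_s


def gaps_reached(noise_s):
    """For each possible gap, report whether the mean onset falls inside it."""
    latency = MEAN_LATENCY_S[noise_s]
    return {gap: onset_in_silence(latency, gap) for gap in possible_gaps(noise_s)}


if __name__ == "__main__":
    for noise_s in (8, 4, 2):
        print(f"{noise_s} s noise: {gaps_reached(noise_s)}")
```

Under these assumptions, the mean onset after each noise duration falls beyond the shorter of the two possible gaps and within the longer one, which is consistent with the interpretation that the animals waited out the shorter gap before calling.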
Perhaps the most significant observation in these experiments, however, is that the marmoset pairs maintained communication despite the noise. Interestingly, we found that the overall rates of calling increased during the noise conditions relative to baseline. This effect could result either from a general increase in arousal due to the presence of an interfering stimulus, or from the animals judging that the efficacy of signal transmission had decreased and increasing their effort to compensate. If the former were true, we would expect to see a decrease in volubility towards baseline levels as subjects gained more experience with the noise. As there was no evidence of such a trend, the more likely explanation is that subjects' natural impetus to communicate was disrupted by the presence of noise and that they compensated by increasing their frequency of vocal production.
Antiphonal calling is a naturally occurring vocal behavior in marmosets that is characterized by the reciprocal exchange of their phee calls when visually occluded from conspecifics (Miller and Wang, 2006). In order to maintain antiphonal calling exchanges, subjects needed to adapt the dynamics of this vocal behavior to the constraints of the environment. A clear example was demonstrated during the periodic 8 s condition. Here subjects completed an antiphonal call exchange either within a single silent gap or across a noise presentation. In both cases, an antiphonal call exchange occurred and conspecific communication continued. We found a significantly higher percentage of antiphonal call exchanges with short latencies (0–2 s) in the periodic 8 s case. This implies that the marmosets cooperated in order to complete an antiphonal exchange within the same silent interval. Additionally, the percentage of antiphonal exchanges with latencies of 2–10 s was lower than at baseline and lower than that found previously (Miller and Wang, 2006). We also found evidence that the experimental noise conditions impeded normal antiphonal calling behavior. Although overlapping calls were never observed during the baseline, and are rare overall (Miller and Wang, 2006), they were evident in all of the noise conditions and were most prevalent during the aperiodic experiments (Fig. 5D).
The question of vocal control in non-human primates has been controversial. Although the extent and range of control over signal structure appear limited (Egnor and Hauser, 2006; Egnor et al., 2006; Egnor et al., 2007), flexibility in vocal behavior may be more extensive. It would be a misrepresentation to classify members of this taxonomic group as entirely lacking voluntary control over vocal production. Each of the behavioral modifications employed here, for example, requires mechanisms for control over elements of communication, such as the timing, occurrence and frequency of vocal production, rather than over the structure of the signal itself. Moreover, these abilities are used jointly by conspecifics, such that two individuals combine their efforts of control in order to maintain communicative integrity during antiphonal calling. As evidenced by the longer antiphonal calling bouts between cagemates relative to non-cagemates, as measured by the number of consecutive antiphonal calls (Miller and Wang, 2006), this natural vocal behavior appears highly cooperative. To produce antiphonal exchanges within a single silent gap during the 8 s periodic condition, for example, callers not only controlled the timing of their calls but also cooperated to do so.
This study indicates that common marmosets possess a level of sophistication in control over vocal production that has not been previously reported in non-human primates. These highly vocal non-human primates were clearly able to adjust their vocal output in order to avoid acoustic interference and cooperatively maintain effective communication. This suggests that common marmosets are an excellent model for continued study of the behavioral and neural mechanisms underlying vocal control and auditory feedback. Future behavioral and neurophysiologic experiments will build on these results to elucidate the underpinnings of this complex aspect of primate vocal control and communication.
We wish to thank Jennifer Papac for assistance in running the experiments and Jenny Estes for help with animal care.
Supplementary material available online at http://jeb.biologists.org/lookup/suppl/doi:10.1242/jeb.056101/-/DC1
This research is supported by the National Institutes of Health [grant numbers DC005808 and DC008578 to X.W. and DC009007 to C.T.M.]. Deposited in PMC for release after 12 months.
© 2011.