Auditory feedback is critical for the development and maintenance of speech in humans. In contrast, studies of nonhuman primate vocal production generally report that subjects show little reliance on auditory input. We examined the extent to which cotton-top tamarin (Saguinus oedipus) vocal production is sensitive to perturbation of auditory feedback by manipulating the predictability of presentation of a 1 s burst of white noise during the production of the species-specific contact call, the combination long call (CLC). We used three experimental conditions: the Begin condition, in which white noise was presented only during the first half of a recording session, the End condition, in which white noise was presented only in the last half, and the Random condition, in which each call had a 50% probability of receiving white noise playback throughout the recording session, making the auditory feedback unpredictable. In addition we recorded calls before and after the experimental series (Baseline condition) to determine whether any changes induced by modification of auditory feedback persisted. Results showed that playback of white noise during the production of the CLC produced changes in the temporal structure of the CLC: calls were shorter and had fewer pulses, indicating that modification of auditory feedback can interrupt vocal production. In addition, calls that received modified feedback were louder and had longer inter-pulse intervals than those that did not, consistent with an adaptive response to the masking effect of white noise playback. The magnitude of this compensatory effect and the interruption rate were both sensitive to whether the feedback modification occurred at the beginning or end of the experimental session: early feedback produced less interruption and more compensation. 
Finally, when auditory feedback modification was unpredictable, adaptive changes were observed in both calls that received modified feedback and those that received normal feedback, suggesting that tamarins can generate an expectation of noise playback and increase vocal amplitude in anticipation of masking.
Vocal plasticity and the maintenance of stable vocal structure have been shown to depend on auditory feedback in humans and songbirds. Experimental manipulations that modify the auditory feedback of a vocal utterance, such as delaying the feedback (Cynx and von Rad, 2001; Lee, 1950; Leonardo and Konishi, 1999), masking the feedback (Leonardo, 2004; Rivers and Rastatter, 1985), replacing the feedback (Houde and Jordan, 1998; Villacorta et al., 2004) or eliminating it altogether (Konishi, 1965; Nordeen and Nordeen, 1992; Waldstein, 1990), all produce significant changes in the spectrotemporal structure of vocalizations in both humans and songbirds.
To date, there have been few direct experimental tests of the auditory-vocal feedback loop in nonhuman primates. A small number of experiments have reported that deafened animals can develop (Hammerschmidt et al., 2001; Winter et al., 1973) and maintain (Talmage-Riggs et al., 1972) species-typical vocal signals, suggesting that auditory feedback is not critical for the structure of nonhuman primate vocal signals. On the other hand, nonhuman primates have been shown to modulate vocal amplitude in response to changes in background noise amplitude. This phenomenon, known as the Lombard effect, has been observed in macaques (Sinnott et al., 1975), common marmosets (Brumm et al., 2004) and cotton-top tamarins (Egnor and Hauser, in press) and is consistent with some contribution of auditory feedback to vocal control. Additional changes have also been induced in cotton-top tamarin vocalizations using an interruption paradigm (Miller et al., 2003), in which auditory or visual stimuli are presented during vocal production. This method has previously been used to demonstrate sensitivity of the vocal system to auditory feedback in both songbirds (Cynx, 1990; Heymann and Bergmann, 1988; Hultsch and Todt, 1982) and non-songbirds (ten Cate and Ballintijn, 1996). The logic of this method is that if animals have some degree of vocal control, then when a competing auditory event is detected the caller should either arrest call production or modify vocal output to avoid acoustic interference. Miller and colleagues (Miller et al., 2003) showed that presentation of white noise bursts during the production of the cotton-top tamarin's combination long call (CLC; a multiple-pulse contact call, see Fig. 1A) caused a reduction in mean pulse number and an increase in the duration of the inter-pulse interval.
The current experiment was designed to address three issues raised by the initial observation of auditory-feedback-dependent alteration of CLC structure in cotton-top tamarins, as well as the more general capacity of vocal control in primates. (1) Are the changes in CLC structure restricted to the time domain (call duration and inter-pulse-interval duration) or are there also amplitude and frequency changes? (2) Do the changes induced by modification of auditory feedback persist after normal feedback is restored? (3) Does the predictability of the feedback modification control the number or degree of changes induced in the targeted call?
To examine the extent to which the production of the tamarin's CLC can be perturbed by real-time alteration of auditory feedback we built a computer-controlled stimulus presentation system. In contrast to the manual delivery procedure used by Miller et al. (2003), the system employed in the following experiments automatically detected the production of a CLC and then delivered an auditory stimulus at a defined delay after call onset. Subjects received three different experimental conditions. In all cases the stimulus consisted of a 1 s white noise burst triggered by the subject's spontaneous production of a CLC. Because this playback interferes with the auditory feedback that subjects normally hear during vocal production, we refer to this as modified feedback. In the Begin condition, subjects received modified feedback for the first half of an experimental session, followed by normal feedback for the second half. In the End condition subjects experienced normal auditory feedback for the first half of the session, and modified feedback for the second half. This experimental design allows us to determine whether there are any persistent effects of feedback alteration. If there are no persistent effects of modifying feedback, then vocal behavior in the Begin- and End-modified feedback presentations should be identical, and similarly, vocal behavior in the two normal feedback presentations should be identical. Alternatively, significant differences between modified or normal feedback would indicate that recent auditory experience can modify vocal behavior. Finally, in the Random condition each detected CLC received, at random, either modified feedback (white noise) or normal feedback (no noise). This experiment allowed us to determine whether feedback consistency determines the degree of change to CLC structure.
Materials and methods
We tested four male (subjects DW, PJ, JM and SP) and four female (subjects JG, SH, RB and KW) adult cotton-top tamarins (Saguinus oedipus L.) from the Harvard University Cognitive Evolution Laboratory, ranging in age from 5 to 13 years. All subjects were born in captivity and socially housed, with separate home cages for each breeding pair and their offspring. Subjects were maintained on a diet of marmoset chow, sunflower seeds, peanuts, yogurt and fruit; small pieces of raisin or marshmallow were used to lure subjects out of their home cages and into the test chamber. Subjects had ad libitum access to water. All subjects were familiar with the experimental apparatus, and were involved in a concurrent experiment on the effect of male and female whistle playback during vocal production.
We recorded vocalizations from individual tamarins inside a double-walled sound-attenuating chamber (Industrial Acoustics, New York, New York, USA) using a directional microphone (ME-66, Sennheiser, Old Lyme, CT, USA). Recorded signals were amplified (1202-VLZPro, Mackie, Woodinville, WA, USA), and digitized (sampling rate: 24 kHz, precision: 16-bit). White noise playback was amplified (RA-100, Alesis, Cumberland, RI, USA) and presented over a speaker (10 cm mid-range, Radioshack, Cambridge, MA, USA). Data acquisition and sound presentation were controlled with custom-built software (MATLAB; The Mathworks, Natick, MA, USA) and an A/D, D/A board (RP2, Tucker-Davis Technologies, Alachua, FL, USA). Subjects were monitored with a video camera during the recording sessions.
General experimental design
Subjects were lured out of their home cage and into a transport box, and moved to the experimental chamber where they were lured out of the transport box and into the playback cage. The playback cage was 25 cm deep × 28 cm wide × 51 cm tall with a wire mesh front and smooth, opaque Plexiglas™ top, bottom and sides. The tamarins spent most of the time perched on the wire mesh and facing the microphone when vocalizing. An experimental session lasted 10 min and each subject experienced only one condition per session and only one session per day.
We collected spontaneously produced calls before the entire experimental series ('Initial baseline') and after ('Final baseline'). Initial baseline calls were the ten calls in our colony call database recorded closest to the beginning of the experimental series. Initial baseline calls were recorded an average of 2.9 (range: 2-5) months before the first day of testing across individuals. Final baseline calls were the first ten calls recorded after the experimental series for all subjects (average 1.3 months, range 1-2 months).
The stimulus presentation and data collection program monitored the input from the microphone. At the beginning of each session the speaker was calibrated to be flat (±2 dB) from 800 to 10 000 Hz (for details see Egnor and Hauser, in press). Threshold detection was used to detect the onset of a CLC. The thresholds were individually tailored to each subject to minimize feedback presentation in response to cage noise or chirps (the other common vocalization produced by an isolated cotton-top tamarin) while still detecting each CLC. When a CLC was detected, the feedback stimulus was presented and a 20 s record following the detection event was saved directly to a file. The experimenter then examined a spectrogram of the trial, and classified the trial as either a CLC or an error (cage noise or a chirp).
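The threshold-based onset detection described above can be illustrated with a minimal sketch. The text states only that per-subject thresholds were used, so the windowed RMS measure, the 10 ms window length and the function name `detect_onset` are assumptions made for illustration, not the procedure implemented in the original MATLAB software.

```python
import math


def detect_onset(samples, sample_rate, threshold, window_s=0.01):
    """Return the time (s) of the first window whose RMS exceeds `threshold`.

    Hypothetical helper: the windowed-RMS criterion and 10 ms window are
    illustrative assumptions; the source only says thresholds were tailored
    to each subject.
    """
    win = max(1, round(window_s * sample_rate))
    # Step through the signal in non-overlapping windows.
    for start in range(0, len(samples) - win + 1, win):
        frame = samples[start:start + win]
        rms = math.sqrt(sum(x * x for x in frame) / win)
        if rms > threshold:
            return start / sample_rate
    return None  # no window crossed the threshold


# Toy signal: 0.1 s of silence followed by 0.1 s of a loud square wave.
toy_rate = 1000
toy_signal = [0.0] * 100 + [0.5, -0.5] * 50
toy_onset = detect_onset(toy_signal, toy_rate, threshold=0.1)
```

A higher threshold reduces triggering on cage noise or chirps at the cost of later (or missed) detection of quiet CLCs, which is the trade-off the per-subject tailoring addresses.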
Begin and End condition
Stimuli were 10 independently generated 1 s long white noise bursts, presented in random order at an intended feedback delay of 0.5 s at 70 dB sound pressure level (SPL). In the Begin condition, subjects received noise playback (i.e. modified feedback) on every CLC detected in the first 5 min of the experimental session and no noise playback (i.e. normal auditory feedback) during the last 5 min of the session. In the End condition, subjects received no noise playback during the first 5 min of the session and noise playback on every CLC detected during the last 5 min of the session. Five of the eight subjects received Begin sessions in a block first, and then End sessions, and the remaining three received End sessions first and then Begin sessions. Subjects remained in a condition until they had produced at least 30 calls that received modified feedback and at least 30 calls that did not. Because spontaneous call rate varied from subject to subject this meant that each subject participated in a different number of sessions (Begin mean: 8, range: 4-13; End mean: 8.25, range 5-13). Differences in the amplitude envelopes of our subjects' CLCs caused the software program to trigger at different times during calling, and thus stimulus onset was not exactly 0.5 s. The measured modified feedback delay values for the Begin condition ranged from 0.55-1.53 s, with a mean value of 0.79±0.2 s. The measured modified feedback delays for the End condition ranged from 0.54-1.52 s, with a mean value of 0.81±0.2 s. The number of trials in each condition is given in Table 1.
In the Random condition, stimuli were eight independently generated 1 s long white noise bursts. Stimuli were presented at an amplitude of 70 dB SPL, with an intended delay of 0.5 s. The measured modified feedback delay in the Random condition ranged from 0.58-1.47 s, with a mean value of 0.82±0.21 s. Each time a CLC was detected during the session the data collection system randomly assigned either noise playback (modified auditory feedback) or no noise playback (normal auditory feedback) with a probability of 50%.
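The feedback schedule for the three conditions can be summarized in a short sketch. The 10 min session length and the 5 min split are taken from the text; the function name `gets_noise`, the string labels and the use of Python's `random` module are illustrative assumptions and do not describe the original software.

```python
import random

SESSION_S = 600.0        # 10 min session, as stated in the text
HALF_S = SESSION_S / 2   # 5 min boundary between session halves


def gets_noise(condition, t_call_s, rng=random):
    """Decide whether a CLC detected at `t_call_s` seconds receives noise.

    Hypothetical helper illustrating the Begin, End and Random schedules.
    """
    if condition == "Begin":
        return t_call_s < HALF_S       # noise only in the first half
    if condition == "End":
        return t_call_s >= HALF_S      # noise only in the last half
    if condition == "Random":
        return rng.random() < 0.5      # 50% chance on every call
    raise ValueError(f"unknown condition: {condition!r}")


# Example: a call 2 min into a session.
begin_example = gets_noise("Begin", 120.0)    # True
end_example = gets_noise("End", 120.0)        # False
```

Only in the Random condition is the outcome independent of time within the session, which is what makes the feedback unpredictable from the subject's point of view.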
The signal recorded on the microphone is the sum of the vocal response and the playback presented over the speaker (Fig. 1B). In order to accurately characterize the vocalization, it is critical to remove the playback signal. This procedure is described in detail in Egnor and Hauser (Egnor and Hauser, 2006). Briefly, we used Golay codes to measure the impulse response of the playback apparatus after each trial and used this impulse response and a copy of the signal sent to the speaker to generate an estimate of the playback signal on the microphone. This estimate was then subtracted from the raw microphone signal, leaving a clean copy of the tamarin's vocalization (Fig. 1C).
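The principle of the playback-removal step can be sketched as follows, assuming a linear, time-invariant playback path. A Golay complementary pair has autocorrelations that sum to a single spike, so cross-correlating the recorded responses with the codes recovers the impulse response; convolving that response with the signal sent to the speaker gives the playback estimate to subtract. All function names, the code order and the toy impulse response are illustrative assumptions; the procedure actually used is the one described in Egnor and Hauser (2006).

```python
def golay_pair(order):
    """Build a Golay complementary pair of length 2**order."""
    a, b = [1.0], [1.0]
    for _ in range(order):
        a, b = a + b, a + [-x for x in b]
    return a, b


def convolve(x, h):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out


def estimate_impulse_response(resp_a, resp_b, a, b, n_taps):
    """Recover the first `n_taps` of the impulse response.

    resp_a and resp_b are the recorded responses to codes a and b.
    The summed cross-correlations equal 2 * N * h[k] for a length-N pair.
    """
    n = len(a)
    h = []
    for k in range(n_taps):
        acc = sum(resp_a[i + k] * a[i] + resp_b[i + k] * b[i]
                  for i in range(n))
        h.append(acc / (2 * n))
    return h


def remove_playback(mic, sent, h):
    """Subtract the estimated playback (sent convolved with h) from mic."""
    est = convolve(sent, h)
    return [m - (est[i] if i < len(est) else 0.0)
            for i, m in enumerate(mic)]


# Toy demonstration with a made-up three-tap playback path.
true_h = [0.5, 0.25, -0.1]
code_a, code_b = golay_pair(4)
h_est = estimate_impulse_response(
    convolve(code_a, true_h), convolve(code_b, true_h), code_a, code_b, 3)
```

In this noise-free toy case the estimate matches the true response to within floating-point error; with real recordings the estimate would also contain measurement noise.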
Determining what constitutes an interrupted call is difficult, given that Baseline calls have a variable number of pulses and variable durations. Previous researchers (Miller et al., 2003) calculated the mean number of whistles in the absence of modified feedback, and defined as interrupted any calls that had fewer whistles than this mean. We observed, however, that even in the Baseline condition the calls of all of our subjects had a variable number of pulses (see Table 2). We therefore used the following approach: a call was defined as interrupted if it was more than two standard deviations shorter than the average Baseline call duration for that subject. The logic behind this was that in a normal distribution approximately 95% of values fall within two standard deviations of the mean. A call more than two standard deviations shorter than the mean is therefore unlikely to belong to the uninterrupted distribution: only about 2.5% of uninterrupted calls would be expected to be that short.
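The interruption criterion can be written as a short sketch. The rule itself (more than two standard deviations below the subject's mean Baseline duration) is from the text; the function names, the example durations and the choice of the sample rather than the population standard deviation are assumptions.

```python
import statistics


def interruption_cutoff(baseline_durations):
    """Duration below which a call counts as interrupted (mean - 2 SD).

    Assumption: the sample standard deviation is used; the text does not
    say which estimator was applied.
    """
    mean = statistics.mean(baseline_durations)
    sd = statistics.stdev(baseline_durations)
    return mean - 2 * sd


def is_interrupted(duration, baseline_durations):
    """Classify one call against a subject's own Baseline durations."""
    return duration < interruption_cutoff(baseline_durations)


# Made-up Baseline durations (s) for one hypothetical subject.
baseline = [2.0, 2.2, 1.8, 2.1, 1.9]
cutoff = interruption_cutoff(baseline)

# Fraction of a normal distribution lying below mean - 2 SD.
lower_tail = statistics.NormalDist().cdf(-2.0)
```

Because the cutoff is computed per subject, an individual with naturally short or variable calls is not over-classified as interrupted.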
After denoising, an automatic analysis program detected the beginning and end points of each pulse in each recorded CLC. These points were verified and, when necessary, corrected by the experimenters and then used to calculate the duration, fundamental frequency and amplitude of the pulses, the duration of the inter-pulse intervals (IPIs), and the total call amplitude and duration for each CLC. Recordings where movement artifact obscured the call were excluded from analysis. Stimulus delays were measured manually from the oscillogram of each call as the distance between the onset of the call and the onset of modified feedback. All statistical comparisons were made initially with repeated-measures multifactorial ANOVAs (SPSS Inc., Chicago, IL, USA), with Huynh-Feldt corrections for violations of sphericity, when necessary. Significant interaction effects and main effects with more than two levels were tested (Statistica, StatSoft, Tulsa, OK, USA) with either Tukey's honestly significant difference test if the assumptions of sphericity were met or Bonferroni's procedure if they were not (Hochberg and Tamhane, 1987; Maxwell, 1980).
Baseline data analysis
To determine whether there were any long-lasting effects of the experimental conditions, for each subject we compared 10 calls recorded before, and 10 calls recorded after, all three experimental conditions had been completed. Calls recorded before and after the experimental manipulation did not differ significantly in any of the measured parameters: call amplitude (F1,7=0.41, P=0.54), call duration (F1,7=3.03, P=0.13), inter-pulse interval (F1,7=1.10, P=0.33), pulse fundamental frequency (F1,7=0.83, P=0.39), pulse duration (F1,7=0.27, P=0.62), or number of pulses (F1,7=0.03, P=0.86; see Table 2). Because there were no significant differences in any of the measured variables, we pooled these data and all subsequent comparisons to Baseline data were made using this pooled data set.
Calls that received modified feedback were significantly shorter than those that did not (F2,14=40.19, P=0.0004; Fig. 2). There was no significant effect of experimental condition (F1,7=2.58, P=0.11), but there was a significant interaction between experimental condition and feedback type (F2,14=4.44, P=0.03). Post-hoc analyses revealed that End condition modified feedback calls were significantly shorter than Begin condition modified feedback calls (P=0.04). Random condition modified feedback calls were intermediate in duration between Begin and End and not significantly different from either (P=0.12 and P=0.98, respectively). None of the normal feedback calls were significantly different in duration (Begin versus End, P=1.0, Begin vs Random, P=0.10, End vs Random, P=0.16).
Number of pulses
Calls that received modified feedback had significantly fewer pulses than calls that did not (F1,7=35.9, P=0.001). There was a significant effect of experimental condition on average pulse number (F2,14=4.92, P=0.02). Post hoc analysis showed that calls in the Begin condition had significantly more pulses than calls in Random (P=0.03), whereas calls in End were not significantly different from either (Random: P=0.91 and Begin: P=0.06). There was no significant interaction between experimental condition and feedback type (F2,14=0.56, P=0.58).
Interruption rate as a function of stimulus condition
The mean proportion of interrupted calls was significantly different across stimulus conditions (F2,14=3.89, P=0.05; Fig. 3). Subjects interrupted their calls in response to feedback approximately equally in the Random and End conditions (35.8% vs 38.5%), and less in the Begin condition (20.1%). Post hoc tests revealed no significant differences between pairs of conditions, although there was a trend towards a significant difference between Begin and End (P=0.06).
Interruption rate as a function of time
For each subject and each experimental condition we separated the first half of the sessions ('early sessions') and the last half ('later sessions') and compared interruption rate. Tamarins interrupted their calls an average of 34.7% of the time in early sessions, and 29.0% of the time in later sessions (Fig. 3); this decrease was not significant (F1,7=3.48, P=0.10). We performed a similar analysis within sessions, grouping calls that occurred in the first half of the modified feedback interval together and those that occurred in the last half. Calls were also equally likely to be interrupted at the beginning and end of an experimental session across all three experimental conditions (F1,7=0.03, P=0.86; Fig. 4).
There are at least two possible responses when auditory feedback is modified: (1) calls can be interrupted and (2) call structure can be adjusted to compensate for the disruption in auditory feedback. To evaluate whether changes to call structure had occurred we compared calls that received modified feedback, but had not been interrupted, to calls that did not receive modified feedback in the experimental session (normal feedback calls) and also to calls that were produced spontaneously before and after the experimental session (Baseline calls). All modified feedback comparisons in this section are therefore values for uninterrupted modified feedback calls only.
Mean call amplitude was significantly higher for calls that received modified feedback than for calls that did not (F1,7=21.82, P=0.002). In addition, mean call amplitude was significantly different across the three experimental conditions (Begin, End and Random, F2,14=7.71, P=0.006; see Table 3). Post hoc analysis showed that calls in the Random condition were significantly louder than calls in both Begin (P=0.01) and End (P=0.01) conditions.
To examine the effect of experimental condition, and to determine whether vocal amplitude had increased uniformly throughout the call, or whether the amplitude increase was restricted to the portion of the call that had received modified feedback, we performed a more detailed analysis of vocal amplitude. Although the intended white noise playback delay was 0.5 s for all subjects and all experimental conditions, the exact playback times varied from subject to subject because of differences in call structure (as described above in the Materials and methods section). For each individual subject, for each modified feedback call, we calculated the exact time in the call at which the white noise playback occurred. Using these values we then calculated an average white noise playback time for each subject, for each experimental condition. This allowed us to divide each call into two segments: before playback and during playback. We then calculated average amplitude values for each segment. We analyzed Baseline calls in the same manner, using the average feedback time for all three experimental conditions for each individual. There was a significant effect of experimental condition (F6,42=3.3, P=0.02), a significant effect of time relative to feedback (F1,7=213.8, P=0.000002), and a significant interaction effect (F6,42=3.85, P=0.004; see Fig. 5). Post hoc analyses showed that there were no significant differences in call amplitude before noise playback in any condition. However, during noise playback, Begin modified feedback calls were significantly louder than both Begin normal feedback (P=0.0002) and Baseline (P=0.03) calls (see Fig. 5A). In addition, Random condition modified feedback and Random condition normal feedback calls were both significantly louder during feedback than Baseline (P=0.0003 and P=0.00001, respectively), but not significantly different from each other (P=1.0; Fig. 5C). 
Finally, there was no significant difference in call amplitude during playback between End modified feedback and normal feedback calls (P=1.0), or between either and Baseline (P=1.0; see Fig. 5B).
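The before/during split used in this amplitude analysis can be sketched as follows. The idea of dividing each call at the subject's average playback time is from the text; the RMS-in-decibels measure relative to an arbitrary reference, the function names and the toy signal are assumptions, and the values are not calibrated sound pressure levels.

```python
import math


def rms_db(samples, ref=1.0):
    """RMS level in dB relative to `ref` (arbitrary, uncalibrated)."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20 * math.log10(rms / ref)


def segment_amplitudes(call, sample_rate, playback_time_s):
    """Return (level before playback, level during playback) in dB.

    The call is split at `playback_time_s`, the average time at which
    noise began for that subject and condition.
    """
    split = round(playback_time_s * sample_rate)
    before, during = call[:split], call[split:]
    if not before or not during:
        raise ValueError("playback time must fall inside the call")
    return rms_db(before), rms_db(during)


# Toy call: amplitude doubles once playback starts at 1.0 s.
toy_call = [0.1] * 100 + [0.2] * 100
before_db, during_db = segment_amplitudes(toy_call, 100, 1.0)
```

In this toy call the doubling of amplitude appears as a rise of about 6 dB in the second segment; applying the same split time to Baseline calls provides the matching reference values.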
Inter-pulse intervals (IPIs) were significantly longer in calls that received modified feedback than in those that did not (F1,7=7.99, P=0.03; see Table 3). There was no effect of experimental condition (F2,14=2.46, P=0.12) and no interaction (F2,14=0.509, P=0.61). Average modified feedback IPI was 13% greater than in Baseline in the Random condition, 10% greater in the Begin condition and 8% greater in the End condition. Average normal feedback IPI values were 11% greater than in Baseline in the Random condition, but only 4% greater in Begin and 2% greater in End. However, there were no statistically significant differences in IPI between any of the experimental conditions and Baseline (F3.6,25.1=1.96, P=0.14).
Chirps and whistles, the two primary pulse types of the CLC, differ in duration. We therefore calculated duration values separately for each pulse type. Chirps were an average of 166 ms in Begin, 163 ms in End, 158 ms in Random, and 174 ms in Baseline. Whistles were an average of 690 ms in Begin, 696 ms in End, 820 ms in Random, and 640 ms in Baseline (see Table 4). There were no significant differences in chirp durations between any of the conditions (F6,42=0.95, P=0.47). However, whistle duration was significantly different across conditions (F2.4,16.6=3.79, P=0.04). Post hoc analysis showed that whistles in both the Random modified feedback and Random normal feedback conditions were significantly longer than in the Baseline condition (P=0.02 and P=0.04, respectively; see Table 4).
Pulse fundamental frequency
Because chirps and whistles also differ in frequency, we separated pulses into chirps and whistles and then calculated average fundamental frequencies for each experimental condition and for the Baseline condition (see Table 4). Whistles had an average fundamental frequency of 1966 Hz in the Begin condition, 1942 Hz in the End condition, 1880 Hz in the Random condition, and 1976 Hz in the Baseline condition, whereas chirps had an average fundamental frequency of 2658 Hz in the Begin condition, 2661 Hz in the End condition, 2080 Hz in the Random condition, and 3142 Hz in the Baseline condition. There was no difference in chirp fundamental frequency between any of the conditions, including Baseline (F1.7,11.9=2.82, P=0.11), nor was there any difference in whistle fundamental frequencies (F2.2,15.2=2.79, P=0.09).
Effect of time within session
One possible source of variation between Begin and End calls is that, by definition, Begin modified feedback calls occur in the first 5 min of the session, whereas End modified feedback calls occur in the last 5 min. If there are consistent changes either in the structure of the CLC or in the sensitivity of calls to feedback modification over the course of the recording session, then differences between Begin and End might be simply due to the fact that they occurred at different times within the session, rather than being the result of differences in feedback history. We tested this possibility in two ways. First, to see whether sensitivity to feedback modification varied over the course of the session we divided Random condition modified feedback calls into early calls (calls that received modified feedback in the first 5 min) and late calls (calls that received modified feedback in the last 5 min) and compared duration values. There was no significant difference in call duration between early and late calls (F1,7=1.5, P=0.26). Second, to see whether call amplitude changed consistently over the course of a recording session, we compared the first and last calls produced in each Baseline session. There was no significant difference in call amplitude between first and last Baseline calls (F1,7=2.03, P=0.20). This suggests that duration or amplitude differences observed between Begin and End are due to differences in feedback history, rather than being simply the result of time within the session.
The above experiment was designed to address the general gap in our understanding of nonhuman primate vocal control, and more specifically, three aspects of auditory-feedback-mediated vocal control in cotton-top tamarins. The first was a finer-grained description of the nature of vocal changes induced by perturbation of auditory feedback, the second an investigation into the time course of those changes, and the third an examination of whether the predictability of the auditory feedback perturbation controlled the extent of the changes induced. We found that playback of white noise during production of the CLC produced changes in the temporal structure of the CLC: calls were shorter and had fewer pulses, indicating that perturbation of auditory feedback can interrupt the production of a vocal signal. In addition, calls that received modified feedback were louder and had longer inter-pulse intervals than calls that received normal feedback, consistent with an adaptive response to the masking effect of white noise playback. The magnitude of this compensatory effect and the interruption rate were both sensitive to whether the feedback perturbation occurred at the beginning or end of the experimental session. Finally, when auditory feedback modification was unpredictable, adaptive changes were observed in both calls that received modified feedback and those that received normal feedback, suggesting that tamarins are generating an expectation of noise playback and increasing vocal amplitude in anticipation of masking.
Calls that received modified feedback were shorter than calls that received normal feedback and contained fewer pulses, confirming the observation (Miller et al., 2003) that cotton-top tamarin CLCs can be interrupted by an auditory stimulus. However, it was not the case that a call that received modified feedback was either interrupted or not. If that were the case, we would expect a bimodal distribution of call durations: short interrupted calls and long uninterrupted calls. Calls that received modified feedback had more variable durations (Fig. 2), but the distributions were not bimodal. That calls that received feedback were shorter demonstrates that altering auditory feedback altered vocal output. However, the stochastic nature of the modification does not fit a simple model of auditory feedback control over vocal production in which the presence of feedback perturbation either does or does not cause an immediate truncation of the call.
Adaptive responses to white noise feedback
Calls that received modified feedback tended to be louder than calls that did not. This is not completely unexpected, as increasing the amplitude of a vocal signal is a common mechanism for mitigating the masking effects of background noise in a variety of animals, both those that learn their vocalizations, and those that do not [humans (Lombard, 1911); zebra finches (Cynx et al., 1998); nightingales (Brumm and Todt, 2002); Japanese quail (Potash, 1972a); cats (Nonaka et al., 1997); Beluga whales (Scheifele et al., 2005); macaques (Macaca nemestrina and M. fascicularis) (Sinnott et al., 1975); common marmosets (Brumm et al., 2004) and cotton-top tamarins (Egnor and Hauser, 2006)]. In addition, calls that received modified feedback also had longer IPIs than those that did not, an increase also observed by Miller and colleagues (Miller et al., 2003). An increase in the duration of pauses between words has been shown to increase the intelligibility of speech (Picheny et al., 1986). This suggests that increasing inter-pulse intervals may be a way of increasing CLC intelligibility in the face of a masking stimulus. Finally, whistles from calls recorded in the Random condition were significantly longer than those recorded in the Baseline condition. In the case of a white noise masker, potential adaptive responses (i.e. vocal changes that would increase the effective signal-to-noise ratio) include an increase in vocal amplitude (Lombard, 1911), an increase in the duration of vocal elements (Brumm et al., 2004; Foote et al., 2004; Fricke, 1970; Van Summers et al., 1988), an increase in the number of vocal elements (Lengagne et al., 1999; Potash, 1972b) and an increase in the duration of pauses between vocal elements (Picheny et al., 1986).
In the present experiments we found evidence for increases in amplitude, pulse duration and inter-pulse-interval duration, suggesting that tamarins are capable of adaptive modification of vocal output in response to an interfering auditory stimulus.
Effect of history and predictability on interruption rate
Interruption rate was not constant across the three experiments. Calls in the End condition were more likely to be interrupted than those in Begin, and as a consequence, were shorter. The Begin and End stimulus conditions differed only in when the modified feedback occurred: in the Begin condition, subjects received modified feedback at the beginning of the session and normal feedback at the end, whereas the reverse was true for the End condition. If interruption rate increased as a function of time within the session, this alone might account for the increased interruption rate in the End condition. However, interruption rate did not vary significantly over the course of a session in any of the conditions (Fig. 3). This suggests that the difference in modified feedback call durations between Begin and End is due to the local difference in feedback history. There are two possible explanations for this observation: (1) the abrupt onset of noise playback in the middle of a session is more disruptive to vocal behavior or (2) playback that begins as soon as the subject is placed in the apparatus is less disruptive. The interruption rate in the Random condition, in which modified feedback occurred at unpredictable times throughout the session, was the same as in the End condition. This observation is consistent with the second interpretation, that the interruption rate is lower in the Begin condition because playback that begins as soon as the subject is placed in the apparatus is less disruptive to vocal behavior. Perhaps when a subject is moved from one environment to another (e.g. from the home room to the testing chamber), he evaluates the new location and generates some expectation about the new location, including the new acoustic environment. In the case of the Begin condition, modified feedback commences immediately and therefore would be included in the subject's expectation for the acoustic environment.
By contrast, in the End condition the subject's expectation will be for silence and the onset of white noise playback might, therefore, be more startling. In the Random condition, white noise playback occurs at unpredictable intervals, which might also be more disruptive than consistent playback that commences immediately. The current data are not sufficient to determine the exact effect of changes in local feedback history and predictability, but it is clear that they can both influence the interruption rate. Ongoing experiments in our laboratory are aimed at examining in more detail the effects of predictability on acoustically mediated vocal control.
Effect of history and predictability on amplitude compensation
The most unexpected result was the observation that in the Random condition, not only were calls that received modified feedback significantly louder than Baseline calls (as expected in an adaptive response to the masking white noise), but calls that did not receive modified feedback were also significantly louder than Baseline calls. This behavior was only observed in the Random condition. In the Begin and End conditions, normal feedback call amplitude was indistinguishable from that observed in Baseline. What might account for this difference? In the Random condition, noise playback occurred unpredictably. As a result, it was not possible for a subject to anticipate whether it would receive playback until after the call was initiated. In the Begin and End conditions, by contrast, noise occurred at predictable times. Subjects reduced call amplitude during normal feedback in the Begin and End conditions, showing that tamarins are able to detect when feedback modification is unlikely and respond appropriately. However, when feedback modification was unpredictable, in the Random condition, both modified and normal feedback calls were louder. This suggests that adaptive amplitude compensation is not necessarily instantaneous. That is, tamarins are not necessarily detecting noise during a call and then immediately increasing vocal amplitude in response. If this were the case, we would expect normal feedback calls in the Random condition to be the same amplitude as Baseline calls. Instead, our results suggest that in the Random condition tamarins are generating an expectation of noise playback and increasing vocal amplitude in anticipation of masking. An alternative to this interpretation is that the vocal control mechanism that compensates for an increase in the amplitude of environmental noise simply has a slow time constant.
Based on this account, an amplitude increase induced by modified auditory feedback persists for a short time, and therefore a subsequent normal feedback call would also be louder. A similar type of vocal compensation aftereffect has been observed in human modified feedback experiments in the frequency domain (Donath et al., 2002; Houde and Jordan, 1998; Jones and Munhall, 2000). However, the fact that only the portion of the CLC that received modified auditory feedback is louder argues against this simpler explanation. The fact that the amplitude increase is restricted to the portion of the call that received modified auditory feedback also argues against the difference being due to a simple increase in arousal in the Random condition.
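The slow-time-constant account can be made concrete with a toy model. The sketch below is not the analysis used in this study; it assumes a hypothetical leaky integrator in which each call that receives noise is raised by a fixed amount and a fraction of that increase carries over to the next call. The function names and the parameters `boost_db`, `decay` and `p_noise` are illustrative assumptions, and the values are arbitrary rather than fitted to data.

```python
import random


def random_schedule(n_calls, p_noise=0.5, seed=0):
    """Hypothetical Random-condition schedule: each call gets noise with probability p_noise."""
    rng = random.Random(seed)
    return [rng.random() < p_noise for _ in range(n_calls)]


def simulate_carryover(schedule, boost_db=3.0, decay=0.5):
    """Per-call amplitude offsets (dB relative to Baseline) under the assumed leaky model.

    schedule: list of bools, True where the call received white noise.
    """
    offsets = []
    carry = 0.0  # residual increase inherited from earlier calls
    for noisy in schedule:
        # A noisy call is raised by boost_db; a normal call only inherits the residue.
        level = boost_db if noisy else carry
        offsets.append(level)
        carry = level * decay  # the residue shrinks by `decay` before the next call
    return offsets


def mean_normal_offset(schedule, offsets):
    """Mean offset over the calls that did NOT receive noise."""
    normal = [o for noisy, o in zip(schedule, offsets) if not noisy]
    return sum(normal) / len(normal) if normal else 0.0


if __name__ == "__main__":
    begin = [True] * 10 + [False] * 10   # noise in the first half only
    mixed = random_schedule(20)          # noise interleaved unpredictably
    for name, sched in (("Begin", begin), ("Random", mixed)):
        print(name, round(mean_normal_offset(sched, simulate_carryover(sched)), 3))
```

Under these assumptions, normal feedback calls that are interleaved with noisy calls inherit part of the increase, whereas normal feedback calls in a block that follows the noise decay back toward Baseline. The model operates at the level of whole calls and says nothing about amplitude within a call, which is the evidence used above to argue against this account.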
Because data collection for the Random condition was completed in all subjects before the Begin and End conditions, there is an additional potential explanation: the reduction in call amplitude for normal feedback calls observed in both the Begin and End conditions may be due to the subjects' prior experience with feedback modification during the Random condition, rather than the difference in feedback predictability. Alternatively, both possibilities may be correct: it may be the case that when feedback is uncertain, subjects are more likely to increase call amplitude in all calls (even those that do not receive modified feedback), and also that with experience subjects learn to restrict their adaptive response to only calls that actually receive modified feedback. We are currently following up on this result with experiments that vary both the degree of feedback predictability and the experimental history.
Effect of feedback history
Although modified feedback calls in both the Begin and End experimental conditions were louder than calls recorded during Baseline, they were only significantly louder in the Begin condition. There are several possible reasons for this observation. One possibility is that call amplitude drops over the course of a session; later calls are simply quieter. We believe this is unlikely because we found no difference in call amplitude in Baseline recordings between calls recorded at the beginning and end of the session. Another possibility is that because call rate declines over the course of an experimental session, there are fewer calls in the End modified feedback condition, and therefore fewer instances in which adaptation could be observed. In addition, End modified feedback calls were also much more likely to be interrupted than Begin modified feedback calls, further reducing the number of calls in which adaptation could be observed. A final possibility is that the local history of feedback influences both how much interruption occurs and whether or not the subject compensates for modified feedback.
Interruption rate relative to other studies
The previous interruption experiment in cotton-top tamarins (Miller et al., 2003) found interruption rates of 25-28% in response to a one second white noise burst. We found a range of interruption rates across the three experimental conditions, from 20% in the Begin condition, to 39% in the End condition, and 36% in the Random condition. The interruption rate measured by Miller et al. was therefore intermediate between the value we obtained in the Begin condition and those in the End and Random conditions. There are several differences between these two studies. In the Miller et al. study, (1) noise was presented manually, (2) noise was presented with 100% probability throughout the recording session, (3) interruption was defined based on the number of whistles and (4) some of the CLCs targeted with white noise were elicited by playback of conspecific CLCs. Despite these differences, the interruption rate is still relatively similar between experiments.
In experiments with birds using light flashes rather than noise bursts to interrupt vocal production, there was a large difference in interruption rate between birds that learn their vocalizations [zebra finches Taeniopygia guttata (Cynx, 1990); nightingales Luscinia megarhynchos (Riebel and Todt, 1997)] and those that do not [collared doves Streptopelia decaocto (ten Cate and Ballintijn, 1996)]. The interruption rate was 71% in zebra finches and 57% in nightingales, much higher than the 20% observed in collared doves.
Putting these comparative data together, tamarins are capable of interruption rates that are higher than those of the non-vocal learning doves, but lower than those of the vocal learning nightingales and zebra finches. Though there are significant methodological differences between these studies that should be resolved in future comparative analyses, we can derive two interim conclusions from these comparisons. First, though tamarins, like other nonhuman primates, appear much more closely aligned with the Sub-Oscine, non-vocal learners in that they lack the capacity for vocal imitation, their capacity for acoustically mediated vocal interruption is closer to the range of the vocal learners. Second, to establish the degree of vocal control in tamarins and other species, it will be important to assess how different types of feedback alter not only the rates of interruption, but also the form of vocal modification in the presence of feedback. Under some conditions, animals may interrupt at high rates, and in other conditions they may continue to call but modify call structure in such a way that they maximize transmission in the face of environmental perturbations.
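The comparison across studies can be laid out as a simple ranking. The sketch below only collects the interruption rates quoted in the text and sorts them; the dictionary name, the helper `rank_by_midpoint` and the use of a range midpoint for the Miller et al. value are illustrative choices, not part of any of the cited analyses.

```python
# Interruption rates (%) as quoted in the text, stored as (low, high).
# Bird studies used light flashes; tamarin studies used white noise bursts.
RATES_PERCENT = {
    "collared dove (ten Cate and Ballintijn, 1996)": (20, 20),
    "tamarin, Begin (this study)": (20, 20),
    "tamarin (Miller et al., 2003)": (25, 28),
    "tamarin, Random (this study)": (36, 36),
    "tamarin, End (this study)": (39, 39),
    "nightingale (Riebel and Todt, 1997)": (57, 57),
    "zebra finch (Cynx, 1990)": (71, 71),
}


def rank_by_midpoint(rates):
    """Return entry names ordered by the midpoint of their (low, high) range, ties by name."""
    return sorted(rates, key=lambda name: (sum(rates[name]) / 2, name))


if __name__ == "__main__":
    for name in rank_by_midpoint(RATES_PERCENT):
        low, high = RATES_PERCENT[name]
        label = f"{low}%" if low == high else f"{low}-{high}%"
        print(f"{label:>7}  {name}")
```

Laid out this way, the Begin condition sits at the same value as the collared doves, while the End and Random conditions fall between the doves and the two vocal learning species. The ranking ignores the methodological differences between studies noted above.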
Stability of CLC structure over time
The fact that calls recorded before and after the experimental series were not significantly different in pulse number, pulse duration, call amplitude, fundamental frequency or inter-pulse-interval duration suggests two things: first, that in the absence of perturbation, call structure is stable over the course of a year; and second, that although feedback modification can change call structure in the short term, these changes are not permanent. The observation of call structure stability is consistent with a study in common marmosets (Callithrix jacchus) that showed that the spectrotemporal structure of the analogous contact call, the phee call, is stable over the course of a year (Jones et al., 1993). These results stand in contrast to changes in phee call structure observed within individuals by Jorgensen and French (Jorgensen and French, 1998) in another callitrichid, Wied's black-tufted marmoset (Callithrix kuhli). However, an important difference between the studies is that Jorgensen and French recorded contact calls in a natural social setting, whereas in both our study and that of Jones and colleagues, calls were recorded in isolation, which might minimize call structure modification due to changes in social context.
Accumulating evidence of call convergence (the convergence of the acoustic features of a call to a shared structure) within social groups suggests some degree of vocal plasticity in nonhuman primates (Fischer et al., 1998; Gouzoules and Gouzoules, 1990; Mitani et al., 1992) (reviewed by Egnor and Hauser, 2004). This conclusion is still controversial; many investigators argue that the observed convergence may be the result of shared motivational states, shared genetics, shared environment or the selection of a specific call from within an innately determined repertoire (Janik and Slater, 2000; Lieblich et al., 1980; Mitani et al., 1992).
If the call convergence observed in nonhuman primates is the result of auditory-feedback-dependent vocal plasticity, rather than some other mechanism, then there must be some means by which changes in auditory feedback produce changes in vocal structure. If this interpretation is correct, then nonhuman primate vocal production should be susceptible to perturbations in auditory feedback. Here, by selectively modifying the statistics of auditory feedback, we show that adult cotton-top tamarins modify call structure in the presence of acoustic perturbation. Based on the current results, we suggest that tamarins not only have more fine-grained control over vocal output than previously expected, but that they can use information about the nature of feedback, including its structure and predictability in time, to adaptively modify the structure of their own calls. These results set the stage for neurobiological studies aimed at understanding the nature of the feedback loop that connects acoustic perception with vocal production, both within and across species.
The authors would like to thank past and present members of the Hauser lab for their assistance on this project, in particular Jeff Stevens, Alison Shell, Matt Kamen, Jeanette Wickelgren, Jonathan Matus, Keena Seyfarth and Meredith Loth. We would like to thank Cory Miller, David Feinberg and Asif Ghazanfar for comments on earlier versions of this manuscript and Anthony Leonardo for kindly sharing his Golay code MATLAB scripts.
© The Company of Biologists Limited 2006