
Foraging Blainville's beaked whales (Mesoplodon densirostris) produce distinct click types matched to different phases of echolocation
M. Johnson, P. T. Madsen, W. M. X. Zimmer, N. Aguilar de Soto, P. L. Tyack


Blainville's beaked whales (Mesoplodon densirostris Blainville) echolocate for prey during deep foraging dives. Here we use acoustic tags to demonstrate that these whales, in contrast to other toothed whales studied, produce two distinct types of click sounds during different phases in biosonar-based foraging. Search clicks are emitted during foraging dives with inter-click intervals typically between 0.2 and 0.4 s. They have the distinctive form of an FM upsweep (modulation rate of about 110 kHz ms⁻¹) with a -10 dB bandwidth from 26 to 51 kHz and a pulse length of 270 μs, somewhat similar to chirp signals in bats and Cuvier's beaked whales (Ziphius cavirostris Cuvier), but quite different from clicks of other toothed whales studied. In comparison, the buzz clicks, produced in short bursts during the final stage of prey capture, are short (105 μs) transients with no FM structure and a -10 dB bandwidth from 25 to 80 kHz or higher. Buzz clicks have properties similar to clicks reported from large delphinids and hold the potential for higher temporal resolution than the FM clicks. It is suggested that the two click types are adapted to the separate problems of target detection and classification versus capture of low target strength prey in a cluttered acoustic environment.


Toothed whales and bats have both evolved biosonar systems to locate prey and acquire information about their surroundings (Griffin, 1958; Norris et al., 1961; Au, 1993). The performance of such systems is ultimately limited by ambient noise, volume reverberation and clutter from non-prey targets (Urick, 1983), and these factors have shaped both the properties of the transmitted signals and the processing performed by the auditory receiver. Echolocation signals used by bats and dolphins are predominantly ultrasonic (Au, 1993) as pulses with short wavelengths provide efficient backscatter from small targets (Medwin and Clay, 1998) and can be generated in directional sound beams to reduce clutter, reverberation and eavesdropping. Within this outline, the >800 species of echolocating bats have evolved a plethora of signals, adapted to the habitat, prey and size of the bat (Fenton, 1984; Fenton, 1995), which can be broadly classified as either frequency modulated (FM) or constant frequency (CF) (Simmons and Chen, 1989). While CF pulses are produced by some Doppler-sensitive species to detect moving targets (Schnitzler, 1973; Schnitzler et al., 2003), the evolutionary advantage of FM signals is debated. Frequency modulation may enhance Doppler tolerance (Kroszczynski, 1969; Altes and Titlebaum, 1970), increase bandwidth for target classification (Schmidt, 1992) or, in combination with a postulated matched filter in the receiver, increase ranging accuracy (Strother, 1961; Cahlander et al., 1964; Simmons, 1973). However, FM signals are highly variable among and within species (Fenton, 1995), and no single selection pressure is likely to have formed these signals in bats (Boonman and Schnitzler, 2005).
Plasticity in sound generation is linked both to habitat and to different phases of prey location and capture: many aerial hunting bats produce long duration signals while searching for prey but decrease the duration and increase the sweep and repetition rates when approaching a selected prey item (Kalko and Schnitzler, 1993; Kalko, 1995; Siemers et al., 2001; Schnitzler et al., 2003).

Reports on echolocating toothed whales suggest that, while these species produce biosonar signals that differ significantly from those of bats, there is much less variation in the signals both within and among toothed whale species than is the case for bats (Au, 1997). With the exception of beaked whales, toothed whale biosonar clicks reported to date can be broadly classified as either short (<150 μs) broadband transients, produced by most delphinids (Au, 1993), or longer-duration narrow-band high-frequency (NB-HF) clicks, produced by some species of small toothed whales (Madsen et al., 2005a). The low-frequency, multi-pulsed clicks of sperm whales form a third category (Møhl et al., 2003). Consistent frequency modulation has not been observed in any of these clicks (Au, 1993). For NB-HF clicks, little or no variation has been reported in duration or frequency content during echolocation tasks (Au et al., 1999; Madsen et al., 2005a). In contrast, some delphinids exhibit a degree of flexibility in sound production, which is exploited in a context-dependent way. For these species, clicks with greater source level also have increased bandwidth and center frequency (Au et al., 1995): dolphins clicking in a reverberant environment produce clicks with lower source levels, narrower bandwidths and lower center frequencies (Au, 1993), while clicks with higher source levels and wider bandwidths are produced when the echo-to-noise ratio (ENR) is poor (Au et al., 1974; Au et al., 1985). Although a similar relationship between output level and spectra has been observed in delphinids in the wild (Au et al., 2004; Madsen et al., 2004), it is not known if these changes are linked to different phases of foraging. The only documented changes in toothed whale clicks during echolocation for prey relate to the repetition rate and the source level of the sonar signals.
Like bats, toothed whales terminate a prey capture with a buzz composed of clicks with lower source level and higher repetition rate (Miller et al., 1995; Madsen et al., 2002; Miller et al., 2004), but the characteristics of individual clicks have not been reported to change when the buzz is initiated.

Beaked whales (fam. Ziphiidae) comprise some 20 species that are among the least known of the toothed whales, both in terms of life history and their biosonar signals. Beaked whales are elusive deep-divers that inhabit oceanic habitats and forage on a variety of pelagic and bentho-pelagic fish, crustacea and cephalopods (Mead, 1989). A series of mass-strandings of beaked whales in conjunction with the use of military sonars (Simmonds and Lopez-Jurado, 1991; Frantzis, 1998; Evans and England, 2001; Fernández et al., 2005) has prompted studies of their diving behavior and use of sound (Cox et al., 2006). Acoustic recording tags (DTAGs) were attached to Blainville's (Mesoplodon densirostris) and Cuvier's (Ziphius cavirostris) beaked whales in 2003, resulting in the first description of their peculiar echolocation clicks (Johnson et al., 2004). Long-duration (ca. 250 μs) clicks with center frequencies of 30-40 kHz were reported for both species, although the upper frequency limit of these recordings was restricted by the 96 kHz sampling rate of the tags. Using two individuals, each tagged with a DTAG, to record each other's vocalizations, the source properties of the clicks of Cuvier's beaked whales were derived and shown to be unique among toothed whale sonar signals in having an FM structure (Zimmer et al., 2005).

Surprisingly, echoes from objects in the water, excited by clicks from the tagged whales, were clearly audible in the tag recordings from both beaked whale species, enabling the first investigation of echolocation in a free-ranging foraging odontocete (Madsen et al., 2005b). Comparing the outgoing clicks with their corresponding echoes, it was demonstrated that a Blainville's beaked whale, in contrast to bats and dolphins (Simmons et al., 1979; Rasmussen et al., 2002; Au and Benoit-Bird, 2003), did not appear to adjust the output level and production rate of clicks when approaching prey. Instead, it switched directly to a buzz when prey were within about a body length of the whale (Madsen et al., 2005b). The absence of rate adjustment in the outgoing signal may indicate that the whale seeks to maintain a broad auditory scene for as long as possible while approaching a prey item. The switch-over to rapid clicking in the buzz then represents the point at which maintenance of the auditory scene is abandoned in favor of frequent positional updates needed to capture the selected prey. If this explanation is correct, the regular and buzz clicks serve very different functions and some specialization in the characteristics of these clicks might be expected.

Here we use data from an extended bandwidth (192 kHz sampling-rate) DTAG to study the production, characteristics and use of biosonar signals by foraging Blainville's beaked whales. We present the first conclusive evidence in a toothed whale species of context-dependent echolocation click types with very different properties. We quantify the spectral and temporal characteristics of these signals and of the echoes to which they give rise, and discuss possible production mechanisms. The implications for auditory signal processing and possible adaptations to different echolocation tasks during foraging are discussed in the light of theories and data from echolocating bats and dolphins.

Materials and methods

Acoustic recording tag

The results reported here were obtained using an acoustic and orientation recording tag, the DTAG, which is attached to the whale with suction cups (Johnson and Tyack, 2003). Sensors in the tag include pressure, a 3-axis accelerometer and a 3-axis magnetometer, sampled at 50 Hz per channel. The tag records sound from two 6 mm diameter spherical hydrophones positioned 2.5 cm apart and sampled at 192 kHz per channel. The hydrophone signals are sampled synchronously by sigma-delta analog-to-digital converters (ADCs). The symmetric digital anti-alias filters used in the ADCs ensure a linear phase response (i.e. a constant phase delay) from 1 to 96 kHz that is identical on the two channels. The overall frequency response is flat within 3 dB from 0.5 to 67 kHz with a -10 dB response at 81 kHz due to the anti-alias filters. The tag uses a loss-less compression algorithm (Robinson, 1994) to achieve an audio recording time of 9.5 h with 6.6 GB of memory. Sensor data is recorded for an additional 8 h after the end of audio recording.


An adult Blainville's beaked whale Mesoplodon densirostris Blainville (referred to below as Mesoplodon), swimming in a group of five animals, was tagged near the island of El Hierro in the Canary Islands, Spain, in October, 2004. The tag was positioned on the whale with a 5 m long hand-held pole from a 4 m rigid hull inflatable boat (RHIB). Although placed on the right side of the whale, the tag moved to a position near the dorsal ridge about 1 m posterior of the blowhole after 1 h. Observations of the tagged whale were made from a 7 m RHIB and a high point (120 m altitude) on land with coverage of the study area. The tagged whale was tracked from both RHIB and shore station by means of a VHF radio beacon in the tag. The tag audio recording extended from 09:19 h to 18:50 h, a little before sunset, while sensor data continued to be recorded until 03:42 h the following morning.

Data analysis

Sensor data from the tag were converted to depth, pitch, roll and heading using pre-determined calibration constants and following the method of Johnson and Tyack (Johnson and Tyack, 2003). The tag audio recording was evaluated aurally and by spectrogram to determine the location of the start and end of clicking and other vocal features. The recording contained clicks from the tagged whale and from other whales nearby. As the only cetaceans sighted from the land station and observation vessel within 3 km of the tagged whale were also Mesoplodon, we conclude that clicks from untagged whales in the tag recording are from whales of the same species. Recordings of untagged whales provide an opportunity to measure the far-field waveform of their signals that is not otherwise possible with a tag attached behind the head. Clicks from tagged and untagged whales can be distinguished in two ways. Clicks from the tagged whale have low-frequency energy (below 15 kHz) that is absent in clicks from other whales (Zimmer et al., 2005). This is likely due to sounds associated with click production that propagate within the body, but that radiate poorly into the water. Clicks from the tagged whale can also be distinguished based on their angle of arrival, θ, computed from $\theta = \sin^{-1}(\tau c/d)$, where c is the speed of sound in seawater, d is the hydrophone separation, and τ is the time delay between the two hydrophone signals, measured by cross-correlation. Although a single arrival angle is insufficient to characterize the source bearing in three dimensions, here we are only interested in discriminating tagged whale clicks from those produced by other whales. The arrival angle of clicks from the tagged whale, when corrected for the tag orientation on the whale, will be consistently close to zero, while those from other whales will vary widely as the tagged whale maneuvers.
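As an illustration of the arrival-angle calculation, the following minimal sketch estimates the inter-hydrophone delay from the peak of a brute-force cross-correlation and converts it to an angle. The function names, the nominal sound speed of 1500 m/s and the integer-sample lag search are assumptions made for the example, not details given in the text. With a 2.5 cm spacing the largest possible delay is only about three samples at 192 kHz, so a practical implementation would presumably interpolate the correlation peak.

```python
import math

def cross_correlation_delay(a, b, fs):
    """Delay (s) of channel b relative to channel a, taken at the lag that
    maximizes the cross-correlation. Positive means b arrives later."""
    n = len(a)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-(n - 1), n):
        val = sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag / fs

def arrival_angle(tau, c=1500.0, d=0.025):
    """Arrival angle (rad) from theta = asin(tau * c / d).
    c = 1500 m/s is an assumed nominal value; d = 2.5 cm is the hydrophone spacing."""
    x = max(-1.0, min(1.0, tau * c / d))  # clamp against rounding past +/-1
    return math.asin(x)
```

A one-sample delay at 192 kHz gives tau·c/d = 0.3125, i.e. an angle of roughly 18°, which shows how coarse the uninterpolated estimate is.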

The combination of angle and spectral cues makes it straightforward to distinguish clicks from the tagged whale as well as sequences of clicks from other whales. These sequences often occur in the tag recording with inter-click intervals (ICIs) similar to that of the tagged whale and with slowly varying arrival angles consistent with relative motion of two whales. Thus, we infer that each such sequence emanates from one of the Mesoplodon in the vicinity. The amplitude of the clicks in these sequences will depend upon the distance to the clicking whale, the source level, and the angle between the directional sound beam and the tag (Zimmer et al., 2005). We maintain that much of the amplitude variation within a sequence is due to the third factor (aspect) as the range to the clicking whale will not change much over a period of a few seconds and the source level of clicks is unlikely to vary from click-to-click by more than a few decibels (Au, 1993; Madsen et al., 2005b). On this presumption, the click with maximum amplitude in each sequence will be the closest to representing an on-axis version of the click (Møhl et al., 2003).

Clicks from untagged whales were classified as either regular or buzz clicks, based upon their production rate. The tagged whale rarely produced clicks at intervals of less than 0.1 s except during a buzz and the sharp decrease in ICI at the start of a buzz makes the distinction between click types unambiguous for both tagged and untagged whales.
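The rate-based classification can be sketched as below. This is a simplified illustration, assuming click times are already available as a sorted list in seconds and using the 0.1 s interval mentioned above as a hypothetical fixed threshold. The function names are invented for the example.

```python
def inter_click_intervals(click_times):
    """Interval (s) between each click and the one before it."""
    return [later - earlier for earlier, later in zip(click_times, click_times[1:])]

def label_clicks(click_times, buzz_ici=0.1):
    """Label every click after the first as 'buzz' or 'regular',
    according to the interval separating it from the preceding click."""
    return ["buzz" if ici < buzz_ici else "regular"
            for ici in inter_click_intervals(click_times)]
```

Because the ICI drops sharply at the start of a buzz, a single threshold of this kind is expected to separate the two classes with little ambiguity.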

For each regular click, a similar length section of audio prior to the click was extracted to provide a contemporaneous estimate of the noise level, including both system and ambient noises. Click and noise samples were filtered digitally with a 4-pole Butterworth high-pass filter at 5 kHz to remove low frequency flow noise. For buzz clicks a single noise sample was taken prior to the entire buzz and a 15 kHz high-pass filter was used to enhance the signal-to-noise ratio (SNR) of these lower-level clicks. All samples were then filtered with a 2-pole low-pass filter (pole frequency 80 kHz, Q of 2.5) to partially compensate for the magnitude response of the anti-alias filter. These filtering operations resulted in an overall system response flat to within ±1 dB from 6-80 kHz (18-80 kHz for buzz clicks). The location of the hydrophones on the animal could well give rise to additional variations in the magnitude response. In particular, the location of the hydrophones about 30 mm above the body of the whale, a frequency-dependent sound absorbing and reflecting surface, will lead to some spectral distortion. However, the relatively flat on-axis power spectra of clicks recorded from untagged whales leads us to suspect that such environmental effects are small.

Following the filtering operations, the root-mean-squared (RMS) level and SNR of each click in sequences of regular and buzz clicks were computed. The RMS level was calculated over the 97% energy duration of each click and SNR was estimated by dividing the RMS level of clicks by the RMS level in the preceding noise sample of duration 0.5 ms. A sequence of regular clicks was considered to have a high dynamic range if the RMS level of the strongest click in the sequence was 30 dB or more above that of the weakest click. Following the previous argument, a high dynamic range is indicative of a wide variation in the aspect of the clicking whale during the sequence. If Mesoplodon have a similar beam pattern to delphinids and Cuvier's beaked whales, their on-axis clicks will be at least 30 dB stronger than their weakest off-axis clicks (Au, 1993; Zimmer et al., 2005) so it cannot be concluded that the strongest clicks in every high dynamic range sequence are on-axis. However, given that the weakest clicks will sometimes not be detected at all, sequences with 30 dB or more dynamic range will often contain one or more clicks that are close to on-axis. By selecting the strongest clicks in many such sequences, we expect that the resulting set will contain a preponderance of clicks from close to the acoustic axis and so broadly represent the properties of on-axis clicks. Thus, we denoted clicks with RMS level within 3 dB of the strongest click in each sequence as being probable on-axis clicks if (i) the SNR of each selected click was greater than 30 dB, and (ii) the ratio of the RMS level of the strongest and weakest click in the sequence exceeded 30 dB. Following this method, 225 sequences of regular clicks were attributed to untagged whales, of which 50 had a dynamic range of over 30 dB. A set of 139 clicks were selected from these high dynamic range sequences as probable exemplars of the on-axis waveform. 
Weaker clicks in the same sequences were selected as off-axis exemplars.
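The selection criteria for probable on-axis clicks can be expressed compactly. The sketch below is an illustration only: it assumes each sequence is supplied as a list of positive linear RMS amplitudes with a matching list of noise RMS amplitudes, and the function and parameter names are hypothetical.

```python
import math

def db(ratio):
    """Amplitude ratio expressed in decibels."""
    return 20.0 * math.log10(ratio)

def select_on_axis(rms_levels, noise_rms, min_snr_db=30.0,
                   min_range_db=30.0, window_db=3.0):
    """Indices of probable on-axis clicks in one sequence.

    A click is kept if the sequence dynamic range exceeds min_range_db,
    the click lies within window_db of the strongest click, and its SNR
    exceeds min_snr_db. Inputs are positive linear RMS amplitudes.
    """
    strongest, weakest = max(rms_levels), min(rms_levels)
    if db(strongest / weakest) <= min_range_db:
        return []  # sequence does not span a wide enough range of aspects
    picked = []
    for i, (level, noise) in enumerate(zip(rms_levels, noise_rms)):
        near_peak = db(strongest / level) <= window_db
        high_snr = db(level / noise) > min_snr_db
        if near_peak and high_snr:
            picked.append(i)
    return picked
```

Calling `select_on_axis(levels, noise, 20.0, 20.0, 1.0)` would correspond to the relaxed thresholds described for buzz clicks.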

For buzz clicks, similar criteria were adopted although, because of the lower apparent source level of these clicks (Madsen et al., 2005b), the dynamic range and SNR criteria were reduced to 20 dB. As the parameters of buzz clicks were found to vary strongly with received level, only clicks within 1 dB of the strongest click in a buzz sequence were taken as on-axis exemplars. Some 41 sequences of buzz clicks from untagged whales were isolated in the tag recording but only 7 of these met the SNR and dynamic range criteria. A total of 109 presumed on-axis buzz clicks were selected from the 7 sequences.

For each exemplar click from an untagged whale, we measured the 97% energy duration, centroid frequency, -10 dB bandwidth, centralized RMS bandwidth, RMS duration and the Woodward time resolution (sensu Au, 1993) (see Table 1). The bandwidths and centroid frequency were computed using a 1024 point Fourier transform with a rectangular window of 1.4 ms for regular clicks or 0.42 ms for buzz clicks. Frequency modulation in presumed on-axis regular clicks was measured by first computing the phase, $\phi_t$, of the click over its 95% energy duration using $\phi_t = \mathrm{Im}\{\log_e(H\{\cdot\})\}$, where $H\{\cdot\}$ is the Hilbert transform of the click (Oppenheim and Schafer, 1998) and $\mathrm{Im}\{\cdot\}$ denotes the imaginary part. A second-order least-squares fit was then made between t and $\phi_t$, yielding the starting frequency and modulation rate of the linear chirp that best matched the phase of the click. The fit was rejected (i.e. a linear FM model was considered a poor fit to the click) if any parameter was found to be insignificant at the P=0.05 level (Rice, 1995). Identical procedures were carried out with first and third order (i.e. CF and quadratic chirp) models to verify the suitability of the linear FM model.
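The phase-fitting step can be illustrated with a short sketch. It assumes the analytic (complex) click has already been computed and is supplied as a list of complex samples. It unwraps the phase, fits a second-order polynomial by solving the normal equations, and converts the coefficients to a start frequency and sweep rate under the assumed phase model α + 2πβt + πγt². Time is expressed in ms so that the results come out in kHz and kHz ms⁻¹. The significance test at P=0.05 is not reproduced, and all function names are hypothetical.

```python
import cmath
import math

def unwrapped_phase(analytic):
    """Phase (rad) of complex samples with 2*pi jumps removed."""
    phases = [cmath.phase(analytic[0])]
    for prev, cur in zip(analytic, analytic[1:]):
        # Sample-to-sample phase increment, which lies in (-pi, pi].
        phases.append(phases[-1] + cmath.phase(cur * prev.conjugate()))
    return phases

def quadratic_fit(t, y):
    """Least-squares (a0, a1, a2) for y ~ a0 + a1*t + a2*t**2."""
    s = [sum(ti ** k for ti in t) for k in range(5)]
    b = [sum(yi * ti ** k for ti, yi in zip(t, y)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [p - f * q for p, q in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def chirp_parameters(analytic, fs_khz):
    """Start frequency (kHz) and sweep rate (kHz/ms) of the best-fit linear chirp."""
    t_ms = [n / fs_khz for n in range(len(analytic))]
    _, a1, a2 = quadratic_fit(t_ms, unwrapped_phase(analytic))
    return a1 / (2.0 * math.pi), a2 / math.pi
```

The unwrapping relies on the phase advancing by less than π per sample, which holds whenever the instantaneous frequency stays below half the sampling rate.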

Table 1.

Parameters of FM and buzz clicks produced by untagged whales and recorded close to the sound axis

Displayed click waveforms (both regular and buzz clicks) were produced using a 5 kHz high-pass filter and the anti-alias compensation filter described above. Envelopes were computed by taking the magnitude of the analytic (i.e. Hilbert-transformed) click. To visualize the multi-pulse structure of off-axis clicks, clicks within a sequence were cross-correlated with a nominal on-axis click, using the analytic signals in each case. Cross-spectra were computed by taking the Fourier transform of the zero-padded cross-correlation. Time-frequency distributions were computed using the Type I Wigner transform (Cohen, 1989) with sequence length of 256 samples.

The tag recordings contain numerous echoes from objects in the water ensonified by clicks from the tagged whale (Johnson et al., 2004). Although these echoes seldom have a high SNR and their characteristics are a function of the target as well as the sound source, they provide an approximation to the on-axis clicks produced by the tagged whale. We isolated strong echoes of both regular and buzz clicks, and compared the spectra and waveforms of these with clicks from untagged whales. To determine that a received pulse was indeed an echo and not a click from an untagged whale, we produced echograms (Johnson et al., 2004) by aligning the envelopes of short sections of audio, synchronized to each tagged whale click. Echoes from distinct targets form sequences of arrivals, evident in the echogram, that have a slowly varying time lag with respect to tagged whale clicks. Echoes with high SNR were selected for comparison with clicks from untagged whales and were processed in the same way as for those signals.
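The echogram construction can be sketched as a simple stacking of envelope segments, one per outgoing click. The sketch below is an illustration under stated assumptions: the envelope is a list of non-negative samples, click positions are sample indices, and the conversion from lag to range uses a nominal sound speed of 1500 m/s and two-way travel. None of the function names comes from the original analysis.

```python
def echogram(envelope, click_samples, window):
    """Stack envelope segments of `window` samples, each starting at a click.
    Clicks too close to the end of the recording are skipped."""
    rows = []
    for start in click_samples:
        row = envelope[start:start + window]
        if len(row) == window:
            rows.append(row)
    return rows

def echo_lags(rows, skip):
    """Lag (samples) of the strongest arrival in each row, ignoring the
    first `skip` samples, which contain the outgoing click itself."""
    return [max(range(skip, len(row)), key=row.__getitem__) for row in rows]

def target_range_m(lag_samples, fs_hz, c=1500.0):
    """Target range (m) from a two-way travel time; c is an assumed value."""
    return c * (lag_samples / fs_hz) / 2.0
```

An echo from a single target then appears as a slowly drifting lag from row to row, whereas clicks from another whale would not stay locked to the tagged whale's click times.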


Vocal and diving behavior

A total of four deep foraging dives (maximum depth range 682-1251 m) were performed by the tagged whale during audio recording. Vocalizations occurred at the base of all dives with the predominant sounds being regular clicks and buzzes (Johnson et al., 2004; Tyack et al., 2006). The duration of clicking in each dive varied from 23-33 min and there were 26-38 buzzes per dive with a mean buzz length of 2.9 s (s.d. 1.2, range 0.4-9.8, N=133). The mean ICI of regular clicks was 0.37 s (s.d.=0.10 s), discounting outlier intervals shorter than 0.1 s and longer than 1 s. The profile of a deep dive showing the timing and depth of vocalizations is given in Fig. 1.

Regular clicks

Using the method described above, 139 presumed on-axis clicks were isolated from sequences of regular clicks produced by untagged whales. The parameters of these clicks are summarized in Table 1 and an example click is shown in Fig. 2. All of these clicks have long duration (median of 271 μs), broad bandwidth (median -10 dB BW 24.6 kHz) and contain a distinctive FM upsweep (median modulation rate of 112 kHz ms⁻¹). Echoes of tagged whale clicks show roughly similar form (Fig. 2) but have more spectral variation due, presumably, to the frequency-dependent target reflectivity.

Energy in regular clicks is distributed between -10 dB endpoints of about 26 and 51 kHz with a sharp cut-off below 25 kHz and a more gradual cut-off at the high end. Click spectra reported previously for Mesoplodon (Johnson et al., 2004) are consistent with those in Fig. 2 but were limited to frequencies below 48 kHz. The higher sampling rate used here appears adequate to characterize the entire click spectrum. Both linear and quadratic FM models matched the phase of analytic on-axis clicks well (average residual phase was 12° RMS for linear FM and 7° RMS for quadratic FM). Thus, the Hilbert transform of on-axis Mesoplodon regular clicks can be modeled by a linear FM chirp:

$$x_t = w_t \exp\{j(\alpha + 2\pi\beta t + \pi\gamma t^2)\}$$

where t is the time index, $j=\sqrt{-1}$, $w_t$ is a Gaussian window function and α, β and γ control the initial phase, start frequency and sweep rate of the chirp, respectively. For the signal shown in Fig. 2A, parameter values of β=25.3 kHz and γ=126 kHz ms⁻¹ minimized the squared error between the phases of the synthetic and actual signal (phase residual of 9° RMS) while the Gaussian window had a half power (-3 dB) duration of 99 μs. We refer to regular clicks as FM clicks in the remainder of the paper to emphasize this distinguishing trait.
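To make the chirp model concrete, the sketch below synthesizes a Gaussian-windowed linear FM click from the quoted parameters. It assumes the standard linear-chirp phase α + 2πβt + πγt², so that the instantaneous frequency is β + γt, and it assumes the Gaussian window is centred on the middle of the synthesized segment. Neither the window position nor the function names are specified in the text. Time is in ms and frequency in kHz.

```python
import cmath
import math

def gaussian_sigma(half_power_duration):
    """Standard deviation of a Gaussian amplitude window whose power
    falls to one half at +/- half_power_duration / 2 from its centre."""
    return half_power_duration / (2.0 * math.sqrt(math.log(2.0)))

def fm_click(fs_khz, n, alpha, beta_khz, gamma_khz_per_ms, half_power_ms):
    """Complex samples of a Gaussian-windowed linear FM chirp.
    The window is centred mid-segment (an assumption of this sketch)."""
    sigma = gaussian_sigma(half_power_ms)
    centre = (n - 1) / (2.0 * fs_khz)
    out = []
    for k in range(n):
        t = k / fs_khz
        w = math.exp(-((t - centre) ** 2) / (2.0 * sigma ** 2))
        phase = alpha + 2.0 * math.pi * beta_khz * t + math.pi * gamma_khz_per_ms * t * t
        out.append(w * cmath.exp(1j * phase))
    return out

def instantaneous_frequency_khz(t_ms, beta_khz, gamma_khz_per_ms):
    """Instantaneous frequency of the linear chirp at time t."""
    return beta_khz + gamma_khz_per_ms * t_ms

click = fm_click(192.0, 53, 0.0, 25.3, 126.0, 0.099)
```

Under these assumptions the unwindowed chirp would sweep from 25.3 kHz to about 59 kHz over 0.27 ms. The Gaussian window attenuates both ends of the sweep, which is consistent with the narrower -10 dB band reported for the clicks.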

Off-axis FM clicks (judged to be so by their low relative level within a sequence) often appeared to comprise two, and occasionally three, overlapping pulses separated by a variable delay of some tens of microseconds. To visualize these overlapping components within a sequence of clicks, the on-axis click of Fig. 2A was applied as a matched filter and the envelope of the filtered clicks was computed. The filtering operation effectively compressed the part of each click that was similar to the nominal on-axis click making it easier to detect multiple pulses in the envelope. As a graphical aid, the resulting envelopes were classified as multi-pulse if they contained more than one local maximum with level greater than 0.2 of the peak level of the envelope. While the clicks of highest amplitude in the sequences were usually single pulses, the great majority of weaker clicks consisted of multiple pulses as in the example of Fig. 3A. To avoid any possibility that the observed variability in pulse shapes could result from changes in orientation of the receiver (e.g., due to reflections from the body surface), a sequence was chosen for Fig. 3 during which the tagged whale moved very little. The s.d. of pitch, roll and heading were 4°, 7° and 5°, respectively, during the 40 clicks shown. The signal-level-dependent variation in pulse shape, exemplified by Fig. 3 must then result from changes in aspect of the clicking whale.
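The multi-pulse criterion can be sketched as a count of local maxima in the matched-filter envelope that exceed 0.2 of the envelope peak. The following is a minimal illustration that assumes the envelope is supplied as a list of non-negative samples. The function names and the treatment of flat-topped peaks are choices made for the example.

```python
def count_pulses(envelope, threshold=0.2):
    """Number of local maxima whose level exceeds threshold * peak level."""
    peak = max(envelope)
    count = 0
    for i in range(1, len(envelope) - 1):
        # A plateau is counted once, at its first sample.
        is_local_max = envelope[i - 1] < envelope[i] >= envelope[i + 1]
        if is_local_max and envelope[i] > threshold * peak:
            count += 1
    return count

def is_multi_pulse(envelope, threshold=0.2):
    """True if the envelope contains more than one qualifying local maximum."""
    return count_pulses(envelope, threshold) > 1
```

On noisy envelopes a rule of this kind can split a single pulse into several maxima, so some smoothing before counting may be needed in practice.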

Fig. 1.

Dive profile of a Blainville's beaked whale foraging dive showing vocal events.

A more sensitive indication of multiple pulses in off-axis clicks can be obtained in the frequency domain. Stacking normalized cross-spectra (i.e. the scaled Fourier transform of the matched filtered clicks) as in Fig. 3C, confirms that the high amplitude clicks tend to have smooth spectra while weaker, and so presumably more off-axis, clicks have highly featured spectra likely due to interference between the pulse components in the click. The relative strength and separation of the pulses in the off-axis clicks vary widely, probably with the aspect of the clicking whale.

Buzz clicks

While FM clicks are made persistently throughout the base of foraging dives, buzz clicks occur in occasional brief bursts and can be readily distinguished from FM clicks both by their ICI and waveform. Following the same technique as for FM clicks, 109 presumed on-axis buzz clicks were identified from untagged whales and the parameters of these clicks are summarized in Table 1. As shown in Fig. 4, on-axis buzz clicks are short transients (median duration 104 μs) with wide bandwidth (median -10 dB BW 55 kHz) and no obvious frequency modulation. In fact the buzz click energy may extend beyond 80 kHz, the upper -1 dB limit of the compensated tag response, and so the bandwidth and centroid frequency of these transients may be underestimated in Table 1. The spectrum of on-axis buzz clicks from untagged whales is consistent with that of high SNR echoes from tagged whale buzz clicks and representative examples are given in Fig. 4.

Fig. 2.

(A) Waveform of a regular click from an untagged Blainville's beaked whale, as recorded by the DTAG. (B) Normalized power spectrum of the same click (solid line) and of an echo from a target ensonified by a regular click from the tagged whale (broken line). The dotted line is the system and ambient noise floor. (C) Time-frequency (Wigner) distribution of the click in A.

As compared to FM clicks, presumed on-axis buzz clicks have one half the duration and at least twice the bandwidth. It is interesting to note that there is no overlap in the 5-95% percentiles of any of the parameters listed in Table 1 (with the exception of the lower -10 dB frequency and the time-bandwidth product) for FM and buzz clicks. While the two click types occupy the same frequency band, their characteristics are consistently different. During the change-over from FM clicks to buzz clicks, and vice versa, several clicks of intermediate source level (SL) and ICI appear to be produced but no clicks with intermediate spectra or duration have been recorded, emphasizing the bimodal nature of the sound generation system.

The centroid frequency and bandwidth of buzz clicks were found to vary widely with received level. Although the strongest clicks in each sequence had uniformly short duration, high centroid frequencies and bandwidths, weaker clicks within the same sequence, judged to be off-axis, were longer in duration, showed a notable resonance at 30-35 kHz, and had less energy at higher frequencies (Fig. 4). No obvious multi-pulse structure, like that in FM clicks, was observed in off-axis buzz clicks in the 7 high SNR buzz sequences examined.

Usage of FM and buzz clicks

As demonstrated elsewhere (Madsen et al., 2005b), FM clicks are produced at a variable rate that does not seem to correlate in a consistent way with target range. To explore the adjustment of click rate during buzzes, individual buzz clicks were identified in ten buzzes performed by the tagged whale, revealing the stereotypical ICI pattern shown in Fig. 5. The buzzes chosen were those in which the SNR was sufficient throughout the buzz to detect all clicks. The ICI in these buzzes initially decreases rapidly from 100 ms to about 12 ms and then continues to decrease more slowly, reaching a plateau level of between 3 and 5 ms after about 1.5-2.5 s. Given the terminal ICI of 3-5 ms, the temporal update rate during a buzz could be 50-100 times that during regular clicking if processing occurs on a click-by-click basis.

Madsen et al. also reported that the apparent level of buzz clicks recorded by a tag attached to the clicking whale was some 15 dB less than that of FM clicks, although such near-field and off-axis recordings must be treated with caution (Madsen et al., 2005b). The wider bandwidth data available here yielded a similar result. In 43 buzzes produced by the tagged whale in which the last FM click prior to the buzz and the first buzz click were clearly detectable, a median RMS level difference of 15 dB (range 4-24 dB) was obtained over a 5-75 kHz band using a 95% energy window. A more reliable estimate of the on-axis level difference between FM and buzz clicks may come from examining echoes generated by the two click types. As evident in the example of Fig. 6, the echo level of a single target drops markedly in the transition from FM to buzz clicks. The RMS echo level from the approaching prey target in Fig. 6 reduces by 18 dB (90% energy window, 30-75 kHz band) at the start of the buzz, a change that is unlikely to be caused by a re-direction of the animal's sonar beam in the 0.23 s between the last FM click and the first buzz click, but rather relates to the reduction in output of the sound generator switching from FM to buzz clicks.
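The level comparison can be illustrated with a short sketch that computes the RMS level of each click over an energy window and expresses the difference in decibels. It assumes the energy window is defined by trimming equal fractions of the total energy from each end of the waveform, which is a common convention but is not spelled out in the text. Band-limiting to 5-75 kHz is omitted, and the function names are hypothetical.

```python
import math

def energy_window(samples, fraction=0.95):
    """(start, end) slice bounds enclosing `fraction` of the total energy,
    trimming equal energy tails from both ends (assumed convention)."""
    energy = [s * s for s in samples]
    tail = (1.0 - fraction) / 2.0 * sum(energy)
    start, cum = 0, 0.0
    for i, e in enumerate(energy):
        cum += e
        if cum > tail:
            start = i
            break
    end, cum = len(samples), 0.0
    for i in range(len(samples) - 1, -1, -1):
        cum += energy[i]
        if cum > tail:
            end = i + 1
            break
    return start, end

def rms(samples):
    """Root-mean-squared amplitude."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(click_a, click_b, fraction=0.95):
    """RMS level of click_a relative to click_b (dB), each over its energy window."""
    a0, a1 = energy_window(click_a, fraction)
    b0, b1 = energy_window(click_b, fraction)
    return 20.0 * math.log10(rms(click_a[a0:a1]) / rms(click_b[b0:b1]))
```

Restricting the RMS to the energy window keeps leading and trailing noise from diluting the level of short clicks, which matters when comparing clicks of very different duration.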

One consequence of the reduced level of buzz clicks is that, while echoes from FM clicks are frequently detectable in the tag recordings with multiple echoes being detected from each click, there is often only one detectable echo during a buzz. For the two foraging dives in which the tag was best placed to record echoes, the number of echo sequences visible in an echogram such as shown in Fig. 6, immediately prior to each buzz, were counted and compared to the number of echoes visible during the buzz. An echo sequence is a set of echoes, one per click, that appear to emanate from a single target, judging by the consistency of the angle of arrival and the approach speed. In the example of Fig. 6, there are three echo sequences just prior to the buzz while only one sequence continues in the buzz. An average of 7.9±5.6 (± s.d., N=63 buzzes) echo sequences were counted prior to each buzz in the two dives using a search interval equal to the length of the buzz. In comparison, an average of 1.3±0.8 echo sequences were visible within buzzes. The intensity of the echoes in these sequences varied throughout the buzz but echoes were often difficult to detect in the early part of the buzz. Only in 29 buzzes was there an echo sequence that was clearly contiguous with a sequence prior to the buzz as in the example of Fig. 6. In these cases, the echo sequence almost always continued throughout the buzz, culminating in a strong echo at a range of about 1 m towards the end of the buzz. Assuming that these continuous echo sequences represent the prey item being approached, we interpret the distance to the target at the time of the last regular click before the buzz as the target proximity at which the whale switches from the search/selection phase to the capture phase of echolocation and this transition is marked by a radical change in echolocation signal. For the buzzes in which it could be measured, this hand-off distance was 3.6±0.6 m (N=29). 
Buzz length was positively correlated with hand-off distance (P=0.006, N=29), as would reasonably be expected: more distant targets need more time, and more clicks, to approach.
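The hand-off distance follows from the two-way travel time (TWTT) of the echo at the last regular click before the buzz. As a minimal sketch, assuming the nominal sound speed of 1500 m s-1 quoted in the Fig. 6 caption (the function names and the example TWTT are illustrative, not taken from the study):

```python
SOUND_SPEED_SEAWATER = 1500.0  # m/s, nominal value quoted in the Fig. 6 caption

def target_range_m(twtt_s, sound_speed=SOUND_SPEED_SEAWATER):
    """One-way distance to a target from the two-way travel time of its echo."""
    return sound_speed * twtt_s / 2.0

def two_way_travel_time_s(range_m, sound_speed=SOUND_SPEED_SEAWATER):
    """Echo delay expected for a target at the given one-way distance."""
    return 2.0 * range_m / sound_speed

# A hypothetical echo delay of 4 ms corresponds to a target at 3 m.
example_range = target_range_m(0.004)
```

Under this assumption, the 3.0 m hand-off distance of the Fig. 6 example corresponds to an echo delay of 4 ms, and the mean hand-off distance of 3.6 m to a delay of 4.8 ms.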

Fig. 3.

Variation in magnitude and waveform in a click sequence recorded from an untagged whale. (A) Peak envelope level of each click. Clicks with a single pulse are indicated by open dots; solid dots indicate clicks with two or more pulses. (B) Absolute value of the normalized cross-correlation functions of an on-axis click with two clicks from the sequence (indicated in A by triangles). (C) Normalized cross-spectral magnitude of clicks in a second sequence. Each spectrum is the Fourier transform of a click matched-filtered with an on-axis click. The spectra are displayed in ascending order of peak envelope level, i.e. the strongest clicks in the sequence are shown to the left. The dots on the right-hand side have the same interpretation as for A. Note the smooth cross-spectra of the strong, mono-pulsed clicks while weaker multi-pulsed clicks have more variable spectra.

Fig. 4.

On-axis and off-axis waveforms and spectra of buzz clicks. (A) Waveform of buzz click from an untagged whale judged to be on-axis. (B) Presumed off-axis buzz click from the same sequence (note the change in amplitude scale). (C) Spectra of the on-axis click (solid line), off-axis click (dot-dash line), an echo return from a tagged whale buzz click (broken line), and the noise floor (dotted line).


We have demonstrated that Blainville's beaked whales produce two distinct click types associated with different phases of echolocation-mediated foraging. Long-duration FM clicks are used during the search phase and the initial approach to prey. When the target is about one body length away, the whale switches to buzz clicks. These short duration wide-bandwidth pulses are produced at a high rate throughout the capture attempt. Although changes in clicking rate have been reported for other odontocetes foraging by echolocation, this is the first evidence of a switch in click characteristics linked with different phases of the foraging process.

Fig. 5.

ICI sequence of buzz clicks in ten buzzes produced by the tagged whale. Although the duration of the buzzes varies, the pattern of ICI variation is quite stereotyped.

Our method involved recording signals produced by conspecifics swimming in the vicinity of a tagged whale foraging in its habitat. Studying the echolocation signals made by animals in the wild, as opposed to performing trained tasks in captive settings, has the advantage that the sounds are sampled in the likely context for which they evolved (Madsen et al., 2004). A drawback of this method, however, is that the orientation of vocalizing whales is unknown and signals must be selected carefully from the recording to minimize off-axis distortion. Using a new high-frequency stereo recording tag, we identified sequences of clicks with high dynamic range that we attributed to individual untagged whales scanning their sonar beams past the tag. This method produced sets of some 100 FM and buzz clicks with consistent and completely distinct parameters.

The FM search clicks are highly unusual among known toothed whales. Compared to the clicks of most delphinids, the Mesoplodon FM click is 3-10 times longer, and has a distinct FM upsweep covering almost an octave (Fig. 3). Compared to the clicks of Cuvier's beaked whale, Ziphius cavirostris, the only other toothed whale reported to produce FM clicks (Zimmer et al., 2005), Mesoplodon FM clicks have a lower center frequency and a wider sweep frequency range (1 octave as compared to about 0.4 octave). However, the sounds produced by these two ziphiid species are superficially similar and represent a new class of echolocation clicks amongst toothed whales. In contrast, the broadband Mesoplodon buzz clicks, used in the terminal phase of prey capture, are more similar to clicks produced by large delphinids, such as killer whales Orcinus orca, Risso's dolphins Grampus griseus and narwhals Monodon monoceros (Møhl et al., 1990; Au et al., 2004; Madsen et al., 2004).

Given the data selection method used here, some variability in parameters within click type (Table 1) can be expected due to the possible erroneous inclusion of a few off-axis clicks in the data sets. However, we argue that the data sets are large enough to represent a fair sample of the on-axis signals. The sampling-rate of the tag was adequate to characterize both click types, albeit with an underestimate of bandwidth in the case of buzz clicks, and we conclude that neither off-axis distortion nor the recording conditions can explain the broad differences between the observed signals.

Fig. 6.

Echogram during the start of a buzz showing the hand-off from regular clicks (those before click '0') to buzz clicks. Each vertical slice of the echogram contains the outgoing click (at TWTT 0) and the subsequent echo returns. Pixel darkness indicates amplitude (the darker the pixel the greater the amplitude). Of the three echo sequences apparent before the buzz, only one continues to be visible during the buzz. The hand-off distance for this target was 3.0 m using a nominal sound speed in seawater of 1500 m s-1.

That Mesoplodon produce FM signals while searching for prey and then switch signals during prey capture is somewhat similar to the situation for bats but is unprecedented in the limited body of literature for other toothed whales, warranting further examination. Foraging by echolocation involves the separate challenges of detecting, classifying and approaching prey items for capture, and different biosonar signals may well be preferred for each of these tasks. Nonetheless, any practical signal must be a compromise adapted to the environment and prey type of the animal within the biophysical constraints of available mechanisms for sound production and reception. In the following sections, we explore why and how a beaked whale might produce such distinct sounds in the light of what is known about biosonar systems in bats and dolphins.

FM clicks

Frequency modulated chirps are often used in human-made radars and sonars with limited peak power in an effort to increase the energy of the outgoing pulse without sacrificing range resolution (Woodward, 1953). A matched filter receiver, which effectively cross-correlates the returning echo with the signature of the emitted pulse, is used to improve range resolution in a process known as pulse compression. The observation that FM bats produce chirp-like signals led researchers to propose that they may incorporate processing similar to a matched filter within their auditory system (Strother, 1961; Simmons, 1971; Simmons, 1993; Simmons et al., 1979; Simmons et al., 1990). Although bats can perform certain range resolution tasks with an accuracy that well exceeds that possible with an energy detector (Simmons, 1973; Simmons, 1993), the matched-filter receiver hypothesis is contested on a number of levels (Menne and Hackbarth, 1986; Møhl, 1986; Beedholm and Møhl, 1998; Beedholm, 2006). Nonetheless, there is compelling evidence (Simmons, 1971; Simmons, 1993; Simmons et al., 1979; Simmons et al., 1990; Masters and Jacobs, 1989; Surlykke, 1992) that the combination of FM signals and some specialized auditory processing allows bats to achieve high enough ranging accuracy and resolution to home in on small targets despite ensonification with pulses of several milliseconds duration, corresponding to pulse lengths in air of a meter or more (Fig. 7).
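The principle of a matched-filter receiver can be illustrated with a short sketch that cross-correlates a received signal with a copy of the emitted chirp and takes the lag of the correlation peak as the echo delay. This is an idealized engineering example and makes no claim about auditory processing in bats or whales. The linear sweep from 26 to 51 kHz over 270 μs is an assumed approximation of the FM click based on the values in the abstract, and the 400 kHz sampling rate, echo amplitude and delay are arbitrary.

```python
import math

def linear_chirp(f_start_hz, f_end_hz, duration_s, sample_rate_hz):
    """Unit-amplitude linear FM upsweep sampled at sample_rate_hz."""
    n = int(round(duration_s * sample_rate_hz))
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    out = []
    for i in range(n):
        t = i / sample_rate_hz
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        out.append(math.sin(phase))
    return out

def cross_correlate(received, template):
    """Correlation of the template with the received signal at each lag."""
    n_lags = len(received) - len(template) + 1
    return [
        sum(received[lag + k] * template[k] for k in range(len(template)))
        for lag in range(n_lags)
    ]

def estimate_delay_samples(received, template):
    """Lag (in samples) at which the correlation magnitude is greatest."""
    corr = cross_correlate(received, template)
    return max(range(len(corr)), key=lambda lag: abs(corr[lag]))

# Hypothetical test signal: a weak, noise-free echo placed 120 samples in.
SAMPLE_RATE_HZ = 400e3
template = linear_chirp(26e3, 51e3, 270e-6, SAMPLE_RATE_HZ)
received = [0.0] * 300
for k, value in enumerate(template):
    received[120 + k] += 0.2 * value

delay = estimate_delay_samples(received, template)
```

The correlation peak is much narrower than the chirp itself, which is the sense in which a matched filter "compresses" a long FM pulse and recovers range resolution.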

Fig. 7.

Comparative sizes of predator, prey and echolocation signal for the big brown bat Eptesicus fuscus (left), Blainville's beaked whale Mesoplodon densirostris (center) and bottlenose dolphin Tursiops truncatus (right). The lengths of the signals are computed by multiplying the typical signal duration by the sound speed in the appropriate medium. If different length signals are produced during search and terminal approach, representative lengths of each are indicated.

The short broadband transients produced by dolphins are so different from the signals of FM bats that bioacousticians have proposed a different receiver strategy. Based on ranging and range resolution experiments, Au argues (Au, 1993) that the dolphin receiver operates as an energy detector with high time resolution achieved by transmitting short transients and receiving with a short [ca. 265 μs (Au 1993)] integration time. Hence, dolphins seem to use both less complex sonar signals and less complex auditory processing than do FM bats. In fact, the benefit of a more complex receiver would be small given the low time-bandwidth product of dolphin signals [ca. 0.15 (Au 1993)], whereas for FM bats, with their high time-bandwidth product signals, much is to be gained from auditory processing.

Despite the common ancestry of beaked whales and dolphins, it is possible that beaked whales have evolved auditory processing matched to their FM click signal, akin to the situation proposed for FM bats. In fact, some pulse compression of the upsweeping chirp may occur in the inner ear since the basilar membrane is tuned to high frequencies close to the oval window while lower frequencies must propagate further along the membrane before being detected (McCue, 1966; Yates, 1995). However, an FM signal alone does not imply the existence of pulse compression in the receiver, and there may be other, ecological or physiological selection pressures that have led Mesoplodon to develop these unusual signals. Doppler tolerance, ranging accuracy and range resolution are all factors that are likely to have played a role in the evolution of biosonar in FM bats and we consider here the likely impact of each of these on Mesoplodon.

Doppler shift is a relevant issue for bats due to their high closing speeds on targets that can themselves move rapidly relative to the speed of sound in air (Altes and Titlebaum, 1970). In comparison, the slow closing speed of Mesoplodon (ca. 1.5 m s-1, as evidenced by approaching sequences of echoes prior to buzzes) relative to the speed of sound in water, results in small (<100 Hz) Doppler shifts at the center frequencies of the FM and buzz clicks. Shifts so much smaller than the -3 dB frequencies of the corresponding ambiguity functions [2.5 kHz for FM clicks and 17 kHz for buzz clicks (sensu Au, 1993)] are unlikely to be detectable by the whale (see also Herman and Arbeit, 1972). In fact any signal with duration and center frequency similar to the Mesoplodon clicks would be Doppler tolerant in this environment and this factor cannot explain development of the FM click.
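The size of the Doppler shift can be checked with the standard two-way approximation, shift = 2vf/c, which holds when the closing speed v is much smaller than the sound speed c. The sketch below assumes a center frequency of 38.5 kHz for the FM click (the midpoint of the 26-51 kHz band given in the abstract, not a value stated in this section) and a nominal sound speed of 1500 m s-1.

```python
SOUND_SPEED_SEAWATER = 1500.0  # m/s, nominal value

def doppler_shift_hz(closing_speed_m_s, center_freq_hz,
                     sound_speed=SOUND_SPEED_SEAWATER):
    """Two-way (echo) Doppler shift, valid for closing speed << sound speed."""
    return 2.0 * closing_speed_m_s * center_freq_hz / sound_speed

# Closing speed of 1.5 m/s from the text; 38.5 kHz center frequency is assumed.
fm_shift_hz = doppler_shift_hz(1.5, 38.5e3)
```

With these inputs the shift is 77 Hz, roughly 30 times smaller than the 2.5 kHz figure quoted for the FM click ambiguity function.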

The range resolution required by an echolocating animal to discriminate clustered prey depends on the size of its prey. As shown in Fig. 7, the relatively long duration sounds produced by Eptesicus fuscus while approaching prey in the open (Surlykke and Moss, 2000) may occupy 0.7-3.0 m of air, 2-3 orders of magnitude larger than the size of their prey. With these signals, echoes from clustered prey could easily overlap in time, necessitating auditory processing that can recover range resolution by exploiting signal properties (for example by pulse compression). In comparison, the length of the Mesoplodon FM click in water, 0.4 m, is closer to the size of their prey. Based on the stomach contents of two Mesoplodon stranded in the Canary Islands, this species preys on small deep-water squid, crustaceans and fish (Santos et al., in press) with size range around 5-30 cm, although the sonar cross-section of deep-water fish and squid will depend on their orientation and could be much smaller than their length (Medwin and Clay, 1998). If Mesoplodon have the sophisticated signal-dependent auditory processing attributed by some authors to bats, their range resolution at high ENR could be close to 4 cm, i.e., the product of the Woodward time resolution (see footnote 3) and one half the speed of sound. Conversely, for an energy detecting receiver, the range resolution with moderate ENR might be about 10 cm, corresponding to one half of the emitted pulse length [e.g. with Tursiops truncatus (Murchison, 1980)]. Thus, despite their long duration, the FM clicks provide a range resolution comparable to the size of typical prey items without the need for pulse compression in the receiver.
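The signal lengths compared here follow from multiplying signal duration by the sound speed in the medium, as in Fig. 7. The sketch below reproduces that arithmetic under two assumptions that are not stated in the text: a sound speed in air of 343 m s-1, and the nominal 1500 m s-1 for seawater. The Woodward time resolution is back-calculated from the 4 cm figure rather than taken from the study.

```python
SOUND_SPEED_WATER = 1500.0  # m/s, nominal seawater value
SOUND_SPEED_AIR = 343.0     # m/s, assumed value for air at about 20 degrees C

def signal_length_m(duration_s, sound_speed):
    """Spatial extent of a signal of the given duration in the medium."""
    return duration_s * sound_speed

def range_resolution_m(time_resolution_s, sound_speed):
    """Range resolution for a given time resolution (two-way path)."""
    return time_resolution_s * sound_speed / 2.0

fm_length_m = signal_length_m(270e-6, SOUND_SPEED_WATER)    # about 0.4 m
buzz_length_m = signal_length_m(105e-6, SOUND_SPEED_WATER)  # about 0.16 m

# Bat call durations implied by lengths of 0.7-3.0 m in air (assumed speed).
bat_durations_s = [length / SOUND_SPEED_AIR for length in (0.7, 3.0)]

# Time resolution implied by a 4 cm range resolution in water.
implied_woodward_time_s = 2.0 * 0.04 / SOUND_SPEED_WATER  # about 53 microseconds
```

On these assumptions the bat signals last roughly 2-9 ms, one to two orders of magnitude longer than the 270 μs FM click, which is why overlap of echoes from clustered prey is a greater concern for the bat.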

A similar argument holds for ranging accuracy. While the Mesoplodon FM click combined with a matched-filter receiver could give a ranging error as low as 4 mm for an ENR of 10 dB [applying equation 10-11 from Au (Au, 1993)], there is no reason to suspect that such accuracy is required for a 4 m whale to home in on a 5-30 cm target. Again, the poorer performance of an energy detector should suffice. In practice, both ranging accuracy and the ability to discriminate clustered targets may be limited more by the ENR and the integration time of the receiver than by the characteristics of the outgoing signal.

The above outline suggests that Mesoplodon have fewer problems achieving Doppler tolerance, range resolution and ranging accuracy than do bats using their respective search signals in water and air. If an energy-detecting receiver can provide sufficient detection and localization performance for Mesoplodon as is the case for dolphins (Au, 1993), it would appear that the bandwidth of the FM click does not offer any advantage. Bandwidth, however, may play a crucial role in another aspect of echolocation. Recently, we demonstrated that a Mesoplodon ensonifies many more targets than it attempts to catch, and we proposed that the whales are selective foragers in a multi-species mesopelagic habitat maximizing the net energy return of foraging during long breath-hold dives. We also speculated that such selective foraging is likely based on identifying targets for predation by using prey-specific signatures in the returning echoes (Madsen et al., 2005b). Spectral and temporal cues have been shown to be important in classifying targets for both echolocating bats (Simmons and Chen, 1989; Schmidt, 1992) and dolphins (Au, 1993), albeit under controlled experimental conditions, and there is no reason to suppose that Mesoplodon would not also use this information. In this light, we propose that the FM clicks may represent a solution to the twin problems of (i) detecting prey of low target strength, requiring a high-energy signal, and (ii) discriminating between prey and non-prey in a cluttered multi-target habitat, requiring a broad bandwidth. With their 270 μs duration, FM clicks contain five times more energy than would a 50 μs dolphin click with the same peak pressure, resulting in a potential 7 dB increase in ENR, assuming that Mesoplodon have an auditory integration time similar to that of dolphins. However, such a long duration click with a constant carrier frequency (e.g. a long Phocoena click) would have an RMS bandwidth of only about 1.3 kHz, one fifth that of the FM click. Frequency modulation thus has the effect of preserving bandwidth in a long duration click. This seems to represent a different strategy than that adopted by dolphins where both wide bandwidth and high energy are achieved by producing short transients with high peak pressure. Au proposed that Phocoena clicks may be longer than those of delphinids to compensate for a speculated physiological limit in peak pressure (Au, 1993). It remains to be seen if Mesoplodon have a limited peak pressure for sound production, leading to the development of the observed FM click.
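The energy advantage of the longer click can be verified with a simple calculation. The sketch below assumes, as a simplification, that the two clicks have the same peak pressure and similar envelope shape, so that energy scales in proportion to duration; the function name is illustrative.

```python
import math

def energy_gain_db(duration_long_s, duration_short_s):
    """Energy ratio in dB for two signals of equal amplitude.

    Assumption: energy scales linearly with duration (same envelope shape).
    """
    return 10.0 * math.log10(duration_long_s / duration_short_s)

# 270 microsecond FM click versus a 50 microsecond dolphin click.
gain_db = energy_gain_db(270e-6, 50e-6)
```

The duration ratio of 5.4 gives a gain of about 7.3 dB, consistent with the rounded figures of five times and 7 dB quoted above.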

Buzz clicks

Buzz clicks are both shorter than FM clicks (105 μs as compared to 270 μs) and are apparently produced at a level some 15 dB lower than FM clicks (Madsen et al., 2005b) (present study). The energy of buzz clicks may then be about 1/100 (i.e. -20 dB) that of FM clicks. Reduced output is presumably acceptable given the low transmission loss to the close (mean distance of 3.6 m) target at the start of the buzz. Reduced output may even be advantageous, as it results in fewer unwanted echoes during the critical moment of prey interception. The short buzz clicks may also decay more rapidly than FM clicks, perhaps facilitating the detection of echoes from very close targets. Given the short duration of buzz clicks, an energy detecting receiver will provide at least twice the ranging accuracy for buzz click echoes as it will for echoes from FM clicks with the same ENR. However, the reduced output level of buzz clicks, and thus lower ENR of echoes, may in fact lead to a reduction in ranging accuracy at the start of the buzz, as compared to the preceding FM clicks, until the target is approached more closely. It is unknown whether this effect is mitigated by a narrower beam pattern in the case of buzz clicks or is compensated by averaging returns from successive clicks within the auditory processing.
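The estimate of relative energy combines the level difference with the difference in duration. As a rough sketch, assuming that the 15 dB figure describes RMS level and that energy is proportional to mean-square pressure multiplied by duration (the function name is illustrative):

```python
import math

def relative_energy_db(level_difference_db, duration_s, reference_duration_s):
    """Energy of a click relative to a reference click, in dB.

    Assumption: energy is proportional to mean-square pressure times duration.
    """
    return level_difference_db + 10.0 * math.log10(duration_s / reference_duration_s)

# Buzz click: 15 dB lower in level and 105 versus 270 microseconds long.
buzz_energy_db = relative_energy_db(-15.0, 105e-6, 270e-6)
buzz_energy_ratio = 10.0 ** (buzz_energy_db / 10.0)
```

The shorter duration contributes about -4 dB, giving a total of about -19 dB, or an energy ratio close to 1/80, which is of the same order as the figure of about 1/100 given above.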

The high repetition rate of buzz clicks means that the whale receives 300 or more potential updates on the target during the last 3 m of approach. The production rate of clicks within the few buzzes we could analyze had an intriguingly stereotyped form (Fig. 5). The stereotypy may indicate that all 10 of the prey approaches examined were carried out at very similar closing speeds, and that the ICI during the buzzes tracked the two-way travel time (TWTT) to the prey, as found with bats and trained dolphins. However, such a repeatable capture strategy contrasts with the apparent lack of ICI coordination in FM clicks immediately prior to buzzes (Madsen et al., 2005b). The stereotypy of the ICI in buzzes may also stem from physical constraints in the sound production system or be dictated by requirements of the echo processing system. Clearly, much remains to be discovered about signal production capabilities, perception and motor patterns during echolocation-mediated foraging in toothed whales. Despite these uncertainties, we can conclude that the two distinctly different biosonar signals produced by Mesoplodon are likely specialized to the tasks of detection and classification (FM clicks), and capture (buzz clicks) of low target strength prey in the deep ocean.

Sound production

Investigated species of toothed whales generate clicks by actuating one or both sets of monkey-lip-dorsal-bursae (MLDB) complexes below the blowhole (Cranford et al., 1996). Mesoplodon have homologous structures (Heyning and Mead, 1990) and there is no reason to suspect that they would not generate sound in much the same way as do dolphins (Cranford et al., 1996). It is therefore fair to ask how both long duration FM clicks and short transient buzz clicks can be made by a sound production system that has not been observed to produce modulated clicks in other toothed whales. While it is possible that the FM click is the result of the combined action of both MLDBs, this would require synchronization of the two complexes at the level of a few μs, which seems improbable. Two other possible explanations are that FM and buzz clicks can be produced by either MLDB, or that each MLDB is dedicated to produce only one of the two click types with, likely, the larger right hand MLDB producing the FM click. Although the latter explanation would account for the apparent lack of clicks with intermediate characteristics between FM and buzz clicks, it is unknown how, or even if, an FM waveform could be produced by an MLDB, nor can we explain why other odontocetes with similar levels of asymmetry to the Mesoplodon do not produce FM clicks. We have also been unable to detect a consistent difference in arrival angle between the first buzz clicks in a buzz and the FM clicks that immediately precede the buzz, as would be expected if the two laterally separated MLDBs are the sources of different click types. However, the angle difference would be small (about 3°) and thus difficult to detect in the complex waveform, containing body-conducted and reflected signals, that is recorded by a tag attached behind the sound source. 
Nonetheless, if FM and buzz clicks are produced by different MLDBs, this may help explain how buzz clicks can be produced with higher center frequency and bandwidth than FM clicks, despite being some 15 dB lower in level. In several dolphin species, the bandwidth and center frequency of clicks are positively correlated with source level (Au et al., 1995) and it appears that these species do not or cannot produce low level, high frequency clicks.

When recorded away from the acoustic axis, FM clicks appear to comprise several closely separated pulses. Given the depth at which these clicks are produced, the short time delays between components in the off-axis clicks cannot be explained by sea-surface reflections but are consistent with reflections from hard or air-filled structures within the head. In several clicks with more widely separated pulses, the individual pulses each appeared to have an FM form, reinforcing the notion that the FM click is generated by an MLDB and then reverberates within the head to produce the observed off-axis waveform. Curiously, this effect was not seen in off-axis buzz clicks, although the lower level of these clicks and their resonant characteristic may mask multiple arrivals.


This study has demonstrated that echolocating Blainville's beaked whales produce two distinct signal types that are intimately linked to different phases of detecting and catching prey with biosonar. Adaptation of sonar signals to different echolocation tasks during foraging is well documented for bats, but has not been demonstrated previously in toothed whales. The unusual search signals are long-duration FM pulses that carry more energy for the same peak pressure than would a conventional click while maintaining a high bandwidth. Buzz clicks, in comparison, are lower-amplitude, shorter-duration transients with high bandwidth. We propose that the FM signature of Mesoplodon search clicks has evolved to enhance the detection and classification of prey with low target strength while the short, low-energy, broad-band buzz clicks are adapted to provide higher target resolution and clutter reduction during prey capture. Despite the similarity between the FM search clicks and the cries from FM bats, the shorter duration of Mesoplodon clicks, coupled with larger prey size and faster sound propagation in water, suggest that these whales can achieve sufficient range resolution without the complex auditory processing attributed by some authors to bats. At a practical level, the unusual properties of FM search clicks may facilitate passive acoustic detection of Mesoplodon as a mitigation measure to reduce the impact of anthropogenic sound on this species.

List of abbreviations
ADC  analog-to-digital converter
CF  constant frequency
DTAG  digital acoustic recording tag
ENR  echo-to-noise ratio
FM  frequency modulated
ICI  inter-click interval
NBHF  narrow-band high-frequency
RHIB  rigid-hulled inflatable boat
SL  source level
SNR  signal-to-noise ratio
TWTT  two-way travel time


Thanks to the field team in El Hierro: A. Bocconcelli, C. Aparicio, F. Díaz, I. Domínguez, M. Gamito, M. Guerra, A. Hernandez y A. Padrón. Thanks also to A. Brito, C. Militello, F. Rosa at the University of La Laguna, and to T. Hurst and K. A. Shorter at WHOI. We thank B. Møhl, K. Beedholm, M. Wahlberg, D. Mountain, J. Simmons and A. Surlykke for helpful discussions and/or constructive critique on earlier versions of the manuscript. Fig. 7 was prepared by J. Cook. Fieldwork was funded by the National Oceanographic Partnership Program (NOPP), Strategic Environmental Research and Development Program (SERDP) under program CS-1188, the Packard Foundation, the Canary Island Government and the Spanish Ministry of Defence. Field support was provided by the Government of El Hierro. P.T.M. is currently funded by a Steno Fellowship from the National Danish Research Council. Tagging was performed under a permit from the government of the Canary Islands granted to N. Aguilar de Soto. The Institutional Animal Care and Use committee at the Woods Hole Oceanographic Institution approved this research.


  • 1 This formula assumes a plane-wave front at the hydrophones, which is a reasonable approximation given the close hydrophone separation (0.025 m) and the distance to the sound source (about 1 m in the case of the tagged whale's sound source and considerably further for other whales).

  • 2 A shorter window and narrower analysis band are needed to analyze the lower SNR echoes from buzz clicks.

  • 3 The Woodward time resolution measures the spread of the autocorrelation function of the transmitted signal and so provides an indication of the time resolution possible in high ENR when using a matched-filter receiver (Woodward, 1953).

