|
|
|
|||
| Home Help Feedback Subscriptions Archive Search Table of Contents | ||||
First published online August 18, 2005
Journal of Experimental Biology 208, 3349-3366 (2005)
Published by The Company of Biologists 2005
doi: 10.1242/jeb.01772
Evolving neural models of path integration
Centre for Computational Neuroscience and Robotics, School of Life Sciences, University of Sussex, Brighton, BN1 9QG, UK
* Author for correspondence (e-mail: robertvi{at}sussex.ac.uk)
Accepted 29 June 2005
| Summary |
|---|
|
|
|---|
Key words: path integration, Cataglyphis fortis, genetic algorithm, leaky integration
| Introduction |
|---|
|
|
|---|
The current paper addresses the possible neural mechanisms responsible for
PI and deals in a new way with the question of how the HV may be stored and
processed in the brain. We focus on the behaviour of Cataglyphis
fortis (Forel) ants, whose PI behaviour has been extensively studied
(Wehner, 2003
) and modelled
(e.g. Müller and Wehner,
1988
,
1994
). Neural models of C.
fortis PI were presented by Hartmann and Wehner
(1995
) and Wittmann and
Schwegler (1995
). Both are
based on different methods of encoding the HV chosen by the modeller, since no
neurophysiological data are available to suggest the actual mechanism
employed. We show that it is possible to produce neural PI models without
imposing a particular HV representation in advance, by using a genetic
algorithm (GA) to generate networks automatically. We also explore the
advantages of producing a complete model
(Chiel and Beer, 1997
) that
includes an explicit representation of the animal's behaviour in its
environment.
In common with many other species
(Maurer and Séguinot,
1995
), systematic PI navigation errors are observed in C.
fortis when it is forced to walk along L-shaped paths
(Müller and Wehner,
1988
). Ants leaving the L shape turn too much, and the degree of
inaccuracy in the homing direction is a function of the shape and size of the
L-shaped outward journey. Errors are not seen when the outward path is
straight, unless it is very long (Sommer
and Wehner, 2004
). Assuming the errors are not an artefact of the
experimental manipulation, there are several possible explanations, including
at least: (1) the ant has an accurate PI system but deliberately deviates from
the direct homeward path under certain conditions, (2) the system only
approximates accurate PI, and the degree of error depends on factors such as
the journey's size and shape and (3) the ant has a PI system that sometimes
operates in an accurate fashion and sometimes does not. In (2) and (3), the
properties of the ants' homing errors might be used to infer some information
about the PI mechanism, although (3) may give misleading results since the
mechanism is changeable. Explanation (1) cannot readily reveal anything about
the underlying PI mechanism but can still produce homing errors that vary
systematically with the journey shape.
One question is whether, as (1) and (3) above require, our current understanding of information processing in neurons suggests they are capable of performing accurate PI given the resources available in a C. fortis ant's brain. If they are not, then the errors show a basic limitation of the ant's brain, and (1) and (3) above are rendered unlikely. If we believe they are capable, we still have options (1), (2) and (3) available: (2) cannot be ruled out since selection may still only have produced an approximate system, provided it is sufficiently good.
Wittmann and Schwegler
(1995
) present a system that
can perform accurate PI using only simple and relatively few model neurons,
suggesting that the reason is not a limitation of neural processing. Kim and
Hallam's neural PI model supports this conclusion
(Kim and Hallam, 2000
). The
model employs a different mechanism but is also capable of accurate PI. The
WittmannSchwegler model can reproduce the ants' homing errors if the PI
process is assumed to start only after the ant has travelled a certain
distance from the nest, making it a type (3) model dependent on when PI begins
to operate.
Müller and Wehner
(1988
) suggest an algorithm
that performs an approximation of PI such that, after L-shaped journeys, the
ants' errors are reproduced, but under normal foraging conditions navigation
is suitably accurate, making it a type (2) model. This represents an
hypothesis that the foraging and PI systems have coevolved such that efficient
navigation is achieved without the need for exact PI. The mathematical
algorithm is argued to be more simple than an exact one. Hartmann and Wehner
(1995
) present a neural
network PI model that shares the property of only producing observable errors
after certain journey shapes. However, now that the system is built using
neurons, the model is no longer any simpler than an exact PI system
(Wittmann and Schwegler,
1995
).
Mittelstaedt's bicomponent PI model is a mathematical description of
accurate PI and, as such, does not generate systematic errors (Mittelstaedt,
1962
,
1985
). This feature has been
criticised (Hartmann and Wehner,
1995
), since the model cannot explain systematic errors or produce
any deviations from accurate navigation. This model has not previously been
implemented using model neurons. The results presented here show that a neural
implementation is possible and that the ants' systematic navigation errors are
then generated in a straightforward way using it. Our model has an integrator
time constant parameter that can be tuned to accurate or erroneous PI, making
it a type (3) model.
|
![]() | (1) |
is its compass heading,
(x,y) is its geocentric rectangular HV and t is time. In
polar geocentric form, we have:
![]() | (2) |
is its bearing from the
origin. Under a geocentric system, the HV expresses the animal's position
relative to a fixed reference point (e.g. its nest) that forms the origin of
the coordinate system. An egocentric system expresses the reference point's
coordinates relative to the animal's position and orientation, such that the
HV changes even if the animal is rotating on the spot (see
Fig. 1).
Homing
Many species use PI as a means of returning as rapidly as possible to their
nest or refuge. In an open environment, this naturally corresponds to a
straight path home from the current location. This can be formalised using an
equation describing the animal's rate and direction of rotation during homing
as a function of the HV. The form of this equation depends on the type of HV
used. Mittelstaedt's bicomponent model (Mittelstaedt,
1962
,
1985
), a geocentric
rectangular description, uses:
![]() | (3) |
We can plot x sin
and y cos
against
, holding (x,y) constant (see
Fig. 2A). d
/dt
will be positive where x sin
>y cos
and
negative where x sin
<y cos
, telling us
whether the animal will turn anticlockwise or clockwise, respectively, and
shows that there are two equilibrium values of
provided the HV is not
(0,0). One equilibrium is stable and points towards home, the other is
unstable and points directly away from home.
Fig. 2A also shows that the
animal will always turn to face home efficiently, i.e. through the smallest of
the two possible angles. The equivalent description using a geocentric polar
HV (see Fig. 2B), which is very
similar to the scheme used by Hartmann and Wehner
(1995
), is:
![]() | (4) |
|
The mathematical description of homing is dependent on the form of the HV;
similarly, the neural homing mechanisms will be dependent on the neural HV
representation. A parsimonious neural PI model should therefore give equal
weight to homing as to updating the HV. A feature of C. fortis PI is
that the HV continues to be updated during the homeward journey, allowing the
animal to detour around obstacles (Schmidt
et al., 1992
). Thus, HV updating and homing together constitute a
feedback process. A description of the behaviour of the animal is not properly
defined until both systems are in place. The present paper is concerned with
producing a complete description of this type. The task consists of an arena
devoid of landmarks (in keeping with the salt pan habitat of C.
fortis), such that PI is the only available strategy for homing. The
agent's (the model animal's) locomotion is restricted to the usual foraging
behaviour of the ant, i.e. walking only forwards, not sideways or backwards,
and employing the same class of sensory cues, namely an allothetic compass
(Wehner, 1994
) and idiothetic
odometer.
Existing neural models
We now briefly introduce two existing neural models of C. fortis
PI behaviour.
The HartmannWehner model
This model (Hartmann and Wehner,
1995
) uses neural chains to store and manipulate the HV. A neural
chain is a linear or circular chain of neurons whose pattern of firing
represents a linear or circular variable, respectively, in this case primarily
the distance and angular components of a geocentric polar HV. Each chain
neuron has two supporting neurons to allow the activity pattern to be
stabilised or modified. The distance chain (called the r-chain) works
like a thermometer. Each chain neuron can be either fully activated or fully
inactivated. Activity spreads from one end of the chain to the other as the
value increases, and recedes back again as it decreases. Only the position of
the boundary of the active region encodes information. The angular chain
(called the
-chain) works similarly, except that the chain is circular and
contains a group of adjacent active neurons. The number of active neurons is
fixed, but the activity can shift clockwise or anticlockwise around the circle
to represent changes to the stored value. The model has five more circular
chains. One, the
-chain, represents the ant's current compass heading.
The final four chains, called C+,
C, S+ and
S, are needed to calculate the value of
and apply approximations of the trigonometric functions
cosine and sine to them. Chains S+ and
S provide a homing signal. For further details, see
Hartmann and Wehner
(1995
).
The model reproduces the systematic navigation errors displayed by C.
fortis (Müller and Wehner,
1988
) by nature of the approximate PI mechanism it instantiates.
Nine free parameters of the model, controlling the relative phases and widths
of the activity regions on chains
and
, and thus the
approximation of cos and sin functions, are used to fit the model to the data.
It is also possible to make the model produce more accurate PI behaviour by
choosing different values for these parameters
(Wittmann and Schwegler,
1995
).
Since the model separates the polar HV into distance and angular
components, a potential discontinuity in
(the angular HV component)
arises at the HV zero point: if the animal were to pass over its estimate of
the home location, the angular component of the HV would have to jump by
180°. This might cause problems if the network were employed to model the
ant's systematic search behaviour, since PI must continue during searching,
which involves regularly returning to the HV zero point
(Wehner and Srinivasan,
1981
).
Chapman (1998
) reports
implementing and testing the model on a mobile robot, using approximately 900
simulated neurons. Flaws in the r-chain and homing mechanisms are
reported. After modifying the model to correct these flaws, the robot was able
to navigate successfully. This result shows that the basic mechanisms employed
by the model are valid and can accomplish PI outside of a simulation.
The WittmannSchwegler model
This model (Wittmann and Schwegler,
1995
) uses a sinusoidal array to represent its geocentric polar HV
as a phasor. The distance and angular HV components [called (r,
)] specify the amplitude and phase, respectively, of a sinusoidal wave,
which is represented spatially along a circular neuronal array. Using
N array neurons, the activity of neuron i is
k0[rcos(
+2
i/N)]+b0, where k0
and b0 are positive constants. The animal's compass
heading is assumed to be available as a symmetrical single-peaked activity
pattern on another circular array of neurons. Connections from the compass
array to the HV array, modulated by the animal's current speed, convert the
compass information into the same phasor representation as the HV. Simple
element-wise addition is then enough to update the HV correctly. Recurrent
connections within the HV array act as a memory and stabilise the shape of the
sine wave. Homing is achieved by comparison of the current compass and HV
arrays. For full details, see Wittmann and Schwegler
(1995
).
The ease of performing vector addition with a sinusoidal array is stated to
justify its usage in PI. Touretzky et al.
(1993
) implement a sinusoidal
array using a spiking neuron model. The array is a redundant encoding where
each unit of the array is made up of 100 spiking neurons, where a high
precision can be attained even if each neuron has relatively few
distinguishable firing rates and is subject to noise. However Touretzky et al.
state that the sinusoidal array has no advantage over a Cartesian encoding
unless its ability to perform vector rotation is used, which the
WittmannSchwegler model does not employ.
Like the HartmannWehner model, Wittmann and Schwegler fit their
model to the data from C. fortis
(Müller and Wehner,
1988
). The fit is visually as good as the former model's. Since
the network performs exact, error-free PI, an extra parameter and mechanism
were introduced specifically to produce errors. Whilst fitting to data using a
single parameter is arguably stronger evidence for a model than fitting with
nine parameters, the model does not generate navigation errors as a result of
its fundamental PI mechanism, as the HartmannWehner model does.
Evolving neural models
Both the HartmannWehner and WittmannSchwegler models are
built around neural representations of the HV that were chosen on the basis of
robustness. The HV representation cannot be based on experimental findings,
since no relevant data exist. The big differences between the models, and the
relatively large number of neurons used, suggest that many such PI networks of
similar competence could be constructed using various mechanisms.
We approach the problem of the neuronal mechanisms of PI here without making any a priori assumptions about the way the HV is encoded but rather use an automated network generation technique, the GA, to explore a set of possible networks that enable an agent to home using PI after an excursion. This procedure permits a less-biased exploration of possible neural mechanisms and also has other advantageous features for system design. We must fully and explicitly specify the sensory and locomotory capabilities of the agent, the information-processing capabilities of individual neurons and synapses, the sources of noise present and the behaviours we wish the agent to display before we can use the GA. The simulation code must explicitly generate the behaviour of each candidate PI model and agent in order to evaluate it. This reduces the chance of producing a flawed or fragile model. The GA is also able to prune the evolving networks to favour simplicity of neural structures. Finally, we can readily have the GA produce models under variations of the assumptions, such as the class of network model or the nature of the animal's internal compass representation.
The GA is not intended to closely model evolution by natural selection but it does capture one feature of PI systems, namely the potential chicken-and-egg problem of HV maintenance and HV-mediated homing. Unless both of these capabilities arise at the same time there is no selective advantage for either in isolation, at least in terms of PI navigation. A human-designed system need not suffer from this problem since both systems are constructed simultaneously and in their full complexity. The GA must develop and improve both capabilities at the same time for the system to operate, resulting potentially in a better mutual adjustment between them.
Evolved networks are often too complicated to understand using purely
analytical methods. Consequently, we may not be able to concisely state the
behavioural properties of an evolved PI system and may have to take a more
investigative approach, similar to that used in real-world biological systems
(Peck, 2004
;
Di Paolo et al., 2000
).
Neural mechanisms
The goldfish oculomotor neural integrator
(Major et al., 2004
) is a
mechanism that shares some properties with the HV in PI and is better
neurophysiologically characterised. These neural integrators perform temporal
integration of incoming velocity signals to track the position of the fish's
eyes. It is not yet known whether this is carried out at the single neuron
level (Shen, 1989
;
Loewenstein and Sompolinsky,
2003
) or using carefully tuned positive feedback within a
recurrent neural circuit (Seung,
1996
). Whatever the neural basis, functionally the integrator can
display leaky, stable or unstable behaviours
(Major et al., 2004
). This
suggests employing neural models capable of readily evolving similar behaviour
(at least as a subset of their behaviour) in our PI investigations. Recent
experimental findings (Egorov et al.,
2002
) show that single entorhinal cortex neurons are capable of
performing temporal integration of their inputs, expressed as persistent
graded changes in firing rate, at least over the course of tens of seconds.
Here, we use a standard leaky integrator firing rate model, the continuous
time recurrent neural network (CTRNN; Beer,
1990
; Dayan and Abbott,
2001
; Koch, 1999
).
Unlike traditional connectionist models, each neuron works in continuous time
and responds relatively faster or slower depending on its time constant
parameter.
Since CTRNN networks cannot easily perform multiplication, we also use an
augmented version, the modified CTRNN (ModCTRNN), introduced here for the
first time, which allows multiplication via changes in synaptic
strengths. Multiplication may be an important mechanism for PI. Both the
HartmannWehner and WittmannSchwegler models used multiplicative
or gating synapses to adapt PI to variations in the animal's speed. In his
bicomponent model of PI, Mittelstaedt
(1962
,
1985
) used multiplication to
generate homing behaviour. Multiplication has been conjectured
(Koch, 1999
;
Salinas and Abbott, 1996
) and
observed (Hatsopoulos et al.,
1995
; Gabbiani et al.,
1999
; Anzai et al.,
1999
; Andersen et al.,
1985
) to be carried out by many neural mechanisms. We investigate
whether the ModCTRNN model results in a significant decrease in overall
network complexity or an increase in performance over the CTRNN model.
| Materials and methods |
|---|
|
|
|---|
The path integration task
For each trial, the agent started at the nest with a random orientation and
was presented with a series of between one and three visual beacons that it
was required to visit (defined as approaching to a distance of 0.01 or less)
by phototaxis. Each beacon was immediately removed when the agent reached it,
and the next one activated. Beacons were placed by selecting a random distance
from the nest in the range [0.5,1.0] and random angle from the nest in the
range [0,2
] radians, except the last beacon's angle, which was selected
using a stratified random scheme that divided [0,2
] into 10 equal-sized
blocks, one for each trial, thus ensuring a more even coverage of final HVs
per evaluation. After the last beacon was removed, the agent's orientation was
randomised (to prevent it homing by simply turning through
radians for a
one-beacon trial) and it was required to return to the nest (again defined as
approaching within 0.01 distance units) using only its compass sensors, speed
sensor and internal state, i.e. the nest could not be directly detected by the
agent. During a given generation, all 30 agents were presented with the same
10 sets of beacon locations and given the same initial orientations at the
nest and were subjected to identical noise. This made fitness comparisons less
subject to noise.
Each experiment was initially performed with the agent's speed fixed at maximum, so that a solution could evolve without requiring the speed sensor. If the GA produced a satisfactory solution, the experiment was repeated with the agent's speed no longer restricted to maximum. The agent was now also held captive (stationary) for a randomly selected time drawn from [0,0.5] upon reaching the last beacon before being allowed to attempt homing. If this task was solved, the experiment was continued with an increase in the level of noise on the motor outputs (see below), so forcing the agent to take full account of the speed sensor in order to perform accurate PI.
Calculating fitness
The fitness assigned for each trial is calculated by one of four equations
depending on how many phases of the task were completed by the agent. The
overall fitness is always between zero and one, and the fitness for reaching
each new phase is equivalent to achieving optimal fitness in all previous
phases. Firstly, if the agent failed to reach the first beacon within twice
the time it would have taken to visit all of the trial's beacons by a direct
course at maximum speed the trial was ended and the fitness was
0.25/(1+
B), where
B is the time integral of
the agent's Euclidean distance from the first beacon over the trial (a measure
of the average distance from the beacon). Secondly, if the agent reached the
first beacon but failed to reach all subsequent beacons within the above time
limit, the trial was ended and the fitness was
0.25+0.25xvisited/total. These two criteria simply
facilitate the evolution of phototaxis. Thirdly, if the agent visited all
beacons but failed to reach the nest within three times the minimum time
required from the location of the last beacon at full speed, the trial ended
and the fitness was 0.5+0.25/(1+
N), where
N
is the integral of the Euclidean distance of the agent from the nest starting
from the moment it arrived at the last beacon until the end of the trial. If
the agent returned to the nest, its fitness was
0.75+0.25/(1+tN), where tN is the time
taken to complete the entire trial; thus, the fittest possible agent would
visit all beacons and then return to the nest using straight, direct paths for
all legs of the journey and travel at full speed. Owing to sensor noise,
cumulative navigation errors would arise, meaning that an ideal agent would
also search efficiently for the nest after reaching the HV zero point until it
found the nest or until the trial timed out.
The agent
The agent was modelled as having a position
(xA,yA) on an unbounded
two-dimensional plane, with orientation
A radians measured
positive anticlockwise from the x-axis (or `east'; all angles are
defined positive anticlockwise from the x-axis or sometimes from the
agent's body axis). The origin (0,0) of the plane was defined as the nest
location. One distance unit was defined as the maximum distance the agent was
ever required to travel from the nest. Maximum speed was also unity; the agent
therefore took at least one time unit to reach maximum distance from the nest.
The agent had two beacon sensors (for phototaxis), two (sometimes more)
compass sensors, one speed sensor, one food sensor, two rotation motors and
one forward motor. The agent's motion was restricted to rotation and forward
translation. No backward or sideways movement was allowed, in keeping with the
usual behaviour of foraging ants. The agent was assumed to have low inertia;
motor output therefore specified forward speed and rotation directly, rather
than supplying forces. Speed was controlled by the forward motor neuron firing
rate (firing rate is defined in the next section), F, and rotation by
the two opposing left and right rotation motor neuron firing rates,
RL and RR:
![]() | (5) |
The agent's body, consisting of sensors, motors, neurons and links, was constrained to bilateral symmetry. Aside from single-copy components such as the speed sensor and forward motor neuron, all components were created as symmetrical pairs, including all other neurons, sensors and links.
Compass sensor neurons had an activation function, f, which responded
maximally to a particular agent orientation,
A=a,
and declined in a symmetrical way either side of this value. Each pair of
compass sensor neurons had complementary values such that if one responded
maximally at
A=a, the other responded maximal to
A=a. This could be considered as
the response of light sensors to a distant light in the east. All activation
functions define the stimulusresponse properties of the sensory neuron
in terms of its firing rate (see next section). The function f was varied
between experiments and was mostly based on the cosine function (see
Fig. 3). As can be seen, some
functions have negative as well as positive firing rates. This unbiological
feature can be removed by replacing each compass sensor with two sensors, each
giving a complementary half-wave rectified output (see Discussion). Negative
values were used to simplify the analysis of the evolved networks. The agent
was either given one pair of compass sensors with maximum responses set to
A=±
/4 or had an evolvable number of sensor pairs
with evolvable maximum response parameters. The pair of beacon sensor neurons
was defined as having activation [cos(
B
/2)/2]+0.5 for the left sensor and [cos(
B+
/2)/2]+0.5
for the right, where
B is the angle to the current beacon
from the agent's body axis (the activation is therefore independent of
distance to the beacon). Additionally a `food' sensor neuron was used: its
activation was 0 before reaching the final beacon (where the agent might be
imagined to have collected an item of food to motivate its return to the nest)
and 1 thereafter (this sensor was ignored by most of the evolved networks).
The speed sensor neuron gave a proprioceptive measure of the agent's current
speed, including motor noise, linearly mapped to the range [0,1]. Noise was
applied to all initial neuron states (vt=0), sensors and
motors by adding a uniformly distributed random offset in the range
[
,
]. The offsets for sensors and motors where held constant
across multiple numerical integration steps (see Appendix 1); changes in value
were triggered using a Poisson process rate r (the average number of
events per unit time) to calculate the time of the next value change:
tchange=[ln(1x)/r]+t,
where x is a uniformly distributed random value drawn from the
interval [0,1], and t is the current time. This scheme makes noise
effectively independent of the numerical integration step size used. The
values of
and r could be set for each sensor/motor type;
=0.01 and r=20.0 were used initially for all sensors and motors;
for some experiments, this was increased part way through for the forward
motor neuron to
=0.7, r=2.0 and for the rotation motor neurons
to
=0.1, r=20.0 in order to impose large long-lasting variations
in the agent's velocity. Overall, the presence of noise in the simulation is
intended to promote the evolution of robust PI solutions and to cause small
cumulative navigation errors, as occur in natural PI.
|
Neural networks
The control network used was either a CTRNN (with fixed weights) or a
ModCTRNN (with modifiable weights, allowing multiplication). The overall
network architecture was always bilaterally symmetrical but was not otherwise
constrained. It was not prevented from forming recurrent connections or forced
to be fully connected. The GA fully determined which components were linked to
which other components. This had the advantage of allowing a form of
stochastic evolutionary pruning to remove redundant components where the GA
was left running with its addition operator disabled but deletion operator
enabled (see Appendix 1 for details).
The state equation used here for a CTRNN neuron is:
![]() | (6) |
i is a time
constant, vi is the neuron state (analogous to a membrane
potential), wj is the link weight and
zj is the activation of the sensor or neuron attached to
link j. For a neuron,
zj=1/{1+exp[(vj+bj)]},
where bj is a bias parameter. For a sensor,
zj is the current activation. The initial value of the
membrane potential, vi,t=0, was also encoded for each
neuron. z represents a dimensionless firing rate, i.e. a firing rate
divided by its maximum possible rate. Since the model does not generate
explicit spikes, it could alternatively be considered to represent the
dimensionless membrane potential of a non-spiking neuron. Synaptic weights,
w, in the CTRNN model are fixed, having no dynamics or plasticity of
any kind.
The ModCTRNN uses the same state equation for its neurons but also applies
an additional analogous leaky integration equation to its weights, changing
them from fixed parameters into potential variables. This is achieved by
allowing links to point to other links and input a signal that can modify the
target link's weight:
![]() | (7) |
is a time
constant, ß is a bias term, wj is a link weight and
zj is the firing rate of the neuron or sensor attached to
link j. All weights wi are initialised to
ßi, therefore any weights that receive no inputs remain
constant at this value throughout a trial. A link acting to modify the weight
of another link can itself be the target of modification, and so on, allowing
an arbitrary degree of higher-order weight changes to take place, limited only
by the number of links available to the GA. See Appendix 1 for the parameter
ranges allowed. The ModCTRNN equation is capable of producing effects similar
to synaptic depression or facilitation but should not be considered as a
detailed realistic model of synaptic plasticity.
Experiments
Experiment 1A
The agent was given one pair of compass sensors giving a cosine-shaped
response (see Fig. 3). The
CTRNN model was used. The agent always moved at its maximum speed. Once the
agent had evolved to solve the task reliably, the GA was run in pruning mode
until the size of the genotypes stabilised, indicating that most or all
network redundancies had been removed.
Experiment 1B
As experiment 1A except that the agent's speed was not fixed at maximum and
the agent was held stationary for a random interval at the final beacon.
Experiment 2A
As experiment 1A except that the ModCTRNN model was used.
Experiment 2B
As experiment 2A except that the agent's speed was not constant and the
agent was held captive for a time at the last beacon. Once the agent had
evolved a good fitness, a further perturbation was introduced by setting
=0.7, r=2.0 for the forward motor neuron (forcing large
long-lasting changes to the agent's speed via the noise offset
mechanism) and
=0.1 for the rotation motor neurons. Most of the Results
section is dedicated to an in-depth analysis of the network evolved in this
experiment.
Experiment 3A
As experiment 2A except that compass sensor responses were of the `linear
cosine' type (see Fig. 3) and
the number and maximum response direction of the compasses were allowed to
evolve (up to a maximum of 10 compass pairs).
Experiment 3B
As experiment 3A except that the agent's speed was variable, as in
experiment 1B.
Experiment 4
As experiment 3A except that the compass sensor response was the `head
direction cell' type (see Fig.
3), intended to model the response properties of head direction
cells in rats (Taube,
1997
).
Experiment 5
As experiment 3A except that the compass sensor response was the `positive
cosine' type (see Fig. 3).
Three replicates were performed for each experiment. Results are described for the most successful of the three runs.
| Results |
|---|
|
|
|---|
|
Experiment 1A: CTRNN constant-speed PI
The fittest agent after approximately 67 000 generations returned to the
nest within the time limit in 989 of 1000 test trials. The bilaterally
symmetrical network (see Fig.
4) contained 12x2 links (of which two pairs share the same
source and target and so are not visible in the figure) and 3x2
interneurons. The shape of the return journey was highly dependent on the
bearing to the nest from the last beacon and was generally not straight or
direct (see Fig. 5 for one
example) but usually consisted of two or more phases characterised by
different patterns of oscillation in four of the neurons (shown in white in
Fig. 4;
Fig. 6 shows the neural
dynamics), resulting in various looping and zigzagging behaviours. The
non-oscillatory neurons act as integrators of the compass sensor inputs. Their
output and the current compass sensor output feeds into the oscillatory group.
A plot of fitness against the angle to the beacon from the nest for 360 single
beacon trials (data not shown) showed a clear sinusoidal-shaped relationship,
caused by the agent taking longer to return to the nest from some regions than
others. This was clearly a suboptimal solution that did not home as C.
fortis does in a straight line. This network was therefore not analysed
in any further detail.
|
|
|
Experiment 1B: CTRNN variable-speed PI
This experiment failed to produce any successful agents after running three
independent populations for 35 000 generations. Seeding the GA with the
fittest genotype from experiment 1A also failed for three populations after 65
000 generations.
Experiment 2A: ModCTRNN constant-speed PI
Experiment 2A produced good results and, although evolved independently,
produced a very similar network to the 2B experiment. The 2A results are not
presented in any further detail here, since they do not add anything to the 2B
results.
The 2A experiment was also repeated using the same settings except the
maximum time constant values (
and
) allowed were 0.01, thus
preventing neurons and weights from acting individually as integrators, but
the GA failed to find any solutions.
|
|
To test for artefacts resulting from numerical integration, 500 trials were
performed using an integration step size of 0.001 (as during evolution) and
500 more with 0.0001; the agent reached the nest in 488 and 489 trials,
respectively, a non-significant difference (
2=0,
P>0.1; chi-squared test).
To test whether the ModCTRNN network was significantly better able to evolve solutions to this task than the CTRNN model, five more replicates of experiments 1B and five of 2B were performed for 60 000 generations each, evolving the ability to return home from a single randomly placed beacon with full motor noise applied. None of the 1B runs produced an agent that returned to the nest in any of 1000 test trials, but for all 2B runs (some of which were stopped early when it became clear PI had evolved) the agent returned home in greater than 50% of 1000 test trials. Taking a random course from the beacon would result in between 1.6 and 3.2 returns to the nest per 1000, therefore zero is worse and 500 much better than this random strategy. Five failures for the CTRNN and five successes for the ModCTRNN is a statistically significant difference (P<0.01; Fisher's exact test).
Following Mittelstaedt's bicomponent model
(Mittelstaedt, 1985
), this
experiment assumed a sinusoidal compass sensor response function. The two
compass sensors (CL and CR; later
abbreviated as CL/R) gave responses of
cos(
A+
/4) and cos(
A
/4),
respectively, where
A is the agent's heading anticlockwise
from the x axis. Taking
=
A+(
/4), we
obtain CL=cos
and
CR=sin
, exactly as Mittelstaedt used. It
is therefore sufficient to integrate these values, multiplied by the agent's
speed, to obtain an HV in geocentric rectangular form (see Eqn 1), and this is
exactly what the network does (see Fig.
8; Table 2). Links
wL3/R3 integrate CL/R, respectively,
and store the values contralaterally in the values of weights
wR2/L2, respectively. Since weights
wL3/R3 are themselves governed by the speed sensor,
via links wL4/R4, their value is the agent's
speed multiplied by the value of wL4/R4 and they act to
multiply the compass values by the current speed before they are integrated to
the HV.
|
Weight values wR2/L2 therefore constitute the HV, but they have a sufficient degree of leakiness over the agent's journey that significant homing errors would occur if nothing acted to compensate. Their time constants are 8.44, much less than the maximum allowed value during evolution of 1000. Compensation is achieved by normalising the values of wL4/R4 to approximately match the amount of leakage that wR2/L2 have undergone since the start of the journey. This is a simple, fixed-rate exponential decay function, visible in Fig. 9 in the values of wL4/R4 and wL3/R3, and is caused by the fixed output of the forward motor neuron via fixed weights wL5/R5 acting to reduce the magnitude of weights wL4/R4.
|
Once the last beacon is reached, the beacon sensors become inactive,
phototaxis ceases and the HV is free to control the agent's behaviour. The HV
causes the agent to turn towards home using an instantiation of, once again,
Mittelstaedt's bicomponent model
(Mittelstaedt, 1962
; Eqn 3),
where the two speed-weighted compass sensor integrals (i.e. the HV) are
multiplied by the current value of the contralateral compass sensor. These
values are fed into the two opposing rotation motors,
RL/R, by links wL2/R2. The agent's
rotation is the difference between these two motor outputs, completing the
instantiation of Mittelstaedt's homing model.
The initial phototactic stage of the agent's journey is achieved by
ipsilateral links wL1/R1, which cause positive phototaxis
(Braitenburg, 1984
). These
links feed into RL/R with higher weight values than
wL2/R2 and so suppress homing behaviour until the last
beacon disappears. Thus, homing begins immediately and automatically as soon
as the last beacon disappears, without any further trigger (the network
ignores the food sensor). When the home vector becomes very large, phototactic
suppression of homing may be incomplete and noise can cause positive
phototaxis to be temporarily lost before the beacon is reached. The agent
adopts a negative phototactic or sometimes homeward trajectory (see
Fig. 10) before returning to
positive phototaxis. This shows an interaction between the agent's two main
modes of behaviour, which, we speculate, could be selected for to allow the
agent to ignore beacons that were too far from its nest.
|
Simplified analytic model
A simplified analytic model (SAM) of the evolved network was used to
compare its behaviour with that of C. fortis (see Appendix 2 for full
details). All internal stateful dynamics of the network were removed except
for the leaky integration of the HV. The leakage normalisation mechanism was
removed (provisionally justified by assuming it approximates the behaviour of
an unnormalised but less leaky network). The SAM also assumes that the agent
travels in straight lines to visit all the beacons in turn before turning
immediately to adopt the homing direction determined by the homing mechanism
and the HV state. The SAM then produces a straight run to the point where the
HV is zero. As the integrator time constants are increased, the system
converges on an accurate PI system. For smaller values of the time constant,
the HV no longer defines a fixed point on a fixed geocentric coordinate
system. We can say rather that the agent has reached its estimate of home when
the HV has reached zero (the `HV zero point') and that the location of this
point is a function of the agent's outward journey shape and its speed during
both exploration and homing. Information stored in the HV for longer has
undergone more decay, giving rise to systematic errors returning from straight
and L-shaped journeys that closely resemble the systematic errors seen in
C. fortis ants once the SAM's integrator time constant,
, has
been fitted to the data.
L-shaped routes
Müller and Wehner
(1988
) conducted an extensive
series of experiments testing the homing behaviour of C. fortis after
L-shaped outward excursions. They showed that the error in the heading of the
ant's homeward run (i.e. its angular deviation from a direct homeward path)
varied systematically with features of the outward journey, specifically the
length of the first straight section, the angle turned moving from the first
to the second section and the length of the second straight section. The data
are reproduced in fig.
11ac from Hartmann and Wehner
(1995
) and were used to fit
the SAM's
time constant parameter to the behaviour of C.
fortis (see Fig. 11).
was fitted to the first of the three experiments, and the remaining
experiments were used to check the model's generalisation. The fit is visually
as good as that achieved by both the HartmannWehner and
WittmannSchwegler models. It should be borne in mind that the former
was designed from the outset to mimic the data it was fitted to and that the
latter was modified from its original form also specifically to fit the data.
The SAM, however, was derived from a network evolved under selection for
optimal PI and had all but one free parameter removed before fitting it to the
C. fortis data. This makes leaky integration a strong candidate for a
mechanistic explanation of the observed errors.
|
Long straight routes
Sommer and Wehner (2004
)
showed that, on straight journeys, C. fortis tends to underestimate
the length of the return leg as an increasing function of the outward journey
length. They fitted several models to this `error function' and concluded that
a leaky integrator function was one of the best fits (the other was a
logarithmic function). The leaky integrator equation fitted to the data was:
![]() | (8) |
and ß are free parameters. This is similar to the
model presented in Mittelstaedt and Glasauer
(1991
![]() | (9) |
Eqn 9 corresponds to the integrator state upon reaching the feeder or to
the assumption that the ant measures the outward journey using a leaky
integrator but turns the leak off for the return journey. It is known,
however, that in the species being modelled, PI continues during the return
journey, due to the behaviour of C. fortis after forced detours
(Schmidt et al., 1992
). In
keeping with this result, we assume that integrator leak is constant
throughout both legs of the journey, giving us a single parameter error
function of (see Appendix 3):
![]() | (10) |
This model can be obtained by reducing the SAM to the case of a straight
journey at constant speed and is similar to the neural model evolved by GA in
Vickerstaff (2003
). Once
leakage is assumed to be constant, we still have at least two options for a PI
system: continuous and discontinuous
(Collett and Collett, 2000
).
The continuous model leads to Eqn 10. In the discontinuous model, the
integrator state at the feeder is stored, then zeroed. The compass response
function is then inverted, and the ant navigates towards the point where its
integrator state is the same as the stored value. This leads to a prediction
of y=x (no systematic error) regardless of the value of the
integrator time constant. The shape of the SommerWehner data clearly
favours the continuous model over the discontinuous model under the leaky
integrator model presented here. Fig.
12 shows Eqns 9 and 10 fitted to the data, plotted alongside Eqn
8. The three models are visually indistinguishable. Once again, the model
presented here contains fewer parameters but fits equally well compared with
existing models that were explicitly designed [in the case of Sommer and
Wehner's model by a process of searching for the best model from a set of four
candidates (Sommer and Wehner,
2004
)] to fit the data. The
time constants arrived at for
the L-shaped (18.38) and straight journeys (193.6) are clearly very different
(see Fig. 12); consequently,
if we wish to use the same model to explain both data sets, we must propose
that the effective integrator time constant can vary according to some unknown
factor(s). This is in keeping with a type (3) model (see Introduction) but
also reduces the model's parsimony unless there is a natural reason for such a
variation to occur.
|
Search patterns
One notable feature of the homing equation (Eqn 3) is the behaviour
produced after reaching the HV zero point. If we assume that the animal
continues to move forward at a fixed speed at all times, it will pass directly
over its estimate of home, thereby switching from the stable to the unstable
equilibrium of the equation since it will now be facing directly away from the
zero point. Any slight perturbation will turn it back round onto a new stable
homeward trajectory, causing it to again cross over the home position and
begin another outward loop, and so on. The result is that it would continue to
loop around its current HV zero point. The exact behaviour produced would
depend on the amount of noise present and the animal's rate of turn but would
consist roughly of a series of equal-sized loops taking the animal away from
and back towards the HV zero point, very roughly as is seen in
Cataglyphis ants (Wehner and
Srinivasan, 1981
). Thus, the homing equation itself appears enough
to generate a crude search behaviour when combined with a fully specified
model of the animal's motion. Kim and Hallam
(2000
) have suggested that the
homing mechanism of PI might be enough to also explain searching
behaviour.
|
99%
of trials, the third fitness criterion, penalising higher average distances
from the nest during homing if the agent did not reach home before the end of
the trial, will not have been significant during the final stages of the
network's evolution, allowing broad search patterns to evolve if
necessary.
To compare the experiment 2B agent's search with that seen in C.
fortis, the average search density of 50 trials returning from a beacon
0.75 distance units east was plotted (Fig.
14), showing an approximately symmetrical bell shape. C.
fortis (Wehner, 1992
) and
other Cataglyphis species (Wehner
and Srinivasan, 1981
) show a similar search profile. Additionally,
the ant performs a broader search pattern the longer the preceding outward
journey was (Wehner, 1992
) and
therefore the greater its uncertainty about the home location. Also, during a
single search, the ant gradually increases the width of the pattern the longer
it has been searching (Wehner and
Srinivasan, 1981
). The 2B network effectively has a built-in timer
in the exponential decay of weights wL4 and
wR4, which influences the magnitude of its HV and which
appears to be the only mechanism that might influence the search pattern width
over time. In fact, the timer could roughly reproduce both features of the
ants' search, provided only the search width is inversely related to the
magnitude of the decaying weights, since lengthier journeys take longer to
complete on average. The ants' search pattern stays centred on the same
location during a search (Wehner and
Srinivasan, 1981
) but that of the agent may begin to drift around
due to cumulative HV errors, which must therefore also be accounted for when
characterising its search behaviour.
|
A plot was made of the average distance of the agent from the fictive nest position during the return and search phases of 200 journeys where the agent was returning from a single beacon 0.75 distance units away in a random direction, (Fig. 15). The nest had been removed and each trial the agent was allowed to search for three times as long as during evolution. The agent's HV error was also estimated during the 200 runs (1) by taking a running average of the agent's position over a time window long enough to contain multiple search loops and (2) by using the SAM to derive the approximate position where the agent's HV (weights wL2 and wR2) would be zero were it to move to that location in a straight line (see Appendix 2). Since these were in good agreement (data not shown), only the second estimate was used. On the plausible assumption that the search efficiency cannot be increased by deliberately introducing HV errors, only the agent's distance from its estimate of the nest location can be considered as evidence of an evolved search behaviour similar to the ants. This distance is obtained by subtracting the HV error vector from the agent's coordinates. The average distance from the nest, the HV error and the agent's distance from the HV zero point all increase during the search period, suggesting that both random and systematic increases in search width are at play (see Fig. 15).
|