Unmanned aerial systems (UASs), frequently referred to as ‘drones’, have become more common and affordable and are a promising tool for collecting data on free-ranging wild animals. We used a Phantom-2 UAS equipped with a gimbal-mounted camera to estimate position, velocity and acceleration of a subject on the ground moving through a grid of GPS surveyed ground control points (area ∼1200 m²). We validated the accuracy of the system against a dual frequency survey grade GPS system attached to the subject. When compared with GPS survey data, the estimations of position, velocity and acceleration had a root mean square error of 0.13 m, 0.11 m s−1 and 2.31 m s−2, respectively. The system can be used to collect locomotion and localisation data on multiple free-ranging animals simultaneously. It does not require specialist skills to operate, is easily transported to field locations, and is rapidly and easily deployed. It is therefore a useful addition to the range of methods available for field data collection on free-ranging animal locomotion.
Many studies require data on the location of individual animals or groups of animals, including studies of habitat use, animal biomechanics and intra- and inter-species interaction. A number of methods are available for studying locomotion and localisation, ranging from fixed camera methods to wildlife tracking collars. Here we evaluate whether an unmanned aerial system (UAS) could be used to collect locomotion data comparable in accuracy to those from a state-of-the-art GPS-IMU (inertial measurement unit) collar.
Optical measurements are used for localisation in a number of fields, and include particle image velocimetry methods (Bomphrey, 2012; Hubel et al., 2009), passive marker stereophotogrammetric systems such as Qualisys, and multi-camera stereoscopic reconstruction techniques (Theriault et al., 2014; Hedrick, 2008). In each method, the volume within which the measurements are made must be carefully calibrated and cameras must remain in fixed positions (Hedrick, 2008). In studies of locomotion, multiple time-separated positions can be differentiated to give velocity and again to give acceleration. Optical methods are therefore increasingly used in a laboratory setting using video equipment and multi-camera stereoscopic reconstruction techniques, enabled by a new generation of small, low-cost, high frame rate cameras. Although precise camera positioning and calibration of the experimental volume is feasible for controlled research and sport, these laboratory-based methods are limited to captive animals. They are much less suitable for data collection in a field setting because of the variable environment and unpredictable nature of wild animal movements. There are practical difficulties with getting terrestrial cameras quickly into fixed positions, achieving a clear field of view and using passive marker systems in bright sunlight. These considerations limit the flexibility of fixed camera methods in free-ranging locomotion studies.
The relatively small number of studies that have collected locomotion data from free-running wild animals have largely relied upon the attachment of loggers and/or collars containing GPS and inertial sensors to the animals (Wilson et al., 2013). A typical low-cost GPS receiver calculates standalone horizontal position to a quoted accuracy of 2.5 m [see https://www.u-blox.com/sites/default/files/products/documents/LEA-6_DataSheet_(GPS.G6-HW-09004).pdf], though this is degraded under high dynamics (often encountered during animal locomotion) and attenuation/multipath effects that occur with tree cover. The high power requirements of GPS receivers also typically limit the duty cycle to maintain a useful battery life. This gives lower effective navigation update rates that may be insufficient for locomotion studies where detailed knowledge of path or distance travelled is required. Other issues may make collars unsuitable for some locomotion studies. Collar fitting commonly involves field anaesthesia, which carries a risk to the animal; the weight of the collar must be low enough [5% body weight is the generally accepted limit (Aldridge and Brigham, 1988), although efforts to reduce to <2% are common] not to alter the natural behaviour and locomotion of the subject. Where interactions such as hunting are of interest, there is also reliance on chance that interactions are observed between collared animals; this is more likely for predators but less so for prey. A method of collecting comparable data without the need for attached devices would therefore offer many advantages and increase the opportunities to capture data from a wider range of subjects.
UASs, also known as ‘drones’, have advanced rapidly in recent years. They have become increasingly affordable, with sophisticated and reliable control systems as well as high-end optics and gimbals as standard. The wide availability and low cost of the systems have enabled application of UAS technology to many data collection situations. There are many online examples of UAS video footage, which demonstrate the ability of such platforms to capture unconstrained locomotion. Their use in biology, geosciences and wildlife research has increased (Vermeulen et al., 2013; Le Maho et al., 2014; Rodriguez et al., 2012; Sarda-Palomera et al., 2012), but has been largely limited to ecological or behavioural studies. Fixed camera techniques are, however, not feasible with a UAS platform. The UAS does not remain stationary during flight because of wind perturbations, and despite platform and gimbal stabilisation that correct for position and attitude changes, there is inevitably some movement of the camera relative to the ground as a result of sensor error and response latency.
- GCP: ground control point
- IMU: inertial measurement unit
- IQR: interquartile range
- RMS: root mean square
- UAS: unmanned aerial system
- UAV: unmanned aerial vehicle
Calibration of the field of view is required to extract meaningful locomotion measurements from aerial video. Here we propose methods of calibration for each frame using fixed markers on the ground (visible in the video footage at all times) to compensate for changes in UAS platform displacement and orientation relative to those markers. We then evaluate the accuracy of these methods in position, velocity and acceleration extraction using video from a typical UAS in a simulated tracking scenario. We also assess the best calibration method and the optimum number of fixed markers.
MATERIALS AND METHODS
The aerial platform used was a Phantom 2 (DJI, Shenzhen, China) quadcopter, fitted with a Tarot 2D gimbal (Tarot, Zhejiang, China) and GoPro Hero 3 Black Edition (GoPro Inc., San Mateo, CA, USA) recording at 720p resolution and 60 (59.947) frames s−1 in narrow mode. A 5.8 GHz 25 mW video transmitter and a DJI iOSD mini (DJI, Shenzhen, China) were added to facilitate framing of the video and enhance pilot control. A custom GPS logger with GPS module (u-blox LEA-6T, u-blox, Thalwil, Switzerland) equipped with an AP.25E.07.0054A active patch antenna (Taoglas, Co. Wexford, Ireland) was added externally to record L1 pseudorange, carrier phase and Doppler measurements at 5 Hz.
Twenty-five white circular dinner plates, 0.27 m in diameter, were used as ground control point (GCP) markers. Any fixed marker easily visible from the air, including natural features, could, however, be used. The GCPs were laid in an evenly spaced five-by-five grid of approximately 40×30 m. All GCP positions were then determined with a GPS survey system.
The UAS was flown manually and always within line of sight. Video was recorded with the gimbal oriented to the nadir and with the UAS at an altitude of 25–30 m; position and altitude were judged and corrected by the pilot both by direct visualisation and using the first person view (FPV) screen. The field of view of the GoPro in narrow mode is 64.6 deg horizontally, 49.1 deg vertically and 79.7 deg in the diagonal (GoPro, https://gopro.com/support/articles/hero3-field-of-view-fov-information); when recording at 720p there are 1280×720 pixels (16:9) at 60 frames s−1 (GoPro, https://gopro.com/support/articles/hero3-faqs). At 30 m altitude looking vertically down, this equates to a ground coverage of 37.9 m horizontally and 27.4 m vertically (1040 m² footprint), which corresponds to 0.04 m per pixel vertically (relative to camera axis) and 0.03 m per pixel horizontally. Once the UAS was in position, a human subject fitted with GPS survey equipment walked paths across the grid. Twenty-eight trials were recorded, a trial being one random path through the GCP grid.
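The footprint figures quoted above follow from simple trigonometry. The sketch below reproduces them under two assumptions that are ours rather than the paper's: an ideal pinhole camera pointing exactly at the nadir, and flat ground. The function name `ground_footprint` is illustrative only.

```python
import math


def ground_footprint(altitude_m, hfov_deg, vfov_deg, width_px, height_px):
    """Ground coverage and per-pixel ground size for a nadir-pointing camera.

    Assumes an ideal pinhole camera over flat ground (no lens distortion).
    """
    # Each side of the footprint is 2 * altitude * tan(half the field of view).
    width_m = 2.0 * altitude_m * math.tan(math.radians(hfov_deg) / 2.0)
    height_m = 2.0 * altitude_m * math.tan(math.radians(vfov_deg) / 2.0)
    return {
        "width_m": width_m,
        "height_m": height_m,
        "area_m2": width_m * height_m,
        "m_per_px_x": width_m / width_px,
        "m_per_px_y": height_m / height_px,
    }


# GoPro Hero 3 narrow mode at 720p, UAS at 30 m altitude (values from the text).
fp = ground_footprint(30.0, 64.6, 49.1, 1280, 720)
for name, value in fp.items():
    print(f"{name}: {value:.3f}")
```

Because coverage scales linearly with altitude, flying at 25 m rather than 30 m shrinks each side of the footprint by one-sixth and improves the per-pixel ground size by the same factor.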
The survey system consisted of a NovAtel FlexPak-G2L containing an OEM4 dual-frequency GPS receiver (NovAtel, Calgary, Canada). Pseudorange, carrier phase and Doppler L1/L2 GPS measurements were recorded at 20 Hz to an Anticyclone Systems AntiLog RS232 Serial Data Logger (Anticyclone Systems Ltd, Godalming, Surrey, UK). For the GCP survey, the FlexPak was connected to an L1/L2 NovAtel GPS-702-GG antenna mounted on a Seco 2 m GPS Rover Rod surveying pole (SECO, Redding, CA, USA). GPS time stamps were recorded on a trigger press when the survey pole was vertical (determined by spirit level) so that position at that time (GCP position) could be determined later. While attached to the human subject, the system used an L1/L2 head-mounted 2775 ‘puck’ antenna (AeroAntenna Technology, Chatsworth, CA, USA). Ephemeris and reference 1 Hz L1/L2 GPS pseudorange, carrier phase and Doppler were recorded using a NovAtel SPAN-SE with an OEMV3 board installed. This was connected to another GPS-702-GG antenna mounted stationary on a rooftop less than 1000 m from the experimental area. Survey data were post-processed to determine 20 Hz position and velocity using GrafNav v.8.1 (NovAtel) using the ARTK algorithm for integer ambiguity resolution. This gave a horizontal position accuracy of 0.02 m root mean square (RMS) relative to the base station. Data from the UAS on-board GPS logger were likewise processed against the reference to determine 5 Hz position with horizontal accuracy of 0.1 m RMS.
Video footage was synchronised to survey GPS with an audio time stamp (see Appendix). The video data were digitised using the DLTdv5 data viewer (Hedrick, 2008) in MATLAB (MathWorks, Natick, MA, USA). Pixel coordinates for the head of the subject and the GCPs, in each frame of the video, were determined using the auto-tracking feature (extended Kalman predictor). A lens distortion model was applied to the data to remove the effect of lens distortion, using GoPro distortion coefficients (personal communication, Ty Hedrick, Hedrick Lab, The University of North Carolina; see Table S1). As the camera view was not fixed, each frame required individual calibration to convert position in the camera frame to the GPS frame. Three transformation types were trialled for calibration: affine, projective and second-order polynomial (MATLAB CP2TFORM function). To assess the effect of GCP count and position, the number of points was also varied from the transformation minimum to 25 (the full grid). Ten random combinations of each number of GCPs were then selected; combinations could repeat (for 25 GCPs, every combination is necessarily the same). For each combination, the least squares transformations necessary to give GCP positions in GPS coordinates from pixel coordinates were calculated, and these transformations were applied to the pixel coordinates of the subject. This was repeated for each frame to give the GPS coordinates of the subject, as well as derived velocity and acceleration, throughout each trial.
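The per-frame calibration step can be illustrated with a least-squares projective fit. The sketch below is a NumPy stand-in written for illustration and is not the MATLAB CP2TFORM routine used in the study; the function names `fit_projective` and `apply_projective`, the synthetic transform `H_true` and the GCP pixel positions are all hypothetical. It assumes lens distortion has already been removed from the pixel coordinates.

```python
import numpy as np


def fit_projective(pixel_xy, ground_xy):
    """Least-squares projective transform (3x3 matrix) from pixel to ground coordinates."""
    pixel_xy = np.asarray(pixel_xy, dtype=float)
    ground_xy = np.asarray(ground_xy, dtype=float)
    if len(pixel_xy) < 4:
        raise ValueError("a projective transformation needs at least four GCPs")
    rows, rhs = [], []
    for (x, y), (e, n) in zip(pixel_xy, ground_xy):
        # Two linear equations per GCP, with the bottom-right matrix element fixed at 1.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -x * e, -y * e])
        rhs.append(e)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -x * n, -y * n])
        rhs.append(n)
    h, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)


def apply_projective(H, pixel_xy):
    """Map pixel coordinates to ground coordinates with a 3x3 projective matrix."""
    pts = np.asarray(pixel_xy, dtype=float).reshape(-1, 2)
    homog = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


# Synthetic single-frame example: nine GCPs generated from a made-up transform.
H_true = np.array([[0.03, 0.001, -19.0],
                   [0.0005, -0.03, 11.0],
                   [1e-6, 2e-6, 1.0]])
gcp_px = [(x, y) for x in (100.0, 640.0, 1180.0) for y in (60.0, 360.0, 660.0)]
gcp_ground = apply_projective(H_true, gcp_px)

H_frame = fit_projective(gcp_px, gcp_ground)                 # calibrate this frame
subject_ground = apply_projective(H_frame, [(700.0, 300.0)])  # locate the subject
print(subject_ground)
```

In use, the fit would be repeated for every video frame, because the camera moves relative to the ground between frames. With more than four GCPs the system is overdetermined, so digitisation noise in any one marker is averaged out rather than passed straight into the transform.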
Position, velocity and acceleration of the subject were compared with those from the GPS survey, with RMS errors calculated over each trial. A one-way ANOVA was used to statistically test the differences between transformation types. The transformations used assume that the GCPs lie on a local-level plane; the subject must be on the same plane to correctly determine their horizontal position. This may be approximately true for the feet of the subject, but these are not always easily visible, so the head was chosen as the reference point here. As shown in Fig. 1, the projection of the head position onto the local-level plane will be offset from the true position. This will result in an error, which we will refer to here as ‘oblique projection error’, which increases with head height and unmanned aerial vehicle (UAV) observation angle from the vertical. To mitigate this, we also propose a simple correction, applied in both north and east directions (see Eqn 1). The corrected position was calculated using the projected subject position, measured subject height and the UAV external GPS logger position data.
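The geometry of the oblique projection correction can be sketched from similar triangles. The code below is our own derivation from the description above, not a transcription of the authors' equation; it assumes the UAV height and subject height are both measured from the GCP plane, and the function name `correct_oblique_projection` is hypothetical.

```python
def correct_oblique_projection(p_proj, p_uav, h_sub, h_uav):
    """Shift a projected head position back towards the point beneath the UAV.

    p_proj, p_uav: (east, north) positions in metres.
    h_sub, h_uav: subject and UAV heights above the GCP plane in metres.
    Assumption: the ray from the camera through the head meets the plane at
    p_proj, so by similar triangles the true position lies a fraction
    h_sub / h_uav of the way from p_proj back towards p_uav.
    """
    if not 0.0 <= h_sub < h_uav:
        raise ValueError("subject height must be non-negative and below the UAV")
    scale = h_sub / h_uav
    return tuple(pp - scale * (pp - pu) for pp, pu in zip(p_proj, p_uav))


# Illustrative numbers: UAV 30 m above the plane, 1.8 m tall subject standing
# 10 m east and 5 m north of the point beneath the UAV.
p_true = (10.0, 5.0)
p_uav = (0.0, 0.0)
h_sub, h_uav = 1.8, 30.0

# Forward model: where the head appears once projected onto the plane.
stretch = h_uav / (h_uav - h_sub)
p_proj = tuple(pu + stretch * (pt - pu) for pt, pu in zip(p_true, p_uav))

p_corr = correct_oblique_projection(p_proj, p_uav, h_sub, h_uav)
print(p_proj, p_corr)
```

The correction is applied independently in the east and north directions and vanishes when the subject is directly beneath the UAV.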
Different numbers and combinations of GCPs were used to infer transformations to test the effect of number and position of GCPs on the accuracy of position, velocity and acceleration measurements and to identify an optimum number of GCPs for use in the field. The mathematical minimum numbers of GCPs to calculate transformations are three, four and six for affine, projective and second-order polynomial, respectively.
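Velocity and acceleration are derived from the per-frame positions by differentiating with respect to time. The text does not state the numerical scheme used, so the sketch below assumes simple central finite differences on evenly spaced samples; `central_difference` is a hypothetical helper and the trajectory is synthetic.

```python
def central_difference(values, dt):
    """First derivative of evenly sampled values.

    Uses central differences at interior samples and one-sided differences
    at the two end samples.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    deriv = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        deriv.append((values[hi] - values[lo]) / ((hi - lo) * dt))
    return deriv


# Synthetic example: constant 1.2 m s^-2 acceleration sampled at 60 frames per second.
dt = 1.0 / 60.0
east = [0.5 * 1.2 * (i * dt) ** 2 for i in range(10)]

velocity = central_difference(east, dt)          # m s^-1
acceleration = central_difference(velocity, dt)  # m s^-2
print(velocity[5], acceleration[5])
```

Each differentiation divides position noise by the short inter-frame interval, so digitisation noise of a few centimetres grows with every derivative. This is consistent with acceleration showing the largest error of the three quantities, and in practice some smoothing would probably be applied before differentiating.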
This work was approved by the Royal Veterinary College Ethics and Welfare Committee.
All trials were suitable for digitisation and implementation of the above method (Fig. 2A).
Using all 25 GCPs, a projective transformation and pre-transformation using lens distortion parameters, the mean RMS errors and interquartile ranges (IQRs) in horizontal position, velocity and acceleration for the 28 trials were 0.45 m (IQR 0.13 m), 0.18 m s−1 (IQR 0.03 m s−1) and 2.97 m s−2 (IQR 0.81 m s−2), respectively (Fig. 3). Adding oblique projection correction gave a significant improvement in accuracy (P<0.05 in all cases) to 0.13 m position (IQR 0.04 m), 0.11 m s−1 velocity (IQR 0.02 m s−1) and 2.31 m s−2 acceleration (IQR 0.38 m s−2) (Figs 2–4). This is a 71.1% reduction in mean RMS position error, a 38.9% reduction in mean RMS velocity error and a 22.2% reduction in mean RMS acceleration error. The grand means of the RMS for GCP residuals were 0.15 m (IQR 0.09 m), 0.11 m (IQR 0.07 m) and 0.10 m (IQR 0.08 m) for affine, projective and polynomial projections, respectively; however, this dropped to 0.05 m (IQR 0.04 m), 0.04 m (IQR 0.02 m) and 0.03 m (IQR 0.01 m), respectively, with pre-transformation using known lens distortion parameters. Using a Wilcoxon rank sum test, a significant difference was demonstrated in position and velocity error with and without lens distortion parameters applied (P<0.05); however, there was no difference in acceleration error. The mean velocity and acceleration across all 28 trials were 1.66 m s−1 and 1.51 m s−2, respectively. There was no correlation between the magnitude of error and magnitude of absolute velocity or acceleration.
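The percentage reductions quoted above can be checked directly from the reported mean RMS errors. The sketch below also shows one plausible way to compute a horizontal RMS error for a trial; the paper does not spell out the exact definition, so treating it as the RMS of the east-north distance between paired samples is our assumption, and both function names are hypothetical.

```python
import math


def rms_horizontal_error(estimates, references):
    """RMS of the horizontal distance between paired (east, north) samples."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(e[0] - r[0]) ** 2 + (e[1] - r[1]) ** 2
               for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


def percent_reduction(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (before - after) / before


# Mean RMS errors without and with oblique projection correction (from the text).
reductions = {
    "position": percent_reduction(0.45, 0.13),
    "velocity": percent_reduction(0.18, 0.11),
    "acceleration": percent_reduction(2.97, 2.31),
}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% reduction")
```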
In all cases, increasing the number of GCPs resulted in higher accuracy. The minimum number of GCPs depended upon the transformation used and ranged between three and six. Error fell substantially as GCPs were added above the minimum, but the effect dropped off rapidly: beyond about five GCPs above the minimum, further GCPs gave limited and diminishing benefit. The relationship between number of GCPs and accuracy is presented in Fig. 5. In general, the reduction in error from two GCPs above the minimum to 10 GCPs above the minimum was 0.5–10% of the error with all 25 GCPs (the lowest error). For instance, velocity error using an affine transformation was 2.4% greater than the 25-GCP value at two GCPs above the minimum and 0.4% greater at 10 GCPs above the minimum, a change of 2%.
This study has demonstrated that a UAS using GPS-surveyed GCPs can deliver position measurements and derived velocity and acceleration with an accuracy better than that specified for commercial GPS units. Commercial GPS receivers in standalone mode typically quote 1.5–4.8 m RMS horizontal position accuracy and this can be improved to 0.7 m (Witte and Wilson, 2004, 2005; also see http://www.novatel.com/assets/Documents/Papers/OEMStar.pdf) using a satellite-based augmentation system such as the Wide Area Augmentation System and the European Geostationary Navigation Overlay Service. Some receivers also have differential GPS capabilities giving accuracies of 0.5 m RMS horizontal position accuracy (see http://www.novatel.com/assets/Documents/Papers/OEMStar.pdf) relative to a base station. GPS precise point positioning (∼0.10 m accuracy) and carrier phase integer ambiguity methods (∼0.02 m accuracy) are more accurate than the data and method proposed in this paper, but they are susceptible to high dynamics, require continuous data and the modules are expensive and power hungry, and so have rarely been used on moving animals (Williams et al., 2009). The accuracy of the data collected in this study exceeded that from most systems, including GPS-IMU fusion, by 75% in position and 62% in velocity (0.17 m compared with 0.67 m horizontal position error, and 0.13 m s−1 compared with 0.34 m s−1 velocity error) (Wilson et al., 2013).
All three transformation types showed similar minimum errors and GCP residuals. Although the benefit to accuracy using a projective over an affine transformation is small here, we would expect significant performance benefits of a projective transformation with an obliquely oriented camera because it corrects for perspective convergence. Although there was no obvious difference using the second-order polynomial transformation here, there may be a benefit when using other lens modes that result in higher levels of distortion. In that situation, however, the best results would be obtained by pre-transformation of the data by application of a lens distortion model, as the number of GCPs needed to mitigate this would be impractical.
Although there was little difference in the minimum error achieved with different transformation types, there were differences in the number of GCPs needed to calculate transformations (minimum GCPs 3, 4 and 6 for affine, projective and second-order polynomial, respectively) and to achieve acceptable levels of error. At 8 GCPs, additional GCPs provided little further reduction in mean RMS horizontal position error for projective and affine transformation (3% and 2% difference, respectively, from their 25 GCP transformation value), whereas a second-order polynomial transformation reached a similar error level at 12 GCPs. Compared with the most accurate value achieved for RMS horizontal position in this study (25 GCP projective transformation, 0.17 m), at 8 GCPs a projective transformation is approximately 3% greater than this (see above). The same difference occurs at 13 GCPs for a second-order polynomial and does not occur at all for an affine transformation (5% difference at 25 GCPs). From these results we therefore recommend using eight or more GCPs and a projective transformation. Redundancy in GCPs is recommended for field use to ensure there are sufficient GCPs should one or more inadvertently leave the field of view and give more robustness where there is uneven terrain. If it is not possible to survey GCP positions with the accuracy achieved in these experiments, additional GCPs are also likely to give a transformation with a better fit and therefore improve measurement of position.
Ten random combinations of each number of GCPs were tested; this allowed some insight into the effect of the position of the GCPs on the accuracy of measurements. We would expect that if the position of the GCPs were to have a large effect on the accuracy of the measurements, then there would be a large variance in the RMS errors with different geometric configurations of GCPs for each trial. Furthermore, we would expect error to increase the further a point is from the GCP cluster (or the further a point is outside the bounds of the GCP grid). The data showed little variance in error over the 10 random samples (above 8 GCPs) and no evident positive correlation between error and distance from the GCP cluster. Although accuracy of the solution does not appear to be particularly sensitive to GCP formation, there must be some effect because extreme cases of collinear or tightly clustered GCPs would either not allow calculation of a transformation or at best result in large residuals. The GCPs tested here, although randomly selected, were laid in a large grid. We would therefore recommend spreading the GCPs out evenly throughout the field of view as much as is possible and/or practical.
The accuracy of the GPS survey of GCP positions is crucial to the results of this study because of its role in calculating the transformation from pixel to GPS coordinate systems. Degradation of the accuracy of derived measurements should be expected when using less accurate GPS survey equipment. GPS position of the UAV is used to correct ‘oblique projection’ and the accuracy of these GPS measurements affects overall system accuracy, though this influence is reduced as the lateral offset and subject height decrease or UAV altitude increases (enabled by a higher resolution camera).
There is an error contribution from digitisation (auto-tracking). When auto-tracking the GCP markers there is variation in the exact pixel point of digitisation within the bounds of the marker, giving potential error of ±0.13 m in the GCP centre point position; this also applies to the use of auto-tracking on subject position. Error in synchronisation was negligible in this case, although latency in generation of the synchronisation signal and writing of the GPS time stamp has not been taken into account.
This study was undertaken on a relatively flat area of terrain. The method assumes that the subject is on the same local-level plane as the GCPs with UAV height measured relative to that plane. Vertical displacement of a subject due to terrain variation increases oblique perspective error in proportion to the offset of the UAV, i.e. zero error when it is directly overhead. For example, the position error associated with a 10 m ground level change, assuming zero subject height with the UAV at 40 m altitude and 10 m offset, would be 2 m in the axis of offset if the subject is 10 m below the plane and 3.3 m if 10 m above (see Appendix). This error could be mitigated with a ground surface model, using a large enough grid of GCPs or a 3D surface model generated by the UAS.
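The worked example above can be reproduced with a short calculation. The sketch assumes the similar-triangle geometry implied by the text, with all heights measured relative to the GCP plane; the function name `oblique_projection_offset` is hypothetical.

```python
def oblique_projection_offset(lateral_offset_m, h_uav_m, h_terr_m=0.0, h_sub_m=0.0):
    """Apparent horizontal shift of a point that is not on the GCP plane.

    lateral_offset_m: horizontal distance from the point beneath the UAV to the subject.
    h_uav_m: UAV height above the GCP plane.
    h_terr_m: terrain height at the subject relative to the plane (negative if below).
    h_sub_m: height of the tracked point above the local terrain.
    A positive result means the point appears further from the UAV than it really is.
    """
    height = h_terr_m + h_sub_m
    if height >= h_uav_m:
        raise ValueError("tracked point must be below the UAV")
    return lateral_offset_m * height / (h_uav_m - height)


# UAV at 40 m altitude and 10 m lateral offset, zero subject height (values from the text).
below = oblique_projection_offset(10.0, 40.0, h_terr_m=-10.0)
above = oblique_projection_offset(10.0, 40.0, h_terr_m=10.0)
print(f"subject 10 m below the plane: {abs(below):.1f} m error")
print(f"subject 10 m above the plane: {abs(above):.1f} m error")
```

The asymmetry arises because a subject above the plane is closer to the camera, so the same height difference subtends a larger angle.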
It is important that a lens distortion model is used and that pre-transformation of the data using the model is undertaken. In this study there was a 27% improvement in position error with the application of a lens distortion model, which is evidently a marked source of error if not taken into account.
The main limitation of the method is that it requires fixed GCPs (natural or placed) in the field of view prior to collecting video footage of the animal. Calibration can take place after collecting video footage; however, the GCPs must already be within this footage. The use of natural features as GCPs would make this method more flexible, as they could be identified from the aerial footage and then calibrated afterwards rather than being placed manually beforehand. There may be increased error in the GCP survey, as natural features, unlike placed markers, may not have an obvious reference point that can be accurately identified from video and during the subsequent survey.
UAV endurance is limited, with few being able to stay airborne for more than 15–20 min. Nevertheless, this is adequate to capture locomotor events such as hunting, startle responses and short periods of normal locomotion. Other potential drawbacks of such platforms include: small camera payload capacity (limited to GoPro or similar small high-resolution cameras), with typically wide-angled fish-eye lenses causing severe lens distortion, and the fact that camera gimbals do not typically contain encoders to give orientation relative to the platform. Care must be taken to avoid the UAS causing stress to the animals being tracked, which could affect behaviour (Vas et al., 2015).
One of the most exciting applications of this method is the ability to track herds or multiple individuals. This would enable the collection of data from both predator and prey species simultaneously, enabling investigations into predator–prey dynamics without the reliance on chance encounters between collared animals. The rapid deployment time and ability to calibrate the field of view after filming increases the chance of capturing hunting events. It also would permit the study of group or flock dynamics (King et al., 2012).
This paper presents a novel method of gathering position, velocity and acceleration measurements from freely locomoting wild animals. The performance outstrips that of consumer-grade GPS and matches the Royal Veterinary College's GPS-IMU collar measurements. However, it relies upon accurate GPS survey of markers in the field of view. Though there are limitations, this method is an exciting and low-cost solution for deriving locomotion data from multiple free-ranging animals simultaneously; it does not require specialist skills, and it is rapidly and easily deployed and transported to field locations. Therefore, it is a useful addition to the methods available for field data collection on free-ranging animal locomotion.
Thanks to Anna Wilson for manuscript corrections.
A custom ‘synchronisation box’ was designed which produced an audible beep from a small speaker and triggered a GPS time stamp in response to a button press. The speaker was positioned over the microphone of the GoPro Hero 3, resulting in obvious audio peaks from which the GPS time of each frame was interpolated. To achieve a recognisable audio signal, a series of seven beeps was generated at the beginning and end of the experiment to allow correction of linear clock drift and frame timing inaccuracy. Using a clapperboard in the camera frame revealed a consistent seven-frame offset between video and audio; this correction was also applied. Time synchronisation resolution was limited by the frame rate. The error in timing of the synchronisation beep will show a continuous uniform distribution. We can calculate the RMS of this distribution using the standard deviation of a continuous uniform distribution according to:
$$\mathrm{RMS} = \frac{\Delta T}{\sqrt{12}}, \qquad \text{(A1)}$$
where ΔT is the interval, in this case the inter-frame time interval. At a frame rate of 60 frames s−1, we expect an RMS error of at least 0.005 s (see Eqn A1). The mean velocity during this experiment was 1.6 m s−1, equating to a synchronisation error in position terms of 0.01 m.
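The synchronisation error estimate above can be checked numerically. The sketch uses the standard result that a uniform distribution of width ΔT has a standard deviation of ΔT/√12; the variable and function names are illustrative.

```python
import math


def uniform_rms(interval):
    """Standard deviation of a continuous uniform distribution of the given width."""
    return interval / math.sqrt(12.0)


frame_rate = 60.0   # frames per second
mean_speed = 1.6    # m s^-1, mean velocity during the experiment

timing_rms = uniform_rms(1.0 / frame_rate)  # seconds
position_rms = timing_rms * mean_speed      # metres

print(f"timing RMS error: {timing_rms:.4f} s")
print(f"equivalent position error: {position_rms:.4f} m")
```

At walking speed this contribution is an order of magnitude smaller than the reported 0.13 m position error, but it grows in proportion to subject speed, so it would matter more for fast-moving animals unless a higher frame rate were used.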
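The perceived change in subject position ΔPsub due to oblique projection error inclusive of terrain height is calculated as follows:
$$\Delta P_{\mathrm{sub}} = \left(P_{\mathrm{sub,true}} - P_{\mathrm{uav}}\right)\frac{h_{\mathrm{terr}} + h_{\mathrm{sub}}}{h_{\mathrm{uav}} - \left(h_{\mathrm{terr}} + h_{\mathrm{sub}}\right)}, \qquad \text{(A2)}$$
where Psub,true is the true subject horizontal position, Puav is UAV horizontal position, hterr is terrain height, hsub is subject height and huav is UAV height, with all heights measured relative to the GCP plane.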
The authors declare no competing or financial interests.
R.J.H., A.M.W. and T.Y.H. developed the concepts, R.J.H., T.Y.H., A.M.W. and K.R. developed the approach, C.B., K.R., R.J.H. and H.K.E. performed the experiments, R.J.H. analysed the data, and all authors prepared and/or edited the manuscript.
This work is part of the LOCATE project, funded by the European Research Council (AD-G 323041), and the Biotechnology and Biological Sciences Research Council funded Cheetah Project (BB/J018007/1).
Supplementary information available online at http://jeb.biologists.org/lookup/doi/10.1242/jeb.139022.supplemental
- Received February 12, 2016.
- Accepted June 20, 2016.
- © 2016. Published by The Company of Biologists Ltd