Appendix A. Additional details on the computational implementation of data pre-processing, the signal processing framework, error calculation, and the assessment of filter performance; discussion of biological insights gained from evaluation of the model structure; figures depicting both methods and results as referenced in the main document; and two tables containing empirical results of filter performance across individuals and models.
Computational implementation of data pre-processing and filtering
Additional description of input data processing
A simple lookup table provided the distances from rivers, roads and park boundaries at any point as a single scalar value. For prediction purposes we used only the elephant's absolute distance from any of these features. Elevation was used only to exclude regions effectively inaccessible to elephants, based on previous work demonstrating the study elephants avoid high angle slopes (Wall et al. 2006). Specifically, we modified our NDVI map to exclude data from all regions with greater than 20° slope, assuming the availability of resources and distance to other landscape features in these regions had no effect on elephant behavior.
Assuming elephant do not have ubiquitous knowledge of ecological conditions in their environs, we only included NDVI values within a 6 kilometer radius of an elephant’s current position in our model. In order to remove data gaps (caused by missing data due to cloud cover or excluded in relation to slope) and limit processing requirements, these data were averaged in nine sub-sampled sectors defined as the area within a 1 km radius of the elephant’s position (i.e. the local conditions) and 8 equal radial sectors out to the 6 km radius boundary (i.e. the conditions in possible movement directions; Fig. A2). Local vegetation scores compiled as the NDVI average values in the nine sectors were then included in the model as a nine dimensional vector.
Signal processing framework extended
In this section we explain in more detail the filtering prediction approach used. This is an application of a general technique for predicting the future behavior of an unknown stationary stochastic process from the statistical information in the signal. The basic theory for the discrete filter approach was laid down simultaneously by famous mathematicians Norbert Wiener and Andrey Kolmorgov in the 1940s. A more detailed introduction to Weiner filtering and its traditional applications can be found in most advanced texts on signal processing, Hayes 1996 provides an excellent treatment with the same notational conventions we employ. Let d(t) be the real movement behavior. Let s(t) be the signal input we can measure. Define the linear estimate of the values time f in the future, of d(t+f) as the estimate d*(t). We are looking for an optimal linear filter that will compute this value:
Here L[-] denotes the linear estimator. By convention, the estimated quantity (the stochastic variable d) is listed before the | symbol and the given quantities and constraints for the estimation (the stochastic variable s at all previous times) are listed after. If we insist that the filter is shift invariant, (depends only on the time of the observation s relative to the time at which we want to predict, rather than having explicit time dependence), the linear estimator can be written as a convolution:
The task is to choose the function w(t) such that the expected (squared) difference between the prediction we make about the future, d*(t), and the actual movement characteristics in the future d(t+f), is minimized. We emphasize expected difference since both d and d* are random variables.
Let e(t) = d*(t)– d(t+f) be our error at time t. We want to minimize the mean squared error, E[e(t)2] (where ), the expectation value of the random variable x defined by the probability density function p(x)). This is achieved when the error, e(t) is orthogonal to the observations s(t). A w(t) for which the error is still correlated with the input signals suggests that some of that error signal actually contains information we could be using in our prediction. The function w(t) which satisfies this orthogonality criteria is called a Weiner filter. Mathematically we can write the orthogonality condition as,
E[e(t)s(t)] = 0 (A.3)
substituting in for e(t)
E[(d*(t) - d(t + f))s(t)] = 0 (A.4)
and substituting Eq. A.2 into this
The expectation operator is linear and we can distribute it over the arguments and change the order with the integration, resulting in
E[s(t - u)s(t)] is just the auto correlation in s, rs(t - u), and E[d(t + f)s(t)] is the cross correlation rds(t + f). So we can rewrite Eq. A.6 as:
Equations of this form are called Weiner-Hopf equations. rds and rs can be computed from the statistics of the measurements s, and the movements, d, and therefore Eq. A.7 can be solved for the optimal w. The solution approach for this continuous time predictive Weiner filter can be found in many standard texts on statistical signal processing, (eg. Hayes 1996).
Since we have real data sampled at discrete intervals the convolutions can be written as sums. If we restrict the filter to only using signals from time t - p until now, (i.e. p samples ago rather than from – infinity ago), these become finite sums and can be written as matrix products.
Solving the discrete Wiener-Hopf equations (A.8) for w defines the optimal linear filter we are seeking. There are several ways this can be done, for computational efficiency we rephrase Eq. A.8 as a matrix equation which can then be solved by matrix inversion.
The derivation also readily generalizes to multidimensional signal inputs, s[n] and multidimensional data d[n], where the components of d[n] (e.g. x-centroid of movement, angle of movement) are indexed by j and the components of the signal (e.g. vegetation, distance from water, previous movement) are indexed by i.
To solve the discrete, multidimensional Weiner-Hopf equations in (A.9), we rewrite this equation in matrix form. Eq. A.9 is the same as Eq. 1 in the main text. Choosing p = 5, for reasons elaborated below, and N = 17 (using all available signals) gave the best results.
Let W* be a pN × 4 with 4 columns for each of the 4 movement characteristics j, defined as:
for columns j = 1:4. Note the time dependence of W[n] has been re-arranged into the explicit elements of the larger matrix W*, whose components are not time dependent variables. Let Rs be a pN × pN matrix, which is an N × N array of p × p sub matrices for the autocorrelations and cross correlations of the signals evaluated for separations of up to p. Finally, let Rds be a pN × 4 matrix where the columns again correspond to each of the movement characteristics, j, as for W*, and each p entries correspond to the discretely sampled cross correlation of characteristic j with input signal i = 1:N. Each of the j columns is therefore:
for columns j = 1:4. Then the multidimensional sum in equation (A.9) can be readily written as the matrix product, RsW* = Rds, from which W* is readily solved by inverting Rs and multiplying through by that result:
W* = Rs-1Rds (A.12)
From which the coefficients of W[k] can be read out following Eq. A.10.
Error calculation details extended
Filter performance was assessed through the calculation of the mean squared error (MSE) of the position of predicted movements relative to actual movements. Because signal power varied over time, MSE was normalized by the root power of the signal (i.e. standard deviation). As a result of these normalization processes, MSE units are not directly translatable to a straight metric of spatial accuracy (i.e. the difference in meters of predicted from actual movements). Therefore, comparisons represent relative differences in performance. In order to provide a conceptual understanding of filter performance, simplified representation of predictive movement error is presented in the form of the actual spatial error (km) of the CoM. In order to do this, CoM error is calculated independently from the other movement parameters and presented alongside the normalized error for the overall filter in Table 1. Comparison of representative predictions and actual movements that correspond to the median MSE are shown in Fig. 1.
Additional information regarding assessment of filter performance
Comparison within and across elephants was restricted in relation to absolute time, season, and/or location. Restrictions in relation to time were straightforward. Restrictions in relation to season or NDVI were conducted simply by requiring the NDVI value for the starting point of the elephant whose movement we aimed to predict with a different filter was within 20% of the relative range of the NDVI value for the starting point of the dataset from which the filter was constructed (e.g. if the NDVI data is normalized from 0 to 1, the difference in the NDVI values could be no more than 0.2). For the location-based restriction, the distance between the two starting data points had to be within 20% of the total latitude and longitudinal range visited by any of the 13 elephants in our data set. Differences in data sets across the elephants limited the application of filters across elephants at the same time, with samples sizes of a 'same time different elephant' analysis varying widely across pairs with multiple pairs lacking any comparison. Therefore, results of such an analysis were not presented. In addition, certain similarity constraints between the datasets for a given pair of elephants had little or no overlap. If the amount of overlapping data was less than 25% of the median number of overlapping data sets for that set of criteria, the results were excluded as not statistically robust.
To test for possible relationships between local landscape features and locations where filter predictions are generally poor, we divided the study region into 100 × 100 meter grid sections. Within each section, we computed the number times an elephant was present and a motion prediction was made, and the number of times the prediction had significant error. The number of errors was normalized by the number of total predictions made in each section, and recorded as the error density. Results were plotted in a linear colormap where red indicates the highest error rate observed, blue the lowest, and black indicates no data. The choice of the cutoff for ‘significant error’ was defined as MSE of 1.5, a level which exceeds the performance of a random walk but nonetheless shows considerable deviation from the average prediction using all environmental inputs. Different error cutoffs affected the data density and statistical confidences in the resulting error map, rather than the spatial pattern of errors (see Fig. A8 for error map with error as greater than MSE 2.7).
Insights from Model Structure
Our use of the Wiener filter illustrates how scales of analysis can be bridged (a focus of some research, e.g. Fryxell et al. 2008) by taking data that has already been analyzed in the context of diurnal autocorrelation patterns recognized to dominate variance in hourly movements (Wittemyer et al 2007) and compressing it into a daily representation. This was accomplished not simply as a single average location, but through the device of a bounding ellipse characterized by four parameters. As such, the computational burden of using the raw data itself was reduced, but information about the directionality and dispersion of the daily movement was retained. Similarly, our reduction of NDVI information using an inter-spoke and wheel approach also reduced the computational burden of our approach. These preprocessing steps thereby allowed us to focus on larger scale movement properties and provided comprehensive assessment of the performance of our models in several different contexts (e.g. across time and space, and among individuals).
The temporal efficiency of filter fitting and prediction was compared across input and predictive data scales of four orders of magnitude (from 10s tens to 10000s of hours; Fig. A3). Optimal performance occurred at around 6 weeks (500 hours for preprocessing and 500 hours for prediction). Filter fits at shorter time intervals did not perform well, while non-stationarity in longer-term datasets degraded longer period performance. Short-term shifts in foraging/land-use strategies potentially drive inaccurate estimates of correlation functions at this temporal resolution, as occurred when an elephant went in and out of the protected areas (Fig. 4). Similarly, non-stationary movement behavior at longer time scales (2+ months) reduced predictability, probably in relation to ecological changes in the biannual rainy season system (dry to wet transitions are graded over 3 month periods) and resulting range shifts (Wittemyer 2008). As such, this approach provided reasonably good insights regarding the correlates of movement at monthly time scales, but the implementation of this method to predict and understand covariates driving movement across seasonal or annual transitions appears limited.
FIG. A1. GPS data were preprocessed to reduce the dimensionality of the movement input data in the Weiner filter model. All GPS points recorded during each 24 hour period were simplified into a four parameter characterization of the daily movement consisting of the x and y coordinates of the center of mass, and the length and angle of the long axis of the minimum ellipse capturing all data in the 24 hour periods.
FIG. A2. (A) Normalized Differential Vegetation Data (NDVI) were preprocessed to reduce the dimensionality of the vegetation dynamic input data in the Weiner filter model. (B) For each 24 hour moving window, NDVI within a 6 km radius of the center of mass was characterized as a nine parameter vector, g. (C) Each component of g represents the average value within a 1 km radius centroid and 8 equal sized sectors radiating from the centroid to the bounding circle perimeter.
FIG. A3. Filter performance varied in relation to the amount of data input in the model. Optimal performance for the filter containing all signal inputs occurred using 1000 hours of data. The optimal amount of data varied slightly across specific input signals.
FIG. A4. Relative contributions, ri, of past movement history, NDVI, and static features to model predictions for Anastasia. Computed as:
Note on a short time scale movement predictions are frequently dominated by one or two signals, and the others hold little predictive information. A similar degree of variation in predictive values is seen in the data for the other elephants.
FIG. A5. The influence of dynamic vegetation productivity (as recorded by remotely sensed NDVI) was a key signal input in the Wiener filter. A histogram for each of the 13 elephants and all in combination (last panel) shows the relative frequency elephant movements remained in the same sector (left column) or moved to areas with differing relative NDVI values (second column to left is the sector with the highest NDVI progressing to the sector with the lowest NDVI to the right). The 8 sectors and NDVI ranking scheme are indicated in the inset, bottom right. The relationship between elephant movements and NDVI was dynamic, and showed marked differences across individuals. Some individuals demonstrated directed movements to higher NDVI while others show no in relation to NDVI.
FIG. A6. Predictive performance of filters applied randomly or with time, season, and spatial constraints rather than applied to consecutive 500 hour period. Filter performance was greater when applied to the elephant from which it originated, with performance decreasing by ~30% when applied to a different elephant. Constraining the filter such that it was applied to data collected within 3 months (Time > 3), at relatively similar locations (Nearby: within 0.2 of normalized distance values) and under similar seasonal conditions (Season: > 0.2 on normalized NDVI scale) increased performance.
FIG. A7. Spatial error map with cut off equal to MSE of 2.7, a threshold value approaching the average performance of the null model. Results are qualitatively similar to those shown in the main body indicating the location and density of error was not a function of the threshold chosen in this analysis.
FIG. A8. Spatial error map of each individual elephant for which the combination is presented in Fig. 4. Different individuals demonstrates stark differences in ranging patterns and, in relation, the location of predictive error. Individuals which were found almost exclusively within protected areas typically demonstrate the greatest error within the PA boundaries (e.g. Aztec and Maua), though Goya shows the opposite relationship. Most other individuals demonstrate higher error outside PA boundaries, the specific location of which appeared to relate to the overlap between human activity and their home range.
TABLE A1. The (normalized mean square error (MSE) of movement predictions across the 13 elephants were qualitatively similar. While elephants were tracked for different durations, tracking data covered different years and season, and movements were focused on different areas in the ecosystem, the best movement prediction (lowest MSE) was derived from the combination of all inputs regardless of elephant. Predictions using any signal input or combination of signal inputs exceeded those from the null model by an order of magnitude.
TABLE A2. Numerical values corresponding to Fig. A6, showing the predictive performance of filters applied randomly or with time, season, and spatial constraints rather than applied to consecutive 500 hour period. Here ‘self’ indicates that only filters made with data from elephant E are were used in predicting motion of elephant E.
|Constraints||Self Median (IQR)||Across Elephant Median (IQR)|
|No restrictions||2.78 (1.52, 5.14)||3.53 (2.15, 6.69)|
|Time > 3||2.98 (1.51, 5.42)||3.69 (2.19, 6.69)|
|Time < 3||1.82 (1.41, 6.12)||3.22 (1.50, 5.18)|
|Season||2.26 (1.03, 5.42)||2.81 (1.86, 6.00)|
|Nearby (< 20%)||1.56 (1.03, 2.99)||2.67 (1.42, 3.50)|
|Season, Position||1.39 (0.92, 2.02)||2.28 (1.32, 3.18)|
|Time > 3, Season, Position||1.42 (0.92, 3.04)||2.23 (1.33, 3.01)|
|Time < 3, Season||1.80 (1.03, 5.39)||3.09 (1.43, 5.07)|
|Time < 3, Position||1.57 (0.80, 3.42)||2.35 (1.27, 3.26)|
|Time < 3, Season, Position||1.54 (0.71, 3.88)||1.69 (1.08, 2.42)|
|Null Model||3.51 (2.98, 5.71)|
Hayes, M. H. 1996. Statistical digital signal processing and modeling. Wiley, New York.
Fryxell, J. M., M. Hazell, L. Borger, B. D. Dalziel, D. T. Haydon, J. M. Morales, T. McIntosh, and R. C. Rosatte. 2008. Multiple movement modes by large herbivores at multiple spatiotemporal scales. Proceedings of the National Academy of Sciences of the United States of America 105:19114–19119.
Stoffer, D.S. and Shumway, R. H. Time Series Analysis and Its Applications. Springer, 2000.
Wall, J., Douglas-Hamilton, I., and Vollrath, F. Elephants avoid costly mountaineering. Current Biology, 16:R527–R529, 2006.
Wittemyer, G., W. M. Getz, F. Vollrath, and I. Douglas-Hamilton. 2007. Social dominance, seasonal movements, and spatial segregation in African elephants: a contribution to conservation behavior. Behavioral Ecology and Sociobiology 61:1919–1931.
Wittemyer, G., L. Polansky, I. Douglas-Hamilton, and W. M. Getz. 2008. Disentangling the effects of forage, social rank, and risk on movement autocorrelation of elephants using Fourier and wavelet analyses. Proceedings of the National Academy of Sciences of the United States of America 105:19108–19113.