Ecological Archives E082-008-S1

Michael Schaub, Roger Pradel, Lukas Jenni, and Jean-Dominique Lebreton. 2001. Migrating birds stop over longer than usually thought: an improved capture-recapture analysis. Ecology 82: 852-859.

Supplement

Software SODA for computation of parameter estimates and confidence intervals, user instructions and two example data files.

Editor's note: The authors have not provided source code for this software because of CNRS policies. All questions about this software must be directed to the authors.




Author
Michael Schaub
Swiss Ornithological Institute
CH-6204 Sempach
Switzerland
+41 414 629700 (ph.)
+41 414 629710 (fax)
michael.schaub@vogelwarte.ch
schaubm@orninst.ch


File List

soda.zip -- zipped files of the software SODA
example.zip -- the two example data files used in the paper


Description

User instructions for the program SODA: Estimation of stopover duration by bootstrapping.

Introduction
If the probabilities of arrival (immigration) of animals at the stopover site and their departure (emigration) from it are known, the mean stopover duration can be estimated with the formula given in the Appendix. However, because the covariance between the immigration and the emigration probabilities is neither zero nor known, it is not possible to calculate the precision of this estimate with an analytical formula. An alternative method to obtain this information is the bootstrap (Efron and Tibshirani 1993). It is a technique that is commonly used for the estimation of precision, and that is what this program does: it calculates mean stopover duration and its precision under given models for the immigration and the emigration processes.

In simple terms, the bootstrap works as follows. You have a data set containing y individual capture histories. From these, a new data set of y capture histories is produced by selecting individual capture histories from the original data set at random. There are no restrictions on how often a particular capture history can be selected (sampling with replacement), so some histories can be chosen several times. From this new data set, immigration, emigration and recapture probabilities are estimated under the specified model, and stopover duration is calculated from these probabilities. The result is called one resampling and gives one line in the output file (see below). This procedure is repeated as many times as desired. The mean and its precision can then be estimated from the resamplings. As the number of resamplings increases, the estimates of the mean and of its precision become more stable, and they change little beyond a certain number of resamplings.
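The resampling step described above can be sketched in Python as follows. This is only an illustration, not SODA's code: the function names are hypothetical, and the statistic passed in is a placeholder, because the actual estimation of immigration, emigration and recapture probabilities is done inside SODA and is not reproduced here.

```python
import numpy as np


def bootstrap_resample(histories, rng):
    """Draw as many capture histories as the data set holds, with replacement."""
    y = histories.shape[0]
    idx = rng.integers(0, y, size=y)  # row indices; repeats are allowed
    return histories[idx]


def run_bootstrap(histories, estimate, k, seed=0):
    """Apply `estimate` to k resampled data sets; one result per resampling."""
    rng = np.random.default_rng(seed)
    return [estimate(bootstrap_resample(histories, rng)) for _ in range(k)]


# Toy data: 4 birds, 3 capture occasions (made up for illustration).
histories = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]])

# Placeholder statistic (NOT the stopover estimator): captures per occasion.
results = run_bootstrap(histories, lambda h: h.sum(axis=0), k=5)
```

Each entry of `results` corresponds to one resampling, in the same way that each resampling gives one line in the SODA output file.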

These instructions explain how to install and use the program.

Finding appropriate models for the estimation of stopover duration
SODA does not perform model selection. This has to be done with programs that analyse capture-recapture data, such as SURGE or MARK (see http://canuck.dnr.cornell.edu/misc/cmr/). Thus, before you can estimate stopover duration, appropriate models for the immigration and the emigration processes need to be found.

Installation
The program was originally written in MATLAB but is now compiled. CNRS policies prohibit the distribution of the source code, so only the executable is provided. To install the software, download the WinZip archive file SODA.zip, unpack it, and copy the 10 *.dll files and the executable file soda.exe into a directory of your choice. The files included in this archive that are required for running SODA are:

soda.exe
Gui_sgl.dll
Hardcopy_sgl.dll
Hg_sgl.dll
Libmat.dll
Libmatlb.dll
Libmmfile.dll
Libmx.dll
Libut.dll
Sgl.dll
Uiw_sgl.dll

The data file
The data file contains the capture histories of all birds considered and is in the biomeco format. The first line of this file contains two numbers separated by at least one blank. These numbers are the dimensions of the capture history matrix: the first is the number of birds caught, the second the number of capture occasions. The next two lines each contain a $. The fourth line contains the capture history of the first bird, and each following line contains the capture history of one further bird. The capture occasions must be separated by at least one blank. Below is an example of a very short data file:

12 5
$
$
1 0 0 0 0
1 0 0 0 0
1 1 0 0 0
1 0 1 0 0
0 1 0 0 0
0 1 0 0 0
0 0 1 0 0
0 1 0 1 0
0 0 1 0 0
0 0 0 1 0
0 0 1 1 0
0 0 0 1 1

Such files can be created using any text-processing or spreadsheet program. Data files must be saved in ASCII (text) format (the file extension does not matter).
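To check a data file before loading it into SODA, the layout described above can be read with a few lines of Python. This is a minimal sketch based only on the format as described here (a dimension line, two $ lines, then one capture history per line); the function name parse_biomeco is hypothetical and is not part of SODA.

```python
def parse_biomeco(text):
    """Parse the data file layout described above into a list of histories."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_birds, n_occasions = (int(v) for v in lines[0].split())
    if lines[1] != "$" or lines[2] != "$":
        raise ValueError("lines 2 and 3 must each contain a single $")
    histories = [[int(v) for v in ln.split()] for ln in lines[3:]]
    if len(histories) != n_birds:
        raise ValueError("number of capture histories does not match line 1")
    if any(len(h) != n_occasions for h in histories):
        raise ValueError("a capture history has the wrong number of occasions")
    return histories


# The short example data file shown above.
example = """12 5
$
$
1 0 0 0 0
1 0 0 0 0
1 1 0 0 0
1 0 1 0 0
0 1 0 0 0
0 1 0 0 0
0 0 1 0 0
0 1 0 1 0
0 0 1 0 0
0 0 0 1 0
0 0 1 1 0
0 0 0 1 1
"""
histories = parse_biomeco(example)
```

The two checks against the first line catch the most likely editing mistakes: a bird added or removed without updating the count, or a history with a missing occasion.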

The model file
We implemented two predefined models in the program: {f (t), p(t)} (the ‘Cormack model’) and {f (t), p}, or {g (t), p(t)} and {g (t), p}, respectively. If you want to run other models, you have to create model files first.

If you know how to build constraining files in SURGE (‘var’-files) or in MARK, you will find this fairly easy to do, since the model files are constructed in the same way. Basically, the model file is a design matrix that constrains the elements in triangular matrices of the emigration (immigration) and recapture parameters in the way you want. The model file matrix contains 0’s and 1’s; the dimension of this matrix depends on the number of capture occasions and on the model you want to fit.

We assume that you are an experienced SURGE or MARK user (in fact you have to be, otherwise you would not know which models give reliable estimates of stopover duration), and that you know what the triangular matrices (named ‘parameter matrix’ in SURGE, and ‘PIM’ in MARK) of the estimated parameters mean. Now, imagine that you have a data set containing 4 capture occasions. This will produce two triangular matrices, one for the emigration (survival) or immigration (recruitment), respectively, and one for the recapture probabilities:

Emigration:

E11 E12 E13
  E22 E23
    E33

Recapture:

P11 P12 P13
  P22 P23
    P33

You may notice that each element of the matrices has its own unique subscript; thus all of them can be estimated if you do not constrain them. To do this, use an identity matrix as the model file, that is, a matrix containing 1’s on the diagonal and 0’s everywhere else. In this particular case this would be a 12x12 matrix. Most likely, however, this is not the model you want to run; instead you might want to run, for example, a time-dependent model {f (t), p(t)}. The triangular matrices should therefore look as follows:

Emigration:

e1 e2 e3
  e2 e3
    e3

Recapture:

p1 p2 p3
  p2 p3
    p3

Therefore you have to produce a matrix that constrains the original parameters Eij and Pij to become the parameters ej and pj. For example, the elements E12 and E22 should be constrained to be equal. The constraint matrix (the model file) has the elements of the original triangular matrices as row labels and the new parameters as column headings. This matrix is then filled with 0’s and 1’s in such a way that they constrain the elements as desired. Here is the case of the time-dependent model:

          Parameter numbers
Elements  e1 e2 e3 p1 p2 p3
E11        1  0  0  0  0  0
E12        0  1  0  0  0  0
E13        0  0  1  0  0  0
E22        0  1  0  0  0  0
E23        0  0  1  0  0  0
E33        0  0  1  0  0  0
P11        0  0  0  1  0  0
P12        0  0  0  0  1  0
P13        0  0  0  0  0  1
P22        0  0  0  0  1  0
P23        0  0  0  0  0  1
P33        0  0  0  0  0  1

In this way, each row can be read as a simple equation: it specifies how the element Eij (or Pij) is calculated from the ej and the pj. For example, the element E12 is calculated as:

E12 = 0*e1 + 1*e2 + 0*e3 + 0*p1 + 0*p2 + 0*p3 = e2

Element E22 is calculated as:

E22 = 0*e1 + 1*e2 + 0*e3 + 0*p1 + 0*p2 + 0*p3 = e2

And thus E22 is equal to E12, just what you wanted.
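Reading every row as an equation amounts to multiplying the constraint matrix by the vector of new parameters. The following Python sketch shows this for the time-dependent model; the parameter values are arbitrary numbers chosen for illustration and are not estimates from any data set.

```python
import numpy as np

# Constraint matrix of the time-dependent model (rows E11 ... P33).
design = np.array([
    [1, 0, 0, 0, 0, 0],  # E11
    [0, 1, 0, 0, 0, 0],  # E12
    [0, 0, 1, 0, 0, 0],  # E13
    [0, 1, 0, 0, 0, 0],  # E22
    [0, 0, 1, 0, 0, 0],  # E23
    [0, 0, 1, 0, 0, 0],  # E33
    [0, 0, 0, 1, 0, 0],  # P11
    [0, 0, 0, 0, 1, 0],  # P12
    [0, 0, 0, 0, 0, 1],  # P13
    [0, 0, 0, 0, 1, 0],  # P22
    [0, 0, 0, 0, 0, 1],  # P23
    [0, 0, 0, 0, 0, 1],  # P33
])
labels = ["E11", "E12", "E13", "E22", "E23", "E33",
          "P11", "P12", "P13", "P22", "P23", "P33"]

# Arbitrary illustrative values for (e1, e2, e3, p1, p2, p3).
params = np.array([0.9, 0.8, 0.7, 0.3, 0.4, 0.5])

# One matrix product evaluates all twelve row equations at once.
elements = dict(zip(labels, design @ params))
```

With these values, both `elements["E12"]` and `elements["E22"]` equal e2, which is the constraint the matrix was built to express.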

The model file for the model {f (t), p(t)} that the program needs does not contain any row or column headings. It looks like this:

1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1
0 0 0 0 1 0
0 0 0 0 0 1
0 0 0 0 0 1

The columns have to be separated by at least one blank, and the file has to be saved in ASCII-format.
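For more than a few capture occasions, typing this matrix by hand is error-prone. The sketch below generates the time-dependent model file for any number of occasions n, assuming the row order used above (E11, E12, ..., then P11, P12, ...). The function names are hypothetical and not part of SODA.

```python
def time_dependent_block(n, offset, width):
    """Rows for one triangular matrix whose element (i, j) maps to parameter j."""
    rows = []
    for i in range(n - 1):          # row of the triangular matrix
        for j in range(i, n - 1):   # column, from the diagonal rightwards
            row = [0] * width
            row[offset + j] = 1
            rows.append(row)
    return rows


def time_dependent_model(n):
    """Constraint matrix for the model {f (t), p(t)} with n capture occasions."""
    m = n - 1
    width = 2 * m  # m emigration (or immigration) and m recapture parameters
    return time_dependent_block(n, 0, width) + time_dependent_block(n, m, width)


def to_text(rows):
    """Blank-separated ASCII text, one matrix row per line."""
    return "\n".join(" ".join(str(v) for v in row) for row in rows)


model_text = to_text(time_dependent_model(4))
print(model_text)
```

For n = 4 this prints the 12 rows listed above. The resulting text can be saved as a plain ASCII file.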

Now, let us consider another example: you might intend to fit the model {f , p(t)}. Again you have 4 capture occasions and we write the triangular matrices side by side:

Emigration:

Elements       Constraint
E11 E12 E13    e1 e1 e1
    E22 E23       e1 e1
        E33          e1

Recapture:

Elements       Constraint
P11 P12 P13    p1 p2 p3
    P22 P23       p2 p3
        P33          p3

The constraint matrix is:

          Parameter numbers
Elements  e1 p1 p2 p3
E11        1  0  0  0
E12        1  0  0  0
E13        1  0  0  0
E22        1  0  0  0
E23        1  0  0  0
E33        1  0  0  0
P11        0  1  0  0
P12        0  0  1  0
P13        0  0  0  1
P22        0  0  1  0
P23        0  0  0  1
P33        0  0  0  1

If there are n capture occasions, each triangular matrix has (n-1)*n/2 elements. The models for the emigration (immigration) process and for the recapture process must be written one below the other, so the whole model file has (n-1)*n rows: (n-1)*n/2 rows for the f (or g ) model and (n-1)*n/2 rows for the p model.
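The row count gives a simple consistency check for a hand-written model file. The following Python sketch verifies that a file has (n-1)*n rows of equal length containing only 0 and 1; it is an illustration under the format described here, with a hypothetical function name, and does not check whether the model itself is sensible.

```python
def check_model_file(text, n):
    """Return (rows, columns) if the model file text is well formed for n occasions."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    expected = (n - 1) * n  # (n-1)*n/2 rows for f (or g) plus (n-1)*n/2 for p
    if len(rows) != expected:
        raise ValueError(f"expected {expected} rows, found {len(rows)}")
    width = len(rows[0])
    for row in rows:
        if len(row) != width:
            raise ValueError("rows differ in length")
        if any(v not in ("0", "1") for v in row):
            raise ValueError("entries must be 0 or 1")
    return len(rows), width


# Model file for {f , p(t)} with 4 capture occasions, as given above.
constant_f_model = """1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
0 0 1 0
0 0 0 1
0 0 0 1
"""
shape = check_model_file(constant_f_model, n=4)
```

Here `shape` is (12, 4): 12 rows for 4 capture occasions and 4 columns for the parameters e1, p1, p2 and p3.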

Running the program
Once you have copied the program file, libraries, data files and model files into the desired directory, you are ready to start. First, select soda.exe in your file manager and start it. A window appears that asks for a biomeco capture-recapture file (click on the ‘Select the C-R-file’ button). The program exits if no file is selected. You then have to enter the number of repetitions for the bootstrap and select the two models for the estimation of survival and recruitment.

Once you have selected the desired models and output (see below), click on the button ‘Done’. If constraint files (model files) are needed, the program asks you for the names of these files. The program then starts the computation and a waitbar appears. At the end, you can save your results in a file.

Output files
The output file is a text file that can easily be opened in a spreadsheet program such as Excel. To understand how the output is organized, imagine that you had n capture occasions and had chosen k resamplings. Further, let us suppose that you have chosen the output file option details.

The output file consists of k rows, one per resampling, and 8n-5 columns. The columns can be split into 8 classes, which contain the following information:

If you choose the output option SOD only, you will get only the first 3 column classes. The output option PGR only provides the last 5 column classes.

Presentation of the results
After saving the results, a new window appears which provides graphical and numerical results. First, select the variable you want to study. You can choose between (a) the numerical values (mean, standard error, minimum, maximum) of the selected variable for each occasion, or (b) a histogram for a selected occasion showing the number of bootstrap results against the value of the selected variable.

Interpretation and significance of the results
From the saved output file you can easily calculate the mean of the total stopover duration and investigate how precisely the mean is estimated. To do this, calculate the mean of the column of interest. To get a confidence interval for the estimate of the mean, sort the values in the column in ascending order. Given that you have run k resamplings and want to calculate the t-% confidence interval, the value numbers

are the lower, and the upper boundaries, respectively, of the confidence interval.
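The calculation from one output column can be sketched in Python as follows. The exact rank formula used by the authors is not reproduced in this text, so the sketch assumes the standard percentile bootstrap interval and uses NumPy's default (interpolating) percentile; the function name and the input values are hypothetical.

```python
import numpy as np


def bootstrap_summary(values, level=95.0):
    """Mean and percentile confidence interval of one column of bootstrap results.

    Assumption: standard percentile bootstrap interval, which may differ in
    detail from the rank formula intended in the original instructions.
    """
    values = np.sort(np.asarray(values, dtype=float))  # ascending order
    tail = (100.0 - level) / 2.0                        # percent cut off per tail
    lower, upper = np.percentile(values, [tail, 100.0 - tail])
    return values.mean(), lower, upper


# Made-up column of k = 100 bootstrap results, for illustration only.
column = np.arange(1, 101)
mean, lower, upper = bootstrap_summary(column, level=95.0)
```

For a 95% interval, 2.5% of the sorted bootstrap values lie below the lower boundary and 2.5% above the upper boundary.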

You might wonder why n stopover durations are calculated although only n-1 immigration or emigration probabilities can be estimated. This is possible because we also defined the immigration and emigration probabilities that lie beyond the study period. For these probabilities we opted in all cases for a weighted moving (‘gliding’) average of the last three estimable probabilities. For the emigration probabilities, for instance, we calculated the f 's beyond the study period as (for n capture occasions):

The immigration probabilities beyond the study period are calculated in the same way. If you use time-dependent models for estimating stopover duration, the estimates at the beginning and the end of the study period are probably less reliable, because they are most influenced by these gliding averages.

In order to clarify which values are provided by the output, look at the next figure. In this example a datafile containing n capture occasions (t1 ... tn) was analyzed. This figure also shows to which point of time the estimates refer. For example, the recapture probabilities and the stopover durations refer to the time of capture, whereas the emigration (immigration) probabilities refer to the time between two capture occasions.

Citation
If you use the program SODA and wish to cite it, please refer to Schaub et al. (2001).

References
Efron, B., and R. J. Tibshirani. 1993. An introduction to the bootstrap. Monographs on Statistics and Applied Probability, No. 57. Chapman and Hall, London, 436 p.
Pradel, R. 1996. Utilization of capture-mark-recapture for the study of recruitment and population growth rate. Biometrics 52: 703-709.
Schaub, M., R. Pradel, L. Jenni, and J.-D. Lebreton. 2001. Migrating birds stop over longer than usually thought: an improved capture-recapture analysis. Ecology 82: 852-859.

