Michael Schaub, Roger Pradel, Lukas Jenni, and Jean-Dominique Lebreton. 2001. Migrating birds stop over longer than usually thought: an improved capture-recapture analysis. Ecology 82: 852-859.
Software SODA for computation of parameter estimates and confidence intervals, user instructions and two example data files.
Editor's note: The authors have not provided source code for this software because of CNRS policies. All questions about this software must be directed to the authors.
Author
Michael Schaub
Swiss Ornithological Institute
CH-6204 Sempach
Switzerland
+41 414 629700 (ph.)
+41 414 629710 (fax)
michael.schaub@vogelwarte.ch
schaubm@orninst.ch
soda.zip -- zipped files of the software SODA
example.zip -- the two example data files used in the paper
User instructions for the program SODA: Estimation of stopover duration by bootstrapping.
Introduction
If the probabilities of arrival (immigration) of animals at the stopover
site and their departure (emigration) from it are known, the mean stopover duration
can be estimated with the formula given in the Appendix.
However, because the covariance between the immigration and the emigration probabilities
is neither zero nor known, it is not possible to calculate the precision of
this estimate with an analytical formula. An alternative method to obtain this
information is the bootstrap (Efron and Tibshirani 1993). It is a technique
that is commonly used for the estimation of precision, and that is what this
program does: it calculates mean stopover duration and its precision under given
models for the immigration and the emigration processes.
To put it in simple words, the bootstrap works in the following way: you have a data set containing y individual capture histories and from these, a new data set of y capture histories is produced by selecting individual capture histories from the original data set at random. There are no restrictions on how often a particular capture history can be selected, so that some of them can be chosen several times. From this new data set, immigration, emigration and recapture probabilities are estimated under the specified model, and from these probabilities stopover duration is calculated. The result is called one resampling and gives one line in the output file (see below). This procedure is repeated as many times as one likes. From the resamplings the mean and its precision can be estimated. Obviously, as the number of resamplings increases, the precision of the mean increases as well, but will stabilize after a certain number of resamplings.
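The resampling loop described above can be sketched in a few lines of Python. This is only an illustration of the bootstrap principle, not SODA's code: the function names `bootstrap` and `mean_captures` are hypothetical, and the placeholder statistic (mean number of captures per bird) stands in for the model fitting and stopover computation that SODA performs on each resampled data set.

```python
import random
import statistics


def bootstrap(histories, estimator, n_resamplings, seed=None):
    """Return one estimate per resampling (nonparametric bootstrap)."""
    rng = random.Random(seed)
    y = len(histories)
    results = []
    for _ in range(n_resamplings):
        # Draw y capture histories at random, with replacement, so a
        # history may be chosen several times or not at all.
        resample = [histories[rng.randrange(y)] for _ in range(y)]
        results.append(estimator(resample))
    return results


def mean_captures(histories):
    # Placeholder statistic (NOT stopover duration): mean number of
    # captures per bird. SODA fits the chosen models at this step.
    return sum(sum(h) for h in histories) / len(histories)


# Made-up capture histories, for illustration only.
histories = [(1, 0, 0), (1, 1, 0), (0, 1, 1), (0, 0, 1)]
estimates = bootstrap(histories, mean_captures, n_resamplings=200, seed=1)

boot_mean = statistics.mean(estimates)
boot_se = statistics.stdev(estimates)  # spread of the resampled estimates
```

Each entry of `estimates` corresponds to one resampling, that is, to one line of the output file.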
This instruction provides information on how the program is installed and used.
Finding appropriate models for the estimation of stopover duration
SODA does not allow model selection. This has to be done by programs that
analyse capture-recapture data such as SURGE
or MARK (see
http://canuck.dnr.cornell.edu/misc/cmr/).
Thus, before you can start to estimate stopover duration, appropriate models
for the immigration and the emigration processes need to be found.
Installation
The program was originally written in MATLAB but is now compiled. CNRS policies
prohibit the distribution of the source code, so only the executable is provided.
To install the software, download the WinZip archive file SODA.zip, unpack it,
and copy the 10 *.dll files and the executable file soda.exe into a directory
of your choice. The files included in this archive that are required for running
SODA are:
soda.exe
Gui_sgl.dll
Hardcopy_sgl.dll
Hg_sgl.dll
Libmat.dll
Libmatlb.dll
Libmmfile.dll
Libmx.dll
Libut.dll
Sgl.dll
Uiw_sgl.dll
The data file
The data file contains the capture histories of all birds considered and is in
the biomeco format. The first line of this file contains two numbers separated
by at least one blank. These numbers are the dimensions of the capture history
matrix: the first is the number of birds caught, the second the number of
capture occasions. The next two lines each contain a $. The fourth line contains
the capture history of the first bird, and each following line contains the
history of one further bird. The capture occasions must be separated by at
least one blank. Below is an example of a very short data file:
12 5
$
$
1 0 0 0 0
1 0 0 0 0
1 1 0 0 0
1 0 1 0 0
0 1 0 0 0
0 1 0 0 0
0 0 1 0 0
0 1 0 1 0
0 0 1 0 0
0 0 0 1 0
0 0 1 1 0
0 0 0 1 1
Such files can be created using any text-processing or spreadsheet program. Data files must be saved in ASCII (text) format (the file extension does not matter).
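Before running SODA it can be useful to check that a data file is consistent with its header. The following Python sketch reads the layout described above; the function name `parse_biomeco` is hypothetical, and ignoring empty lines is an assumption made here, not documented behaviour of SODA.

```python
def parse_biomeco(text):
    """Parse the data-file layout described above into a list of histories."""
    # Assumption: empty lines are ignored.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # Line 1: number of birds and number of capture occasions.
    n_birds, n_occasions = (int(tok) for tok in lines[0].split()[:2])

    # Lines 2 and 3 each hold a single "$".
    if lines[1] != "$" or lines[2] != "$":
        raise ValueError("expected '$' on lines 2 and 3")

    # From line 4 on: one capture history per line, blank-separated.
    histories = [[int(tok) for tok in ln.split()] for ln in lines[3:]]

    if len(histories) != n_birds:
        raise ValueError(f"header says {n_birds} birds, found {len(histories)}")
    for number, history in enumerate(histories, start=1):
        if len(history) != n_occasions:
            raise ValueError(f"bird {number}: expected {n_occasions} occasions")
        if any(value not in (0, 1) for value in history):
            raise ValueError(f"bird {number}: entries must be 0 or 1")
    return histories


# A made-up file with 3 birds and 4 capture occasions.
sample = "3 4\n$\n$\n1 0 0 0\n0 1 1 0\n0 0 0 1\n"
histories = parse_biomeco(sample)
```

A mismatch between the header and the number or length of the histories raises a `ValueError` instead of passing silently.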
The model file
We implemented two predefined models for each process in the program:
{f (t), p(t)} (the Cormack model) and {f (t), p} for emigration, or
{g (t), p(t)} and {g (t), p}, respectively, for immigration.
If you want to run other models, you have to create model files first.
If you know how to build constraining files in SURGE (var-files) or in MARK, you will find this fairly easy to do, since the model files are constructed in the same way. Basically, the model file is a design matrix that constrains the elements in triangular matrices of the emigration (immigration) and recapture parameters in the way you want. The model file matrix contains 0s and 1s; the dimension of this matrix depends on the number of capture occasions and on the model you want to fit.
We assume that you are an experienced SURGE or MARK user (in fact you have to be, otherwise you would not know which models give reliable estimates of stopover duration), and that you know what the triangular matrices (named parameter matrix in SURGE, and PIM in MARK) of the estimated parameters mean. Now, imagine that you have a data set containing 4 capture occasions. This will produce two triangular matrices, one for the emigration (survival) or immigration (recruitment), respectively, and one for the recapture probabilities:
Emigration:
| E11 | E12 | E13 |
| | E22 | E23 |
| | | E33 |
Recapture:
| P11 | P12 | P13 |
| | P22 | P23 |
| | | P33 |
You may notice that each element of the matrices has its own unique subscript, so all of them can be estimated if you do not constrain them. To do this you have to use an identity matrix as model file, that is, a matrix containing 1s on the diagonal and 0s everywhere else. In this particular case this would be a 12x12 matrix. Most likely, however, this is not the model you want to run; instead you might want to run, for example, a time-dependent model {f (t), p(t)}. The triangular matrices should therefore look as follows:
Emigration:
| e1 | e2 | e3 |
| | e2 | e3 |
| | | e3 |
Recapture:
| p1 | p2 | p3 |
| | p2 | p3 |
| | | p3 |
Therefore you have to produce a matrix that constrains the original parameters Eij and Pij to become the parameters ej and pj. For example, the elements E12 and E22 should be constrained to be equal. The constraint matrix (the model file) has the elements of the original triangular matrices as row labels and the new parameters as column headings. This matrix then has to be filled with 0s and 1s in such a way that they constrain the elements in the way you want. Here is the case of the time-dependent model:
| Elements (rows) / Parameter numbers (columns) | e1 | e2 | e3 | p1 | p2 | p3 |
| E11 | 1 | 0 | 0 | 0 | 0 | 0 |
| E12 | 0 | 1 | 0 | 0 | 0 | 0 |
| E13 | 0 | 0 | 1 | 0 | 0 | 0 |
| E22 | 0 | 1 | 0 | 0 | 0 | 0 |
| E23 | 0 | 0 | 1 | 0 | 0 | 0 |
| E33 | 0 | 0 | 1 | 0 | 0 | 0 |
| P11 | 0 | 0 | 0 | 1 | 0 | 0 |
| P12 | 0 | 0 | 0 | 0 | 1 | 0 |
| P13 | 0 | 0 | 0 | 0 | 0 | 1 |
| P22 | 0 | 0 | 0 | 0 | 1 | 0 |
| P23 | 0 | 0 | 0 | 0 | 0 | 1 |
| P33 | 0 | 0 | 0 | 0 | 0 | 1 |
In this way, each row can be read as a simple equation; it specifies how the element Eij (or Pij) is calculated from the ej and the pj. For example, the element E12 is calculated as:
E12 = 0*e1 + 1*e2 + 0*e3 + 0*p1 + 0*p2 + 0*p3 = e2
Element E22 is calculated as:
E22 = 0*e1 + 1*e2 + 0*e3 + 0*p1 + 0*p2 + 0*p3 = e2
And thus E22 is equal to E12, just what you wanted.
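The row-by-row reading of the constraint matrix amounts to a matrix-vector product. The following Python sketch applies the 12x6 matrix of the time-dependent model to a parameter vector; the numerical parameter values are made up purely for illustration and are not estimates from any data set.

```python
# Rows in the order E11, E12, E13, E22, E23, E33, P11, P12, P13, P22, P23, P33;
# columns in the order e1, e2, e3, p1, p2, p3.
design = [
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1],
]
labels = ["E11", "E12", "E13", "E22", "E23", "E33",
          "P11", "P12", "P13", "P22", "P23", "P33"]

# Hypothetical values for e1, e2, e3, p1, p2, p3 (illustration only).
params = [0.9, 0.8, 0.7, 0.3, 0.4, 0.5]

# Each element is the sum over columns of (matrix entry * parameter value).
elements = {
    label: sum(x * b for x, b in zip(row, params))
    for label, row in zip(labels, design)
}
```

With these values `elements["E12"]` and `elements["E22"]` both equal the value chosen for e2, which is exactly the equality the constraint is meant to enforce.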
The model file for the model {f (t), p(t)} that the program needs does not contain any row or column headings; it looks like this:
1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1
0 0 0 0 1 0
0 0 0 0 0 1
0 0 0 0 0 1
The columns have to be separated by at least one blank, and the file has to be saved in ASCII-format.
Now, let us consider another example: you might intend to fit the model {f , p(t)}. Again you have 4 capture occasions and we write the triangular matrices side by side:
Emigration:
| Elements | | | Constraint | | |
| E11 | E12 | E13 | e1 | e1 | e1 |
| | E22 | E23 | | e1 | e1 |
| | | E33 | | | e1 |
Recapture:
| Elements | | | Constraint | | |
| P11 | P12 | P13 | p1 | p2 | p3 |
| | P22 | P23 | | p2 | p3 |
| | | P33 | | | p3 |
The constraint matrix is:
| Elements (rows) / Parameter numbers (columns) | e1 | p1 | p2 | p3 |
| E11 | 1 | 0 | 0 | 0 |
| E12 | 1 | 0 | 0 | 0 |
| E13 | 1 | 0 | 0 | 0 |
| E22 | 1 | 0 | 0 | 0 |
| E23 | 1 | 0 | 0 | 0 |
| E33 | 1 | 0 | 0 | 0 |
| P11 | 0 | 1 | 0 | 0 |
| P12 | 0 | 0 | 1 | 0 |
| P13 | 0 | 0 | 0 | 1 |
| P22 | 0 | 0 | 1 | 0 |
| P23 | 0 | 0 | 0 | 1 |
| P33 | 0 | 0 | 0 | 1 |
It is obvious that if there are n capture occasions, each triangular matrix model has (n-1)*n/2 elements. The models for the emigration (immigration) process and for the recapture process must be written below each other, thus the whole model file has then (n-1)*n rows; (n-1)*n/2 rows for the f (or g ) model and (n-1)*n/2 rows for the p model.
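For larger numbers of capture occasions, writing the model file by hand becomes tedious. The Python sketch below generates the two model files discussed above for any n; all function names are hypothetical, and only the time-dependent and constant structures are covered.

```python
def triangular_labels(n):
    """(i, j) index pairs of the triangular matrix, row by row."""
    return [(i, j) for i in range(1, n) for j in range(i, n)]


def time_dependent_block(n):
    """One row per element; element (i, j) is set equal to parameter j."""
    return [[1 if col == j else 0 for col in range(1, n)]
            for (_, j) in triangular_labels(n)]


def constant_block(n):
    """One row per element; every element is set equal to one parameter."""
    return [[1] for _ in triangular_labels(n)]


def model_file(first_block, second_block):
    """Stack the f (or g) block above the p block and return the file text."""
    width_first = len(first_block[0])
    width_second = len(second_block[0])
    rows = [row + [0] * width_second for row in first_block]
    rows += [[0] * width_first + row for row in second_block]
    # Columns are separated by one blank, without row or column headings.
    return "\n".join(" ".join(str(v) for v in row) for row in rows)


n = 4
time_dependent_model = model_file(time_dependent_block(n), time_dependent_block(n))
constant_f_model = model_file(constant_block(n), time_dependent_block(n))
```

For n = 4, `time_dependent_model` has (n-1)*n = 12 rows and 6 columns, and `constant_f_model` has 12 rows and 4 columns. The text can then be written to an ASCII file.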
Running the program
Once you have copied the program file, libraries, data files and model files
into a directory of your choice, you are ready to start. First start soda.exe
(for example from the file manager). A window appears which asks for a biomeco
capture-recapture file (click on the Select the C-R-file button). The program
exits if no file is selected. Then enter the number of repetitions for the
bootstrap and select the two models for the estimation of survival and
recruitment (that is, emigration and immigration).
Once you have selected the desired models and output (see below), click on the button Done. If constraint files (model files) are needed, the program asks you for the names of these files. The program then starts the computation and a wait bar appears. At the end, you can save your results in a file.
Output files
The output file is a text file which can easily be opened in a spreadsheet
program such as Excel. In order to understand how the output is organized,
imagine that you had n capture occasions and that you had chosen k resamplings.
Further, let us suppose that you have chosen the output file option details.
The output file consists of k rows, each being the result of one resampling, and of 8n-5 columns. The columns can be split into 8 classes which contain the following information:
If you choose the output option SOD only, you will get the first 3 column classes only. The output option PGR only provides the last 5 column classes.
Presentation of the results
After saving the results, a new window appears which provides graphical and
numerical results. First, select the variable you want to study. You can choose
between (a) the numerical values (mean, standard error, minimum, maximum) of
the selected variable for each occasion, or (b) a histogram for a selected occasion
with the number of bootstrap results against the value of the selected variable.
Interpretation and significance of the results
From the saved output file you can easily calculate the mean of the total stopover
duration and investigate how precisely the mean is estimated. To do this,
calculate the mean of the column of interest. To get a confidence interval
for the estimate of the mean, sort the values in the column in ascending order.
Given that you have run k resamplings and you want to calculate the t-% confidence
interval, then the value numbers


are the lower and the upper boundaries, respectively, of the confidence interval.
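A minimal Python sketch of this procedure follows. It assumes the standard percentile method (Efron and Tibshirani 1993), in which the boundaries are the sorted values at ranks k(100 - t)/200 and k(100 + t)/200; the function name `percentile_interval` is hypothetical, and the rounding of the ranks is an assumption that may differ in detail from the formula intended here.

```python
def percentile_interval(values, t):
    """Return (lower, upper) of the t-% percentile interval; t is in percent."""
    ordered = sorted(values)  # ascending order
    k = len(ordered)
    # Assumed ranks (1-based): k(100 - t)/200 and k(100 + t)/200,
    # rounded to the nearest integer and kept within 1..k.
    lower_rank = max(1, round(k * (100 - t) / 200))
    upper_rank = min(k, round(k * (100 + t) / 200))
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


# Illustration with k = 1000 made-up values 1, 2, ..., 1000 and t = 95.
lower, upper = percentile_interval(range(1, 1001), 95)
```

With k = 1000 and t = 95 the boundaries are the 25th and the 975th of the sorted values.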
You might wonder why n stopover durations are calculated although only n-1 immigration or emigration probabilities can be estimated. This is possible because we also defined the immigration and emigration probabilities that lie beyond the study period. For these probabilities we opted in all cases for a weighted gliding (moving) average involving the last three estimable probabilities. For the emigration probabilities, for instance, we calculated the f 's beyond the study period as (n capture occasions):

The immigration probabilities beyond the study period are calculated in the same way. If you use time-dependent models for estimating stopover duration, the estimates at the beginning and the end of the study period are probably less reliable, because they are most influenced by these gliding averages.
In order to clarify which values are provided by the output, look at the next figure. In this example a datafile containing n capture occasions (t1 ... tn) was analyzed. This figure also shows to which point of time the estimates refer. For example, the recapture probabilities and the stopover durations refer to the time of capture, whereas the emigration (immigration) probabilities refer to the time between two capture occasions.

Citation
If you use the program SODA and wish to cite it, please refer to Schaub
et al. (2001).
References
Efron, B., and R. J. Tibshirani. 1993. An introduction to the bootstrap.
Monographs on Statistics and Applied Probability, No. 57. Chapman and Hall,
London, 436 p.
Pradel, R. 1996. Utilization of capture-mark-recapture for the study of recruitment
and population growth rate. Biometrics 52: 703-709.
Schaub, M., R. Pradel, L. Jenni, and J.-D. Lebreton. 2001. Migrating birds
stop over longer than usually thought: an improved capture-recapture analysis.
Ecology 82: 852-859.