Valério D. Pillar. 1999. How sharp are classifications? Ecology 80:2508-2516.


Supplement

Supplement 1: Software for testing classification sharpness combined with sampling sufficiency evaluation.
Ecological Archives E080-014-S1.

Erratum (21 September 2000)


Author

Valério De Patta Pillar
Departamento de Ecologia
Universidade Federal do Rio Grande do Sul
Porto Alegre, RS, 91540-000, Brazil
E-mail: vpillar@ecologia.ufrgs.br
Fax: +55 51 3191568


File List

SamplerE -- compiled code for Macintosh computers
SamplerE.exe -- compiled code for PCs
pnatsc.txt -- example dataset (ASCII text)
samplerE.cp -- source code (ASCII text)
samplerE.h -- source code (ASCII text)
vplibE.cp -- source code (ASCII text)
vplibE.h -- source code (ASCII text)


Description

The files contain executable code for Macintosh (SamplerE) and for Windows (SamplerE.exe), source code in C++ (samplerE.cp, samplerE.h, vplibE.cp, vplibE.h) and data (pnatsc.txt).

The program SamplerE implements the method for testing classification sharpness combined with sampling sufficiency evaluation. It is a simplified version of a larger application (Sampler) that offers a wider range of options for sampling sufficiency evaluation (see Orlóci and Pillar 1989, Pillar 1998, 1999).

SamplerE runs on any PowerPC Macintosh and on older Macintosh computers with a 68040, 68030 or 68020 CPU and a floating-point unit. SamplerE.exe runs on Windows 95 and newer systems.

The data file pnatsc.txt (Tcacenco and Pillar 1996) contains the percentage cover of 12 species (or species groups) in 37 sampling units (quadrats). The quadrats are 30 x 90 m in size and were laid out on anthropogenic grasslands in Santa Catarina state, Brazil. The data are arranged in a matrix with 37 rows (sampling units) and 12 columns (species). The species list is included at the end of the file but is not used by the program.

The C++ source files are not needed if you are using the executables. The core of the program is in samplerE.cp and samplerE.h, which are not platform specific. All instructions that may be platform specific are in vplibE.cp and vplibE.h.

The user interface is menu driven on a text screen, as in the example session below, which uses the data file pnatsc.txt. Keyboard input, set in italic in the original page, is the text that follows each prompt:

SAMPLER
For bootstrap resampling and sampling sufficiency evaluation, version 22/Sep/99
 
By V.Pillar
Departamento de Ecologia, Universidade Federal do Rio Grande do Sul
Porto Alegre, RS, Brazil 91540-000
e-mail: vpillar@ecologia.ufrgs.br
-----------------------------------------------------------------
Shorter version, only for evaluation of group partition sharpness.
Method described in Pillar, V.D. How sharp are classifications?
Ecology 80(8), 1999.
 
SPECIFYING DATA:
-----------------------------------------------------------------
Data file name: pnatsc.txt
Number of sampling units: 37
Number of variables: 12
Each row in the data matrix is a sampling unit(N) or is a variable(T)? n
 
-----------------------------------------------------------------
MAIN MENU
Data file: pnatsc.txt
S bootstrap resampling
N specify another data file
-----------------------------------------------------------------
Enter option: s
 
Will evaluate sharpness of group structure.
Clustering method:
(1)simple linkage
(2)complete linkage
(3)minimum variance
Enter option: 3
Enter the number of groups to be monitored: 4
Initial sample size: 37
 
Enter number of iterations in resampling: 1000
Initialization of random number generation:
(1) automatic
(2) specify seed
Enter option: 1
Save intermediate results? y/n n
Results saved on file Prinda.txt

Input data must be in a text file, arranged as a matrix with sampling units in rows or in columns. Values must be separated by at least one space or tab. Place the data file in the same folder as the program. This version of the program assumes the variables are quantitative or binary (though the method is also applicable to other data types).
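
The input layout described above can be illustrated with a minimal sketch. It is written in Python rather than the program's C++, and it is not taken from samplerE.cp: the function name read_matrix and its parameters are hypothetical. It assumes the numeric matrix comes first in the file and ignores whatever follows the expected number of values, such as the species list at the end of pnatsc.txt.

```python
def read_matrix(text, n_units, n_vars, units_in_rows=True):
    """Parse the first n_units * n_vars whitespace-separated numbers in text.

    Returns a list with one row per sampling unit, each holding n_vars values.
    Tokens after the matrix (e.g. a trailing species list) are ignored.
    """
    tokens = text.split()  # splits on any run of spaces, tabs or newlines
    needed = n_units * n_vars
    if len(tokens) < needed:
        raise ValueError(f"expected {needed} values, found {len(tokens)} tokens")
    # float() raises ValueError if a token inside the matrix is not numeric.
    values = [float(tok) for tok in tokens[:needed]]
    if units_in_rows:
        n_rows, n_cols = n_units, n_vars
    else:
        n_rows, n_cols = n_vars, n_units
    rows = [values[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
    if not units_in_rows:
        # Transpose so the result always has sampling units as rows.
        rows = [list(col) for col in zip(*rows)]
    return rows


# Made-up example: 3 sampling units, 2 variables, then trailing text.
sample = "10 0\n5.5\t20\n0 30\nSpecies list: sp1 sp2\n"
matrix = read_matrix(sample, n_units=3, n_vars=2)
```

For the example data set, the equivalent call would pass the contents of pnatsc.txt with n_units=37 and n_vars=12.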

If the specified data file is read successfully, the option for bootstrap resampling becomes available; a new data file may also be specified. After selecting option 'S', choose the clustering method. The methods available in this version are single linkage (listed in the menu as "simple linkage"), complete linkage and minimum variance, but other hierarchical or non-hierarchical methods could be implemented in the program. Then specify the number of groups (the partition level).
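
For readers unfamiliar with the three clustering options, the following Python sketch shows one common way to obtain a partition with a given number of groups by agglomerative clustering. It is an illustration under stated assumptions, not the code in samplerE.cp: it assumes squared Euclidean distances between sampling units (the dissimilarity measure used by SamplerE is not stated here) and uses the standard Lance-Williams update formulas. The function name cluster and the toy data are hypothetical.

```python
def cluster(data, n_groups, method="ward"):
    """Agglomerative clustering stopped at n_groups.

    data: list of rows (one per sampling unit). Returns 1-based group labels.
    method: "single", "complete" or "ward" (minimum variance).
    """
    n = len(data)
    # Squared Euclidean distance for every pair (i, j) with i < j (an assumption).
    d = {}
    for i in range(n):
        for j in range(i + 1, n):
            d[(i, j)] = sum((a - b) ** 2 for a, b in zip(data[i], data[j]))
    members = {i: [i] for i in range(n)}  # cluster id -> units it contains

    def dist(a, b):
        return d[(a, b)] if a < b else d[(b, a)]

    while len(members) > n_groups:
        ids = sorted(members)
        # Closest pair of current clusters (first pair wins on ties).
        i, j = min(((a, b) for x, a in enumerate(ids) for b in ids[x + 1:]),
                   key=lambda pair: d[pair])
        ni, nj, dij = len(members[i]), len(members[j]), d[(i, j)]
        # Lance-Williams update: distance from each other cluster k to (i + j).
        for k in ids:
            if k in (i, j):
                continue
            dik, djk, nk = dist(i, k), dist(j, k), len(members[k])
            if method == "single":
                new = min(dik, djk)
            elif method == "complete":
                new = max(dik, djk)
            elif method == "ward":
                new = ((ni + nk) * dik + (nj + nk) * djk - nk * dij) / (ni + nj + nk)
            else:
                raise ValueError("unknown method: " + method)
            d[(min(i, k), max(i, k))] = new
        # The merged cluster keeps the smaller id i; id j disappears.
        members[i].extend(members.pop(j))

    labels = [0] * n
    for label, cid in enumerate(sorted(members), start=1):
        for unit in members[cid]:
            labels[unit] = label
    return labels


# Made-up data: two tight pairs and one isolated unit.
toy = [[0, 0], [0, 1], [10, 10], [10, 11], [50, 50]]
groups = cluster(toy, n_groups=3, method="ward")
```

With data this clearly separated, all three methods give the same partition; they generally differ on less clear-cut data.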

If no evaluation of sample size sufficiency is needed, specify the "Initial sample size" equal to the total number of sampling units (as in the example above).

If an evaluation of sample size sufficiency is desired, bootstrap resampling is performed with increasing sample size. In that case, specify an "Initial sample size" smaller than the total number of sampling units; the "Number of sampling units added at each sampling step" must then also be specified. The initial sample size must be at least the number of groups (the partition level) plus one. At each sampling step, the program takes bootstrap samples enlarged by the chosen number of sampling units. Choose this number so as to produce enough sampling steps for a smooth probability profile. In the example below, with 1 sampling unit added at each step, the process has 33 sampling steps: an initial step with 5 sampling units followed by 32 steps that bring the sample size up to 37:
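
The arithmetic behind the sequence of sample sizes can be checked with a short Python sketch. The function sample_sizes is hypothetical and not part of SamplerE. In particular, how the program behaves when the step does not land exactly on the total number of sampling units is not documented here; this sketch simply stops at the largest size that does not exceed the total.

```python
def sample_sizes(total_units, initial_size, step, n_groups):
    """List the sample size used at each sampling step.

    Enforces the documented constraint that the initial sample size is at
    least the number of groups plus one. Stopping at the largest size not
    exceeding total_units is an assumption of this sketch.
    """
    if initial_size < n_groups + 1:
        raise ValueError("initial sample size must be at least n_groups + 1")
    if initial_size > total_units:
        raise ValueError("initial sample size exceeds the number of sampling units")
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(range(initial_size, total_units + 1, step))


# Settings used with pnatsc.txt: 37 units, 4 groups, start at 5, add 1 per step.
sizes = sample_sizes(total_units=37, initial_size=5, step=1, n_groups=4)
n_steps = len(sizes)  # sizes run 5, 6, ..., 37
```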

...
Initial sample size: 5
Number of sampling units added at each sampling step: 1
...

At least 1000 bootstrap resampling iterations are recommended. The initialization of the random number generator should usually be automatic, in which case repeated runs with the same data will not give identical probabilities. To get identical results with the same data and options, choose "specify seed" and enter the same seed in each run. For learning purposes, intermediate results of bootstrap resampling may be saved in the output file, but use this option only with a small number of iterations.
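
The effect of the seed can be illustrated with a minimal Python sketch of bootstrap resampling (drawing sampling units with replacement). It uses Python's own pseudo-random generator, which is not the generator in SamplerE, so it will not reproduce the program's numbers. The function names and the seed value are hypothetical.

```python
import random


def bootstrap_sample(n_units, sample_size, rng):
    """Draw sample_size unit indices (0-based) with replacement."""
    return [rng.randrange(n_units) for _ in range(sample_size)]


def run(seed, iterations, n_units=37, sample_size=37):
    """Return the bootstrap samples of one run started from the given seed."""
    rng = random.Random(seed)  # a private generator, fully determined by seed
    return [bootstrap_sample(n_units, sample_size, rng) for _ in range(iterations)]


# Two runs with the same seed draw exactly the same samples.
run_a = run(seed=12345, iterations=10)
run_b = run(seed=12345, iterations=10)
identical = run_a == run_b
```

Leaving the seed unspecified, for instance random.Random() with no argument, corresponds to the automatic option: each run then draws different samples.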

Numerical results are not shown on screen but are saved automatically in the file Prinda.txt, which may be opened with any text editor. Results of several runs accumulate in the same Prinda.txt file. The first example described above gave the following results:

SAMPLER
Bootstrap resampling
-----------------------------------------------------------------
 
Data file name: pnatsc.txt
Number of variables: 12
Total number of sampling units: 37
Sample attribute: sharpness of group structure
Cluster analysis method: (3)minimum variance
Considering partitions with 4 groups.
 
Reference partition (to be tested by bootstrap resampling) with 4 groups generated by cluster analysis:
Sampling units: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
Groups: 1 1 2 1 2 2 1 3 4 2 3 4 1 4 3 1 2 3 3 4 3 1 3 3 2 3 3 2 1 1 3 3 1 3 1 1 4
Fri Oct 1 17:39:22 1999
Elapsed time: 13.55 seconds
Initializer of pseudo-random number generator: 3147788347
Sample size at 1 sampling step(s):
37
Analysis considering 4 groups:
Average of sample attribute generated by 1000 random iterations of bootstrap resampling:
0.967777
Probabilities P(GNull<=G) generated in 1000 iterations of bootstrap resampling:
0.11
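
The label P(GNull<=G) in the output above suggests that the reported probability is the proportion of bootstrap iterations in which GNull was less than or equal to G. The Python sketch below shows only that counting step. How G and GNull are computed in each iteration is described in the paper and is not reproduced here; the function name and the four value pairs are invented for illustration and are not results of the program.

```python
def probability_gnull_le_g(pairs):
    """Proportion of iterations with g_null <= g.

    pairs: iterable of (g_null, g) tuples, one per bootstrap iteration.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one iteration is required")
    hits = sum(1 for g_null, g in pairs if g_null <= g)
    return hits / len(pairs)


# Invented values for four iterations; the condition holds in two of them.
example = [(0.95, 0.97), (0.99, 0.96), (0.90, 0.90), (0.98, 0.93)]
p = probability_gnull_le_g(example)
```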


References:

Orlóci, L., and V. D. Pillar. 1989. On sample size optimality in ecosystem survey. Biometrie-Praximetrie 29:173-184.

Pillar, V. D. 1998. Sampling sufficiency in ecological surveys. Abstracta Botanica 22:37-48.

Pillar, V. D. 1999. The bootstrapped ordination reexamined. Journal of Vegetation Science 10 (In press).

Tcacenco, F. A., and V. D. Pillar. 1996. Vegetation-environment relations in anthropogenic grasslands of northeastern Santa Catarina, Brazil. Coenoses 11:103-108.


Erratum (25 August 2000)

An error was detected in the programs originally published in Ecological Archives E080-014. With some data sets, the bug may cause wrong results when the user selects minimum variance as the clustering method. The bug did not affect the results published in the paper.

The following files should be updated:

samplerE.cp (source code)
samplerE.h (source code)
SamplerE.exe (executable for Windows)
SamplerE (executable for Macintosh)
SamplerE.zip (all files, including the ones that were not changed)

Updated files can be downloaded here.

No changes are needed in the other files. The correction added about 20 lines of code to samplerE.cp and one function declaration to samplerE.h. To locate the changes in samplerE.cp, look for the comment "NOT IN THE ORIGINAL VERSION PUBLISHED IN ESA ECOLOGICAL ARCHIVES".

