Erle C. Ellis, Rong Gang Li, Lin Zhang Yang and Xu Cheng. 2000. Long-term Change in Village-Scale Ecosystems in China Using Landscape and Statistical Methods. Ecological Applications 10: 1057-1073.

Ecological Archives A-010-006

Data Quality Pedigree Calculator.
Notes on the Pedigree Calculation Algorithm

This document describes the algorithm for calculating data quality pedigrees using the Microsoft Excel® Visual Basic program referenced in Ellis et al. (2000) and supplied in the Microsoft Excel® 97 workbook file pedigree.xls.  Application of pedigree.xls is described in the file pedigree.doc.  Parameters used in pedigree calculations are described in Table 1; the algorithm is outlined in Table 2, in FIG 1, and in the text below.

When variables are added, the pedigree for their sum is the average of their individual pedigrees, weighted by the mean value of each variable (the mean weighted average, MWA, Table 1).  When variables are multiplied or divided, a “weak-link” principle is applied, and the minimum pedigree of all variables in the calculation is used as the pedigree of the result.  Pedigrees for subtraction and weighted averaging depend on whether the spread (S, Table 1) of the variables is similar to the difference between them (D, Table 1).
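The addition and multiplication rules can be sketched in a few lines of Python.  This is an illustrative sketch only, not the Visual Basic code in pedigree.xls: the function names, the representation of a pedigree as a tuple of three scores, and the example values are assumptions made here.  The MWA is assumed to weight each pedigree by the mean of its variable, following the MWA definition in Table 1.

```python
from typing import Sequence, Tuple

# Assumption: a pedigree is three scores, each processed independently.
Pedigree = Tuple[float, float, float]


def mean_weighted_average(pedigrees: Sequence[Pedigree],
                          means: Sequence[float]) -> Pedigree:
    """Average each score across variables, weighted by the variable means."""
    total = sum(means)
    return tuple(
        sum(p[k] * m for p, m in zip(pedigrees, means)) / total
        for k in range(3)
    )


def weak_link(pedigrees: Sequence[Pedigree]) -> Pedigree:
    """Minimum of each score across variables (multiplication, division)."""
    return tuple(min(p[k] for p in pedigrees) for k in range(3))


# Hypothetical example: two variables with means 30 and 10.
pedigrees = [(4.0, 3.0, 2.0), (2.0, 3.0, 4.0)]
means = [30.0, 10.0]
print(mean_weighted_average(pedigrees, means))  # (3.5, 3.0, 2.5)
print(weak_link(pedigrees))                     # (2.0, 3.0, 2.0)
```

In this example the variable with the larger mean dominates the pedigree of the sum, while the product takes the lowest score in each position.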

When the spread of variables is similar in magnitude to their difference, as tested by the inequality D/S < 5, then subtraction of the variables yields a result that is more uncertain than the original variables themselves.  In this case, the pedigree for the results of subtraction should be decreased below the MWA as described in Table 2 and FIG 1.  On the other hand, when multiple estimates for a variable are combined using weighted averaging, and D/S < 5, this indicates that the estimates agree, so that the pedigree for their weighted average should be increased above the MWA of the estimates.  When calculating pedigree upgrades for weighted averages, we use the difference between the MWA and the maximum possible pedigree (MAP, Table 1), so that lower pedigrees are increased more than higher ones.  A slight, 5% per variable, pedigree increase is granted to all weighted averages, regardless of whether D/S < 5, so that data quality always increases when multiple independent estimates are used.  Absolute limits are set on pedigree increases for weighted averages, with a maximum increase of 1 for inverse variance weighting and 0.5 for subjective weighting (equal weights or not), because the former represents higher quality estimates.  In all calculations, each of the three scores within each pedigree is calculated independently.   
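The subtraction downgrade described above can be sketched as follows, using the D/S thresholds listed in Table 2 (section III).  This is a hedged illustration rather than the program's own code.  Two assumptions are made here: "reduce the MWA by 50%" is read as multiplying each score by 0.5, and the boundary values D/S = 2 and D/S = 5, which the rules do not assign to a case, are treated as the end points of the linear formula (which gives 50% and 0% there).

```python
from typing import Tuple

Pedigree = Tuple[float, float, float]


def subtraction_reduction(d_over_s: float) -> float:
    """Fraction by which the MWA is reduced for a subtraction."""
    if d_over_s >= 5.0:      # difference much larger than spread: no penalty
        return 0.0
    if d_over_s <= 2.0:      # difference similar to spread: full 50% penalty
        return 0.5
    return (5.0 - d_over_s) * 0.5 / 3.0   # linear between 2 and 5


def subtraction_pedigree(mwa: Pedigree, d_over_s: float) -> Pedigree:
    """Apply the reduction to each of the three scores independently."""
    factor = 1.0 - subtraction_reduction(d_over_s)
    return tuple(score * factor for score in mwa)


# Hypothetical example: MWA of {4,3,2} and D/S = 3.5 gives a 25% reduction.
print(subtraction_pedigree((4.0, 3.0, 2.0), 3.5))  # (3.0, 2.25, 1.5)
```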


TABLE 1: Variables used by the Pedigree Calculation Algorithm.



D: The difference between variables.

m is the number of variables, and x̄i is the mean of the ith variable in the calculation.

[Equation for D; image not available]

The number of variables in a calculation.  Used in FIG 1. 


The average spread of variables. 

S = 2 × Ã 


MWA: The mean weighted average pedigree.  Pi = pedigree of the ith variable, x̄i = mean of the ith variable, and n = the total number of variables in the calculation.

MWA = Σ(Pi × x̄i) / Σ x̄i, with both sums over i = 1 to n



Weighted average of pedigrees. 

Pi = pedigree of the ith variable, Wi = weight of the ith variable, and n = the total number of variables in the calculation.

Weighted average = Σ(Pi × Wi) / Σ Wi, with both sums over i = 1 to n



The maximum possible pedigree. 

MAP = {4,4,4} 


TABLE 2: Rules for calculating pedigrees

I.  Multiplication & Division
A.  Use the minimum of all variables.

II.  Addition (simple, no comparisons, averaging, etc.)
A.  Use the MWA pedigree of the variables.

III.  Subtraction (simple).  Decrease the pedigrees when the difference between variables is similar to their spread, as judged from D/S:
A.  IF D/S > 5, use the MWA.
B.  IF D/S < 2, reduce the MWA by 50%.
C.  IF 5 > D/S > 2, reduce the MWA by: [(5 - D/S) × 50% / 3].

IV.  Weighted Averages 
A.  When multiple estimates are averaged, increase the MWA by the number of estimates × 5% × the difference between the MWA and the maximum possible pedigree, MAP = {4,4,4}.
B.  When multiple independent estimates are combined using inverse variance weights ("_Vweight" designation in variable name), increase the pedigree when mean values are similar, as judged from D/S:
1.  IF D/S > 5, no change in pedigree, use the MWA.
2.  IF D/S < 2, increase the MWA by 50% × the difference between the MWA and the MAP.
3.  IF 5 > D/S > 2, increase the MWA by: [(5 - D/S) × 50% / 3] × the difference between the MWA and the MAP.
4.  Limit the total increase to 1 unit per score.
C.  When multiple independent estimates are combined using subjective or other weights ("_Sweight" designation in variable name):
1.  Apply method B, above.
2.  Limit the total pedigree increase to a maximum of 0.5 units per score.
D.  Use the maximum pedigree calculated by methods A, B, or C.
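
The weighted-average rules (section IV) can be sketched in Python as shown below.  This is one possible reading of the rules, not the code in pedigree.xls, and it rests on several assumptions: the function and constant names are invented here; the interpolated increase for 2 < D/S < 5 is scaled by the difference between the MAP and the MWA, so that it joins the 50% rule at D/S = 2; the boundary values D/S = 2 and D/S = 5 are treated as the end points of that interpolation; and the caps of 1 and 0.5 are applied only to the agreement-based increase (methods B and C), as the table is laid out.

```python
from typing import Tuple

Pedigree = Tuple[float, float, float]

MAP: Pedigree = (4.0, 4.0, 4.0)              # maximum possible pedigree
CAPS = {"_Vweight": 1.0, "_Sweight": 0.5}    # maximum increase per score


def agreement_fraction(d_over_s: float) -> float:
    """Fraction of the gap to MAP granted when estimates agree."""
    if d_over_s >= 5.0:
        return 0.0
    if d_over_s <= 2.0:
        return 0.5
    return (5.0 - d_over_s) * 0.5 / 3.0


def weighted_average_pedigree(mwa: Pedigree, n_estimates: int,
                              d_over_s: float, weighting: str) -> Pedigree:
    """Upgrade the MWA of several estimates, score by score."""
    cap = CAPS[weighting]
    frac = agreement_fraction(d_over_s)
    result = []
    for score, top in zip(mwa, MAP):
        gap = top - score
        baseline = n_estimates * 0.05 * gap           # method A
        agreement = min(frac * gap, cap)              # method B or C, capped
        result.append(score + max(baseline, agreement))  # rule D
    return tuple(result)


# Hypothetical example: two estimates that agree closely (D/S = 1).
print(weighted_average_pedigree((2.0, 3.0, 4.0), 2, 1.0, "_Vweight"))  # (3.0, 3.5, 4.0)
print(weighted_average_pedigree((2.0, 3.0, 4.0), 2, 1.0, "_Sweight"))  # (2.5, 3.5, 4.0)
```

In the example, the first score has the largest gap to the MAP and so gains the most, until the cap for the weighting type takes over; a score already at 4 is unchanged.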



FIG 1: Flowchart of the Pedigree Calculation Algorithm.