Ecological Archives E086-131-A1

Timothy H. Keitt and Dean L. Urban. 2005. Scale-specific inference using wavelets. Ecology 86:2497–2504.

Appendix A. Simulation study of wavelet coefficient regression.

To illustrate why strong associations between variables are observed at some scales and not others, we consider how wavelet coefficients from sequences sharing similar patterns covary at different scales. We begin with a few definitions, following the notation of Percival and Guttorp (1994). Given a sequence $Y_{i}$, we can quantify the pattern exhibited at a particular scale $k$ in terms of its Allan variance

\begin{displaymath}\sigma_{Y}^{2}(k)\equiv\frac{1}{2}E\left\{ \left[\bar{Y}_{i}(k)-\bar{Y}_{i-k}(k)\right]^{2}\right\} .\end{displaymath}

(A.1)

where $\bar{Y}_{i}(k)=\frac{1}{k}\sum_{j=0}^{k-1}Y_{i-j}$. If a sequence has a large Allan variance at a particular scale $k$ and a small Allan variance at all other scales, then we say that the sequence is patchy or patterned at scale $k$. The Haar wavelet transform of the sequence $Y$ yields wavelet coefficients

\begin{displaymath}d_{j,k}\equiv\frac{1}{\sqrt{2k}}\left[\sum_{l=0}^{k-1}Y_{2jk-l}-\sum_{l=0}^{k-1}Y_{2jk-k-l}\right]\end{displaymath}

(A.2)

where $k$ is the scale of the transform and $j$ is the translation along the sequence. It is straightforward to show that

\begin{displaymath}\mathrm{var}\{ d_{j,k}\}=k\sigma_{Y}^{2}(k).\end{displaymath}

(A.3)

Thus, a large wavelet-coefficient variance at a given scale equates to a large Allan variance at that scale, which, by definition, indicates the presence of pattern at that scale. Similar arguments extend this result to other wavelet bases as well (Percival and Guttorp 1994).
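Equation A.3 can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes non-overlapping blocks, 0-based indexing, and that the coefficient variance is estimated as a mean square (i.e., zero-mean coefficients). The function names and the test series are illustrative.

```python
import math
import random

def block_mean(y, end, k):
    """Mean of the k values ending at 0-based index `end` (Y-bar in Eq. A.1)."""
    return sum(y[end - l] for l in range(k)) / k

def haar_coefficients(y, k):
    """Non-overlapping Haar coefficients d_{j,k} at scale k (Eq. A.2)."""
    coeffs = []
    for j in range(1, len(y) // (2 * k) + 1):
        end = 2 * j * k - 1                      # 0-based index of Y_{2jk}
        right = sum(y[end - l] for l in range(k))
        left = sum(y[end - k - l] for l in range(k))
        coeffs.append((right - left) / math.sqrt(2 * k))
    return coeffs

def allan_variance(y, k):
    """Sample Allan variance at scale k (Eq. A.1), using the same blocks."""
    squares = []
    for j in range(1, len(y) // (2 * k) + 1):
        end = 2 * j * k - 1
        diff = block_mean(y, end, k) - block_mean(y, end - k, k)
        squares.append(diff ** 2)
    return 0.5 * sum(squares) / len(squares)

# Illustrative series: a sine wave (period 32) plus Gaussian noise.
random.seed(1)
y = [math.sin(2 * math.pi * i / 32) + random.gauss(0, 0.5) for i in range(1024)]
for k in (1, 2, 4, 8, 16, 32, 64):
    d = haar_coefficients(y, k)
    wavelet_var = sum(c * c for c in d) / len(d)  # mean square, assuming E{d} = 0
    print(k, round(wavelet_var, 4), round(k * allan_variance(y, k), 4))
```

For each scale the two printed columns should agree up to rounding, because under these assumptions each squared coefficient equals $k/2$ times the corresponding squared difference of block means.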

We now consider the wavelet-covariance of two sequences patterned at the same scale. Let $X$ be a sequence of length $N$ with a large Allan variance at scale $m$ and small Allan variance at all other scales. Furthermore, let $Y$ be a sequence that inherits the pattern in $X$ through the relationship $Y=\beta X+\epsilon$, where $\epsilon$ represents uncorrelated random errors. We can partition the covariance relationship between $X$ and $Y$ into a hierarchy of scales $k\in\left\{ 2,4,8,\ldots,N/2\right\} $ via the wavelet transform $\Psi_{k}$. It follows that

\begin{displaymath}\begin{array}{rcl}
\mathrm{cov}\left\{ \Psi_{k}\left(Y\right),\Psi_{k}\left(X\right)\right\}  & = & \mathrm{cov}\left\{ \Psi_{k}\left(\beta X+\epsilon\right),\Psi_{k}\left(X\right)\right\} \\
 & = & \mathrm{cov}\left\{ \Psi_{k}\left(\beta X\right),\Psi_{k}\left(X\right)\right\} +\mathrm{cov}\left\{ \Psi_{k}\left(\beta X\right),\Psi_{k}\left(\epsilon\right)\right\} \\
 & = & \beta\,\mathrm{var}\left\{ \Psi_{k}\left(X\right)\right\} .
\end{array}\end{displaymath}

(A.4)

Hence, the wavelet-covariance of $Y$ and $X$ will reach its maximum when $k=m$, i.e., at exactly the scale at which the variance of the wavelet coefficients of $X$ is maximal and the scale at which $Y$ and $X$ exhibit a shared pattern. We can easily extend these results to multivariate regression. It is the increase in covariance at certain scales that generates scale-specific models in our analysis.
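The result in Eq. A.4 can be illustrated with a small simulation. This is a hedged sketch under stated assumptions: Haar coefficients as in Eq. A.2, moments estimated as mean cross-products (zero-mean coefficients), a sine wave of period 32 standing in for a sequence patterned at $m=16$, and $\beta=2$. None of these choices comes from the original analysis, and the function names are illustrative.

```python
import math
import random

def haar(y, k):
    """Non-overlapping Haar coefficients of y at scale k (Eq. A.2)."""
    out = []
    for j in range(1, len(y) // (2 * k) + 1):
        end = 2 * j * k - 1                      # 0-based index of Y_{2jk}
        right = sum(y[end - l] for l in range(k))
        left = sum(y[end - k - l] for l in range(k))
        out.append((right - left) / math.sqrt(2 * k))
    return out

def wavelet_cov(a, b, k):
    """Mean cross-product of Haar coefficients (zero-mean assumption)."""
    da, db = haar(a, k), haar(b, k)
    return sum(p * q for p, q in zip(da, db)) / len(da)

def scale_profile(x, y, scales):
    """Map each scale k to (cov{Psi_k(Y), Psi_k(X)}, var{Psi_k(X)})."""
    return {k: (wavelet_cov(y, x, k), wavelet_cov(x, x, k)) for k in scales}

random.seed(2)
beta, n = 2.0, 1024                               # assumed values
x = [math.sin(2 * math.pi * i / 32) for i in range(n)]
y = [beta * xi + random.gauss(0, 0.5) for xi in x]
scales = [2, 4, 8, 16, 32, 64]
profile = scale_profile(x, y, scales)
for k in scales:
    cov, var = profile[k]
    print(k, round(cov, 3), round(beta * var, 3))  # covariance vs. beta * variance
```

With noise present the two columns differ slightly, because the sample covariance between the coefficients of $X$ and $\epsilon$ is not exactly zero. Both should be largest at the scale where $X$ is patterned.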

To validate the approach of wavelet-coefficient regression (WCR), we applied the method to simulated data with known properties. In the following, we present these results.

We generated a test data set consisting of three independent variables and one dependent variable. The data are shown in Fig. A1. The three independent variables $X1$, $X2$, and $X3$ were generated as 1-dimensional periodic sequences, each with 1024 samples. We used a sine function to generate the values but gave each sequence a different period (8, 32, and 128 for the three variables, respectively). The dependent variable $Y$ was generated by summing the independent variables and adding random noise. We used normally distributed errors with a standard deviation equal to 0.5.
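The construction of the test data can be sketched as follows. The periods, sample size, and error standard deviation are those stated above; the unit amplitude, zero phase, random seed, and function name are assumptions made only for this illustration.

```python
import math
import random

def simulate(n=1024, periods=(8, 32, 128), noise_sd=0.5, seed=0):
    """Return ([X1, X2, X3], Y) with Y = X1 + X2 + X3 + Gaussian noise."""
    rng = random.Random(seed)                     # seed is an arbitrary choice
    xs = [[math.sin(2 * math.pi * i / p) for i in range(n)] for p in periods]
    y = [sum(col[i] for col in xs) + rng.gauss(0, noise_sd) for i in range(n)]
    return xs, y

xs, y = simulate()
print(len(xs), len(y))                            # three predictors, 1024 samples
```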

FIG. A1. Simulated test data. Independent variables $X1$ to $X3$ have periods of 8, 32, and 128 location units, respectively. The dependent variable $Y$ is the sum of the independent variables plus normally distributed errors with a standard deviation of 0.5 units.

 

We then wavelet transformed the simulated data using the Daubechies Least-Asymmetric wavelet (Daubechies 1992) of length 8, generating six new datasets representing the decomposition at the first six levels. As described in the main text, we then selected a linear model by minimizing the AIC score across stepwise removals and additions of the independent variables.
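The model selection step can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it uses ordinary least squares with an intercept, the Gaussian AIC up to an additive constant, $n\log(\mathrm{RSS}/n)+2p$, and a search that starts from the full model and toggles one variable at a time. Singular designs, which the authors handled by dropping the offending variable, are not handled here. All function names are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))   # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def aic(cols, y):
    """Gaussian AIC up to a constant: n*log(RSS/n) + 2*(number of terms)."""
    n = len(y)
    design = [[1.0] * n] + cols                   # intercept plus predictors
    xtx = [[sum(u * v for u, v in zip(a, b)) for b in design] for a in design]
    xty = [sum(u * v for u, v in zip(a, y)) for a in design]
    coef = solve(xtx, xty)
    rss = sum((yi - sum(c * col[i] for c, col in zip(coef, design))) ** 2
              for i, yi in enumerate(y))
    return n * math.log(rss / n) + 2 * len(design)

def stepwise(predictors, y):
    """Start from the full model; add or drop one term while AIC improves."""
    chosen = set(predictors)
    best = aic([predictors[k] for k in sorted(chosen)], y)
    while True:
        moves = [chosen ^ {name} for name in predictors]    # toggle each term
        scored = [(aic([predictors[k] for k in sorted(m)], y), m) for m in moves]
        score, move = min(scored, key=lambda t: t[0])
        if score >= best:
            return sorted(chosen), best
        chosen, best = move, score

# Illustrative data: only X1 carries signal.
random.seed(3)
x = {name: [random.gauss(0, 1) for _ in range(200)] for name in ("X1", "X2", "X3")}
y = [2.0 * a + random.gauss(0, 0.5) for a in x["X1"]]
print(stepwise(x, y))
```

In the setting described above, the columns passed to `stepwise` would be the wavelet coefficients of each variable at a single level, and the selection would be repeated level by level.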

Table A1 shows the results of the model selection and regression. Parameter estimates are given at each level of the transform; where an estimate is missing, that variable was omitted from the model by the model selection routine (in a couple of cases, the selected model generated a singularity in the estimation procedure, and so the variable responsible for the singularity was also dropped). The model selection approach clearly identifies the scales at which independent and dependent variables interact. Level 2 highlights the association with $X1$, Level 4 includes $X2$, and Level 6 shows a strong covariance between $Y$ and $X3$. Generally, the best fit occurs when the scale of the transform (defined as $2^{\ell}$, where $\ell$ is the transform level) is 1/2 the period of the input pattern. Often, there is also a weaker relationship at 1/4 the period, probably owing to resonance with the larger scale pattern.
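The half-period rule can be checked against the simulated periods directly. As a worked restatement, using the definition of scale as $2^{\ell}$ and writing $T$ (a symbol introduced here only as shorthand) for the period of the input pattern:

```latex
\begin{displaymath}
\begin{array}{lccc}
 & X1 & X2 & X3\\
\mbox{period } T & 8 & 32 & 128\\
T/2=2^{\ell} & 4\ (\ell=2) & 16\ (\ell=4) & 64\ (\ell=6)\\
T/4=2^{\ell} & 2\ (\ell=1) & 8\ (\ell=3) & 32\ (\ell=5)
\end{array}
\end{displaymath}
```

The half-period row gives the levels named above (2, 4, and 6). The quarter-period row gives levels 1, 3, and 5, which are the levels at which Table A1 reports the weaker single-variable fits.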

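TABLE A1. Regression results for simulated data.

\begin{tabular}{ccccccccc}
\hline
 & \multicolumn{2}{c}{$X1$} & \multicolumn{2}{c}{$X2$} & \multicolumn{2}{c}{$X3$} & \multicolumn{2}{c}{$F$}\\
Level & Estimate & $P$ & Estimate & $P$ & Estimate & $P$ & Value & $P$\\
\hline
0 & $0.95\pm0.02$ & $<2\times10^{-16}$ & $1.02\pm0.02$ & $<2\times10^{-16}$ & $1.01\pm0.02$ & $<2\times10^{-16}$ & 2086 & $<2.2\times10^{-16}$\\
1 & $0.93\pm0.21$ & $8\times10^{-6}$ &  &  &  &  & 20.36 & $8\times10^{-6}$\\
2 & $0.93\pm0.10$ & $<2\times10^{-16}$ & $4.44\pm2.56$ & 0.084 &  &  & 43.19 & $<2.2\times10^{-16}$\\
3 &  &  & $1.15\pm0.22$ & $<4.9\times10^{-7}$ &  &  & 28.32 & $<4.9\times10^{-7}$\\
4 &  &  & $1.01\pm0.03$ & $<2\times10^{-16}$ &  &  & 1608 & $<2.2\times10^{-16}$\\
5 &  &  &  &  & $0.78\pm0.21$ & 0.001 & 13.51 & 0.001\\
6 &  &  &  &  & $1.02\pm0.04$ & $3.72\times10^{-9}$ & 733.8 & $3.72\times10^{-9}$\\
\hline
\end{tabular}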

 

LITERATURE CITED

Daubechies, I. 1992. Ten lectures on wavelets. CBMS-NSF Regional Conference Series in Applied Mathematics, Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, USA.

Percival, D. B., and P. Guttorp. 1994. Long-memory processes, the Allan variance and wavelets. Pages 325–334 in E. Foufoula-Georgiou, editor. Wavelets in geophysics. Volume 4 of Wavelet analysis and its applications. Academic Press, San Diego, California, USA.


