Ecological Archives A014-023-A1

Elizabeth E. Holmes. 2004. Beyond theory to application and evaluation: diffusion approximations for population viability analysis. Ecological Applications 14:1272–1292.

Appendix A. Kalman filtering for maximum likelihood estimation given corrupted observations.

Introduction

The Kalman filter is used to extend likelihood estimation to cases with hidden states, such as when observations are corrupted and the true population size is unobserved.  The following algorithm is based on section 3.4 in Harvey (1989), which was used by Lindley (2003) for estimation for population processes.  The Kalman filter is well-known and widely used in engineering and computer science applications.  There are a multitude of books on the Kalman filter, including Harvey (1989).  One of the more penetrable introductions of the Kalman filter alone (but not on maximum likelihood estimation) is chapter 1 of Maybeck (1979).

 

The state-space model

The diffusion approximation for a stochastic exponential growth model can be written as a linear state space model (written in the notation familiar in the engineering literature):

(A.1)
   
(A.2)

where xt = log Nt is the true log population size and yt = log Ot is the log observations of the population size.  B is , the mean population growth rate.  Q is the 2, otherwise known as the process error or environmental variability.  R is the variability associated with sampling error or other non-process error.  Only yt, is observed; the underlying parameters, B, Q, and R, and the underlying true population size, xt, is hidden.  If we make the assumption that vt is normally distributed, then the model is a linear Gaussian state-space model.

We can calculate the probability of the observed time series, , as follows:

(A.3)

where  is the probability of yt given all the observations before time t and .  Denote the expected value of  as .  The conditional probability  is distributed normal with a mean  and some variance, denoted , which depends on the particular parameters, = {B, Q, R}, that generated the data.  Thus, the probability of the time series given a particular set of parameters, , is

(A.4)

from the probability density of a normal with mean  and variance .  The log likelihood of given the data, , is

+ a constant.
(A.5)

For Eq. A.5, we need estimates of = and =.  Observe from Eq. A.1 that

(A.6)

The Kalman filter below gives optimal estimates of  and  which are then used in Eq. A.6 to calculate the log likelihood of .

The maximum likelihood estimates of B, Q, and R are found by using some type of maximization routine on Eq. A.6 to find the set of parameters  = {B, Q, R} that maximize the likelihood.  Matlab code for this algorithm is given at the end of this appendix.

 

The Kalman filter

First, some notation:

The Kalman recursion:  Start at t = 1 and step forward to T.  Assume an initial  and initial  to start the recursion.  One could let these be free variables and find the maximum likelihood values when maximizing Eq. A.6, but that is not done here and the algorithm should not be very sensitive to these starting values.  At each step, compute:

This is the well-known Kalman filter, but it looks a little different than what you’ll see in engineering texts.  First generally it is assumed that yt is a series of measurements from multiple instruments, thus the Kalman filter is always written in matrix form.  Here since yt is one measurement, it can be written in scalar form.  Second, the Kalman filter is usually presented for the model  .  In this application, A = 1, C = 1 and ut = 1, so the filter is simplified quite a bit.

 

Literature cited

Harvey, A. C. 1991.  Forecasting, structural time series models and the Kalman filter.  Cambridge University Press, Cambridge, UK.

Lindley, S. T. 2003. Estimation of population growth and extinction parameters from noisy data. Ecological Applications 13:806–813.

Maybeck, P. S.  1979.  Stochastic models, estimation and control.  Volume 1.  Academic Press, New York, New York, USA.

 

Matlab code

function [mu,sigma2,sigma2np]=kalman_ests(data)

y=log(data);

%Start with some reasonable initial parameter estimates

muest=mean(y(2:end)-y(1:(end-1)));

tmp1=var(y(2:end)-y(1:(end-1)));

tmp4=var(y(5:end)-y(1:(end-4)));

sigma2est=(tmp4-tmp1)/3;

sigma2npest=(var(y(2:end)-y(1:(end-1)))-max(0.0001,sigma2est))/2;

pi1=max(0.0001,sigma2est)+max(0.0001,sigma2npest); %var of y(1)

%log transform the variances so the search algorithm doesn’t give negative

% variances

startvals=[muest log(max(0.0001,sigma2est)) log(max(0.0001,sigma2npest))];

%fminsearch is a Nelder-Mead minimization matlab function

a=fminsearch('kalman_loglik',startvals,[],y,y(1),pi1);

MLmuest=a(1);

MLsigma2est=exp(a(2));

MLsigma2npest=exp(a(3));

function negloglik = kalman_loglik(v,y,initx,V1)

T=length(y);

B = v(1); %mu

Q = exp(v(2)); %s2

R = exp(v(3)); %s2np

%initialize

xtt=zeros(1,T);  Vtt=zeros(1,T); xtt1=zeros(1,T); Vtt1=zeros(1,T);

xtT=zeros(1,T);  VtT=zeros(1,T); J=zeros(1,T); Vtt1T=zeros(1,T);

Ft=zeros(1,T); vt=zeros(1,T);

%forward pass gets you E[x(t) given y(1:t)]

x10=initx; 

V10=V1;

for(t=1:T)

   if(t==1)

      xtt1(1) = initx; %denotes x_1^0

      Vtt1(1) = V1; %denotes V_1^0

   else

      xtt1(t) = xtt(t-1) + B; %xtt1 denotes x_t^(t-1); Harvey 3.2.2a

      Vtt1(t) = Vtt(t-1) + Q; %Harvey 3.2.2b

   end

   Kt = Vtt1(t)/(Vtt1(t)+R);

   Ft(t) = Vtt1(t)+R;

   vt(t) = y(t)-xtt1(t);

   xtt(t) = xtt1(t) + Kt*(y(t) - xtt1(t)); %Harvey 3.2.3a

   Vtt(t) = Vtt1(t)-Kt*Vtt1(t); %Harvey 3.3.3b

end

%Calculate negative log likelihood

negloglik = (1/2)*sum(vt.^2./Ft) + (1/2)*sum(log(abs(Ft))) + (T/2)*log(2*pi);



[Back to A014-023]