16
pages
English
Documents
Obtenez un accès à la bibliothèque pour le consulter en ligne En savoir plus
Découvre YouScribe en t'inscrivant gratuitement
Découvre YouScribe en t'inscrivant gratuitement
16
pages
English
Documents
Obtenez un accès à la bibliothèque pour le consulter en ligne En savoir plus
Ergodic averages for monotone functions using upper
and lower dominating processes
Jesper Møller
Department of Mathematical Sciences, Aalborg University, Denmark.
Kerrie Mengersen
Department of Mathematical Sciences, Queensland University of Technology, Australia.
Summary. We show how the mean of a monotone function (defined on a state space equipped
with a partial ordering) can be estimated, using ergodic averages calculated from upper and
lower dominating processes of a stationary irreducible Markov chain. In particular, we do not
need to simulate the stationary Markov chain and we eliminate the problem of whether an
appropriate burn-in is determined or not. Moreover, when a central limit theorem applies, we
show how confidence intervals for the mean can be estimated by bounding the asymptotic
variance of the ergodic average based on the equilibrium chain. Our methods are studied in
detail for three models using Markov chain Monte Carlo methods and we also discuss various
types of other for which our methods apply.
Keywords: Asymptotic variance; Bayesian models; Burn-in; Ergodic average; Ising model;
Markov chain Monte Carlo; Mixture model; Monotonocity; Perfect simulation; Random walk;
Spatial models; Upper and lower dominating processes
1. Introduction
Suppose that is a given target distribution on a state space
and we wish to estimate
the mean Z
= (x)(dx) (1)
for a given real function . In many applications it is not known or at least not straight-
forward to generate a stationary chain, so instead a non-stationary chain Y ; Y : : : is gen-1 2
erated by Markov chain Monte Carlo (MCMC) and is estimated by the ergodic average
PN
(Y )=(N M), where M 0 is an \appropriate" burn-in and N M is \suf-tt=M+1
cien tly" large, (see, for example, Robert and Casella 2004). This estimator is consistent
provided the chain is irreducible and M is independent of the Y chain. The problem is to
determine M and N so that the estimator is close to with a high degree of con dence.
Propp and Wilson (1996) showed how upper and lower dominating processes can be
used for generating a perfect (or exact) simulation of a stationary Markov chain at a xed
time, provided the chain is monotone with respect to a partial ordering on
for which
there exists unique maximal and minimal states. In this paper we introduce similar ideas
but our aim is to obtain reliable estimates of mean values rather than perfect simulations.
More speci cally , we consider irreducible Markov chains with as the invariant distri-
bution and make the following additional assumptions. Let X = (X ; t = 1; 2; : : :) denotet
the possibly unknown equilibrium chain, i.e. X and hence X for all t 1, and1 t2 Møller and Mengersen
let
tX1 = (X )t s
t
s=1
denote the ergodic average estimating . Assume there exist stochastic processes U =
(U ; t = 1; 2; : : :) and L = (L ; t = 1; 2; : : :) so thatt t
L U ; t = 1; 2; : : :; (2)tt t
where the ergodic averages
t tX X1 1L U = (L ); = (U ) (3)s st tt t
s=1 s=1
are consistent estimators of . Though U and L will be Markov chains in most of our
detailed examples, they do not need to be so as exempli ed in Section 4.1 (explaining why
we write \processes"). To ensure (2) we assume that with respect to a partial ordering
on , U and L are bounding X, i.e.
L X U ; t = 1; 2; : : : ; (4)t t t
and is monotone
x y =) (x) (y) (5)
(or, as discussed later on, is a linear combination of monotone functions). Then (2) holds,
L U and so it su ces to consider the processes ( ; t = 1; 2; : : :) and ( ; t = 1; 2; : : :). Con-t t
sequently, we do not need to simulate the equilibrium chain and we eliminate the problem
of whether an appropriate burn-in is determined or not. Assuming a central limit theorem
applies, we show how con dence intervals for the mean can be estimated by bounding the
asymptotic variance of . Note also that to assess if the process ((X ); t = 1; 2; : : :) hast t
stabilised into equilibrium, it su ces to consider the processes ((L ); t = 1; 2; : : :) andt
((U ); t = 1; 2; : : :). Our methods are studied in detail for three models using MCMCt
methods and we also discuss various types of other models for which our methods apply.
Note that in contrast to the Propp-Wilson algorithm we do not require that U andt
L coalesce for all su cien tly large t. Equivalently, we do not require that X is uniformlyt
ergodic (Foss and Tweedie, 1998). For extensions of the Propp-Wilson algorithm which
may be of relevance for our methods, see the references in Section 5.
The paper is organised as follows. Section 2 presents our ideas in a simple setting for
a random walk, while Section 3 considers a general setting. Section 4 illustrates how our
methods apply on the Ising model and a mixture model in which the weights are unknown.
Finally, Section 5 discusses extensions and application areas of the methods.
2. A simple example
Despite its conceptual ease, the random walk example below is a challenging platform on
which to evaluate the performance of our proposed methods in Section 3.Ergodic averages for monotone functions 3
2.1. Upper and lower bounds for a random walk
Consider a stationary random walk X = (X ; t = 1; 2; : : :) on a nite state space
=t
f0; 1; : : :; kg with transition probabilities
p = P(X = minfi + 1; kgjX = i) > 0;i t+1 t
q = P(X = maxfi 1; 0gjX = i) = 1 p > 0;i t+1 t i
for i = 0; 1; : : :; k, and invariant distribution = ( ; ; : : :; ) given by0 1 k
i 1Y
= p =q ; i = 1; : : :; k:i 0 j j+1
j=0
We can construct this by a so-called stochastic recursive sequence (SRS). Let X ; R ; R ; : : :1 1 2
be independent random variables with X and R Uniform [0; 1], t = 0; 1; : : :. De ne1 t
a so-called updating function :
[0; 1]!
by
(
minfi + 1; kg if r pi
(i; r) =
maxfi 1; 0g otherwise:
Then the SRS is given by
X = (X ; R ); t = 1; 2; : : ::t+1 t t
This construction allows us to bound the equilibrium chain by an upper chain U =
(U ; t = 1; 2; : : :) and a lower chain L = (L ; t = 1; 2; : : :) de ned byt t
U = k; U = (U ; R ); t = 1; 2; : : :;1 t+1 t t
L = 0; L = (L ; R ); t = 1; 2; : : : :1 t+1 t t
Thereby
L X U ; t = 1; 2; : : :; (6)t t t
and hence also for the ergodic averages
t t tX X X1 1 1 L = L ; X = X ; U = U ;t s t s t s
t t t
s=1 s=0 s=0
we have that
L X U ; t = 1; 2; : : :: (7)t t t
By irreducibility, as t grows, L and U converge to . Note that (4) and (5) are satis edt t
with given by and the identity function. Indeed (4)-(7) are satis ed if we replace X
by any Markov chain Y using the same coupling construction as above, i.e. when Y 2
is1
an arbitrary initial state and Y = (Y ; R ); t = 1; 2; : : :.t+1 t t4 Møller and Mengersen
2.2. Bounding the asymptotic variance for the ergodic average
PkIn this simple example, the mean = i is easily determined, and so there is no needii=1
for estimating it by an ergodic average. Moreover, it is of course easy to generate X from1
, and hence to generate X . However, in more complex situations as considered later int
Sections 3-5, the mean value of interest is unknown and it is usually hard to make a draw
from the invariant distribution. We can instead generate the upper and lower chains and
use (7) (or the extensions considered in the following sections) together with the following
considerations.
Since X is ergodic and
is nite, a central limit theorem (CLT) applies:
p
2t(X ) converges in distribution to Normal(0; ) as t!1 (8)t
where
1X
2
= <1; = Cov(X ; X ): (9)jtj t 1 t+1
t= 1
2We estimate using for example a window type estimator (Geyer, 1992) or batch means
(Ripley, 1987). For speci cit y, we consider in the sequel a window type estimator
mX
2^ = ^ (10)N jtj;N
t= m
based on X ; : : :; X , but similar considerations will apply for batch means. Here1 N
N tX1 ^ = (X X )(X X ); (11)t;N s+t N s N
N
s=1
see, for example, Priestly (1981, pp. 323-324). Geyer’s initial series estimator is given by
letting m = 2l + 1 where l is the largest integer so that the sequence ^ + ^ ,2t;N 2t+1;N
t = 0; : : :; l, is strictly positive, strictly decreasing and strictly convex. For an irreducible
2and reversible Markov chain this provides a consistent conservative estimator of , cf.
2Geyer (1992). By (6), (7) and (11), ^ is bounded from above and below byN
m mX X
2 2^ = a ; ^ = b ; (12)jtj;N jtj;Nmax;N min;N
t= m t= m
where for t 0,
N tX1 2 a = (U U L L L L + U )t;N s+t s s+t N s N N
N
s=1
and
N tX1 2 b = (L L U U U U + L )t;N s+t s s+t N s N NN
s=1
2are upper and lower bounds on ^ . As illustrated below, though ^ is more conser-t;N max;N
2vative than ^ , it can still provide a useful bound.NErgodic averages for monotone functions 5
2.3. Experimental results: Random walk
We illustrate the di culties with using these ergodic averages and the bounds of asymptotic
variances with a random walk when p = p is constant. Further experimental results arei
given in Section 3.3. The running mean X and corresponding upper and lower bounds Ut t
and L ; t = 1; : : :; N are shown in the left plot of Figure 1 for a run length of N = 10000t
iterations when k = 5 and p = 1=2, i.e. = 2:5. Corresponding upper and lower bounds on