Section 5.7: Monte Carlo Method for Computing Integrals

Example: Proportion of Heavy Sleepers

What proportion of college students get at least eight hours of sleep? In Section 2.4, we used a \(Beta(\alpha=3.26, \beta=7.19)\) prior; since 11 of the 27 students in the sample slept a sufficient number of hours, the posterior distribution for \(p\) is \(Beta(\alpha=14.26, \beta=23.19)\). Suppose we are interested in the posterior mean of \(p^2\). (This is the predictive probability that two students in a future sample will both be heavy sleepers.)

p=rbeta(1000, 14.26, 23.19)     # 1000 simulated draws from the Beta(14.26, 23.19) posterior
est=mean(p^2)                   # Monte Carlo estimate of E(p^2 | data)
se=sd(p^2)/sqrt(1000)           # simulation standard error of the estimate
c(est,se)
## [1] 0.149478172 0.001871916

The Monte Carlo estimate of \(E(p^2|data)\) is given by ‘est’, with an associated simulation standard error ‘se’.
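As a check, the exact value is available in closed form here: for a \(Beta(\alpha,\beta)\) density, \(E(p^2)=\alpha(\alpha+1)/[(\alpha+\beta)(\alpha+\beta+1)]\), which for the posterior parameters is about 0.1511, consistent with the simulation estimate above.

a=14.26; b=23.19
a*(a+1)/((a+b)*(a+b+1))     # exact posterior mean of p^2 (about 0.1511)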

Section 5.8: Rejection Sampling

A general-purpose algorithm for simulating random draws from a given probability distribution is rejection sampling. In this setting, suppose we wish to produce an independent sample from a posterior density \(g(\theta|y)\) whose normalizing constant may not be known. The first step in rejection sampling is to find another probability density \(p(\theta)\) such that:

• It is easy to simulate draws from \(p\).

• The density \(p\) resembles the posterior density of interest \(g\) in terms of location and spread.

• There exists a constant \(c\) such that \(g(\theta|y) \le c\,p(\theta)\) for all \(\theta\).

Suppose we are able to find a density \(p\) with these properties. Then one obtains draws from \(g\) using the following accept/reject algorithm:

  1. Independently simulate \(\theta\) from \(p\) and a uniform random variable \(U\) on the unit interval.
  2. If \(U \le g(\theta|y)/(cp(\theta))\), then accept \(\theta\) as a draw from the density \(g\); otherwise reject \(\theta\).
  3. Continue steps 1 and 2 of the algorithm until one has collected a sufficient number of “accepted” \(\theta\).
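The steps above can be written as a small generic R function. This is only a sketch: the function name reject.sample and its arguments (target density g, proposal sampler rprop, proposal density dprop, bounding constant cc) are illustrative, not part of LearnBayes.

# Generic accept/reject sketch: g = target density, rprop/dprop = proposal
# sampler and density, cc = constant with g(x) <= cc * dprop(x) for all x
reject.sample=function(n, g, rprop, dprop, cc)
{
  draws=numeric(0)
  while (length(draws) < n)
  {
    cand=rprop(n)                         # step 1: candidates from the proposal, plus uniforms
    u=runif(n)
    keep=u <= g(cand)/(cc*dprop(cand))    # step 2: accept with probability g/(c*p)
    draws=c(draws, cand[keep])
  }
  draws[1:n]                              # step 3: stop once enough draws have been accepted
}

With the triangular density defined below, the call reject.sample(1000, triangular, runif, dunif, 2) would reproduce the uniform-proposal example that follows.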

For example, suppose we want to simulate from the density \(f(x)=4x\) when \(0 \le x \le .5\) and \(f(x)=4(1-x)\) when \(.5 < x \le 1\).

triangular=function(x) {ifelse(x<0|x>1, 0, ifelse(x<=.5, 4*x, 4*(1-x)))}     # triangular density on [0,1], peak of 2 at x = 0.5
curve(triangular(x),0,1)

## Using Uniform Distribution as the proposal density
iter=100000
u1=runif(iter)                       # candidate draws from the Unif(0,1) proposal
u2=runif(iter)                       # uniforms for the accept/reject test
prob.accept=triangular(u1)/2         # g(x)/(c*p(x)) with p = Unif(0,1) and c = 2

accepted=u1[u2<=prob.accept]         # keep the candidates that pass the test
plot(density(accepted))

length(accepted)/iter     # acceptance rate; theoretically 1/c = 1/2
## [1] 0.49836
## Using Normal Distribution as the proposal density
curve(dnorm(x, mean=.5, sd=.2),0,1)            # the bare normal density does not dominate the target
curve(triangular(x),0,1,add=T)

curve(1.3*dnorm(x, mean=.5, sd=.2),0,1)        # candidate envelope: c = 1.3, sd = 0.2
curve(triangular(x),0,1,add=T)

curve(1.3*dnorm(x, mean=.5, sd=.25),0,1)       # candidate envelope: c = 1.3, sd = 0.25
curve(triangular(x),0,1,add=T)
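Before running the sampler, it is worth verifying numerically that the chosen envelope \(cp(x)\) actually dominates the triangular density on \([0,1]\). A quick check (a sketch) on a fine grid:

xs=seq(0,1,length=1001)
max(triangular(xs)-1.3*dnorm(xs,mean=.5,sd=.2))     # positive: this envelope dips below the target
max(triangular(xs)-1.3*dnorm(xs,mean=.5,sd=.25))    # non-positive: this envelope dominates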

iter=100000
u1=rnorm(iter,mean=.5,sd=.25)        # candidate draws from the N(0.5, 0.25^2) proposal
u2=runif(iter)
prob.accept=triangular(u1)/(1.3*dnorm(u1,mean=.5,sd=.25))     # g(x)/(c*p(x)) with c = 1.3

accepted=u1[u2<=prob.accept]
plot(density(accepted))

length(accepted)/iter     # acceptance rate; theoretically 1/c = 1/1.3, about 0.77
## [1] 0.77019

Returning to the beta-binomial example, consider using rejection sampling to simulate draws of \(\theta = (logit(\eta), log(K))\). The function betabinexch below computes the log posterior of \(\theta\), and laplace finds the posterior mode and associated variance-covariance matrix, which we use to build the proposal density.

library(LearnBayes)
data(cancermortality)

# Log posterior of theta = (logit(eta), log K) for the exchangeable beta-binomial model
betabinexch=function (theta, data)
{
  eta = exp(theta[1])/(1 + exp(theta[1]))
  K = exp(theta[2])
  y = data[, 1]
  n = data[, 2]
  N = length(y)
  # beta-binomial log-likelihood contribution of one observation (y, n)
  logf = function(y, n, K, eta) lbeta(K * eta + y, K * (1-eta) + n - y) - lbeta(K * eta, K * (1 - eta))
  val = sum(logf(y, n, K, eta))
  # log prior term for K: log[K/(1+K)^2] with K = exp(theta[2])
  val = val + theta[2] - 2 * log(1 + exp(theta[2]))
  return(val)
}
fit=laplace(betabinexch,c(-7,6),cancermortality)

Instead of using the multivariate normal approximation as the proposal density, we use the multivariate t density, which has heavier tails. The function betabinT returns the difference between the log posterior density and the log proposal density; maximizing this difference over \(\theta\) gives the log of the bounding constant \(d\) needed in the rejection step.

betabinT=function(theta,datapar)
{
  data=datapar$data
  tpar=datapar$par
  # log posterior minus log multivariate t proposal density (dmt)
  d=betabinexch(theta,data)-dmt(theta,mean=c(tpar$m),S=tpar$var,df=tpar$df,log=TRUE)
  return(d)
}

We define the parameters of the t proposal density and the list datapar:

tpar=list(m=fit$mode,var=2*fit$var,df=4)     # t proposal: centered at the mode, inflated scale, 4 df
datapar=list(data=cancermortality,par=tpar)

We run the function laplace with this new function, using an “intelligent” starting value, to locate the maximum of \(d\):

start=c(-6.9,12.4)
fit1=laplace(betabinT,start,datapar)
fit1$mode
## [1] -6.888963 12.421993

We find that the maximum value of \(d\) occurs at \(\theta\) = (−6.889, 12.422).

The maximum value of \(d\) is then found by evaluating the function at this modal value:

betabinT(fit1$mode,datapar)
## [1] -569.2829

We implement the rejection sampling procedure with the LearnBayes function rejectsampling:

theta=rejectsampling(betabinexch,tpar,-569.2813,10000,cancermortality)     # log posterior, t parameters, dmax, number of proposals, data
dim(theta)
## [1] 2377    2
mycontour(betabinexch,c(-8,-4.5,3,16.5),cancermortality,xlab="logit eta",ylab="log K")
points(theta[,1],theta[,2])

logit.eta.90=quantile(theta[,1],c(.05,.5,.95))     # median and 90% interval for logit(eta)
exp(logit.eta.90)/(1+exp(logit.eta.90))            # transformed back to the mortality-rate scale
##           5%          50%          95% 
## 0.0006988362 0.0010643223 0.0017706640
mean(cancermortality[,1]/cancermortality[,2])      # Average mortality rate
## [1] 0.001100409
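A similar summary for \(logK\) can be read off the second column of the simulated draws (a sketch; the output is omitted since it varies from simulation to simulation):

quantile(theta[,2],c(.05,.5,.95))     # median and 90% interval for log K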

Section 5.9: Importance Sampling
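Importance sampling estimates a posterior expectation \(E[h(\theta)|data]\) by simulating from a convenient proposal density \(p\) and reweighting the draws:

\[E[h(\theta)|data] = \int h(\theta) g(\theta|data) d\theta = \int h(\theta) \frac{g(\theta|data)}{p(\theta)} p(\theta) d\theta \approx \frac{1}{N}\sum_{i=1}^{N} h(\theta_i) w(\theta_i),\]

where \(\theta_1, \ldots, \theta_N\) are draws from \(p\) and \(w(\theta) = g(\theta|data)/p(\theta)\) is the importance weight. When \(g\) is known only up to a normalizing constant, the ratio estimate \(\sum h(\theta_i)w(\theta_i) / \sum w(\theta_i)\) is used instead.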

For example, suppose we want to find the expected value of \(\theta^2\), where \(\theta\) follows the triangular density described above. Analytically, we can calculate the appropriate integrals as follows:

e1=integrate(function(x){4*x^3},0,.5)           # integral of x^2 * 4x over [0, 0.5]
e2=integrate(function(x){4*x^2-4*x^3},.5,1)     # integral of x^2 * 4(1-x) over [0.5, 1]
e1$value + e2$value
## [1] 0.2916667

Or we can estimate it using the Importance Sampling procedure:

triangular=function(x){ifelse(x<0|x>1, 0, ifelse(x<=.5, 4*x, 4*(1-x)))}
temp=rnorm(10000,mean=.5,sd=.25)                  # draws from the N(0.5, 0.25^2) proposal
w=triangular(temp)/dnorm(temp,mean=.5,sd=.25)     # importance weights w = g/p
mean(temp^2*w)                                    # importance sampling estimate of E(theta^2)
## [1] 0.2929191
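Because this estimate is just a sample mean of the terms \(\theta_i^2 w(\theta_i)\), its simulation standard error can be estimated in the usual way (a sketch):

sd(temp^2*w)/sqrt(10000)     # simulation standard error of the importance sampling estimate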

To illustrate the use of different proposal densities in importance sampling in our beta-binomial example, consider the problem of estimating the posterior mean of a function of \(\theta_2 = logK\) conditional on a value of \(\theta_1 = logit(\eta)\). The posterior density of \(\theta_2\), conditional on \(\theta_1\), is given by

\[g_1(\theta_2|data, \theta_1) \propto \frac{K}{(1+K)^2} \prod_{j=1}^{20} \frac{B(K\eta+y_j,K(1-\eta)+n_j-y_j)}{B(K\eta,K(1-\eta))} \]

where \(\eta = exp(\theta_1)/(1+exp(\theta_1))\) and \(K=exp(\theta_2)\). In the following, we write the function betabinexch.cond to compute this posterior density conditional on the value \(\theta_1=−6.818793\).

betabinexch.cond=function (log.K, data)
{
  eta = exp(-6.818793)/(1 + exp(-6.818793))     # eta fixed at the conditioning value of theta_1
  K = exp(log.K)
  y = data[, 1]; n = data[, 2]; N = length(y)
  logf=0*log.K
  for (j in 1:length(y))
  {logf = logf + lbeta(K * eta + y[j], K * (1 - eta) + n[j] - y[j]) - lbeta(K * eta, K * (1 - eta))}
  val = logf + log.K - 2 * log(1 + K)
  # rescale by the maximum to avoid numerical underflow; this arbitrary constant
  # cancels once we divide by the normalizing integral I below
  return(exp(val-max(val)))
}

To compute the posterior mean of \(logK\) for the cancer mortality data, suppose first that we let the proposal density \(p\) be normal with mean 8 and standard deviation 2. The plots below compare the exact (normalized) conditional density with the proposal and display the weight function \(g/p\); for a well-chosen proposal, the weights should be bounded and well behaved in the tails.

I=integrate(betabinexch.cond,2,16,cancermortality)     # normalizing constant of the conditional density
par(mfrow=c(1,2))     # display two graphs side by side
curve(betabinexch.cond(x,cancermortality)/I$value,from=3,to=16, ylab="Density", xlab="log K",lwd=3, main="Densities")
curve(dnorm(x,8,2),add=TRUE)
legend("topright",legend=c("Exact","Normal"),lwd=c(3,1))
curve(betabinexch.cond(x,cancermortality)/I$value/ dnorm(x,8,2),from=3,to=16, ylab="Weight",xlab="log K", main="Weight = g/p")

Using the t density as the proposal density. Because the t has flatter tails than the normal, the weight function \(g/p\) is better behaved in the extremes. (Note that dt and rt with ncp=8 below refer to the noncentral t distribution with noncentrality parameter 8, not a central t density relocated to 8; it remains a valid proposal because the same distribution is used both for simulation with rt and for the weights with dt.)

I=integrate(betabinexch.cond,2,16,cancermortality)
par(mfrow=c(1,2))     # Displays multiple graphs 
curve(betabinexch.cond(x,cancermortality)/I$value,from=3,to=16, ylab="Density", xlab="log K",lwd=3, main="Densities")
curve(dt(x,ncp=8, df=12),add=TRUE)
legend("topright",legend=c("Exact","T"),lwd=c(3,1))
curve(betabinexch.cond(x,cancermortality)/I$value/ dt(x,ncp=8,df=12),from=3,to=16, ylab="Weight",xlab="log K", main="Weight = g/p")

Suppose we want to estimate the mean of \(logK\). The importance sampling estimate can then be computed as follows:

sim.t=rt(10000,ncp=8,df=12)                                                  # draws from the noncentral t proposal
wt=betabinexch.cond(sim.t,cancermortality)/I$value/ dt(sim.t,ncp=8,df=12)    # importance weights g/p
his_bar=sum((sim.t)*wt)/sum(wt)                                              # weighted (ratio) estimate of E(log K | theta_1)
his_bar
## [1] 8.240172
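One common estimate of the simulation standard error of this weighted (ratio) estimate is the following (a sketch, based on a delta-method approximation):

sqrt(sum((sim.t-his_bar)^2*wt^2))/sum(wt)     # approximate standard error of the ratio estimate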

Sampling Importance Resampling (SIR)

Sampling importance resampling (SIR) is useful for simulating values from the posterior distribution using values simulated from a convenient proposal density (such as the prior density): we simulate a large number of draws from the proposal, compute the importance weight of each draw, and then resample from these draws with replacement, with probabilities proportional to the weights.

Consider the triangular density example.

triangular=function(x){ifelse(x<0|x>1, 0, ifelse(x<=.5, 4*x, 4*(1-x)))}
temp=rnorm(10000,mean=.5,sd=.25)                      # draws from the normal proposal
w=triangular(temp)/dnorm(temp,mean=.5,sd=.25)         # importance weights
pr=w/sum(w)                                           # resampling probabilities
sim.tri=sample(temp,size=12000,prob=pr,replace=T)     # resample with replacement
plot(density(sim.tri))
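As a quick check (a sketch), the exact triangular density can be overlaid on the density estimate of the resampled values:

curve(triangular(x),add=TRUE,lwd=3)     # exact target density for comparison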

Return now to our previous example, where we study \(logK\) when \(\theta_1=−6.818793\). We can simulate from the conditional distribution of \(\theta_2\) given \(\theta_1=−6.818793\) using SIR:

sim.t=rt(10000,ncp=8,df=12)                                                  # draws from the noncentral t proposal
wt=betabinexch.cond(sim.t,cancermortality)/I$value/ dt(sim.t,ncp=8,df=12)    # importance weights
probs=wt/sum(wt)                                                             # resampling probabilities
sim.logK=sample(sim.t,size=12000,prob=probs,replace=T)                       # resample with replacement
plot(density(sim.logK))

mean(sim.logK)
## [1] 8.320694