We consider data from a study (Belenky et al., 2003) of reaction time under a variety of sleep deprivation scenarios. We will focus on the baseline values of reaction time, taken after study participants had a normal night’s sleep. Suppose prior research has established a mean reaction time to a visual stimulus of 275 ms, and we wish to determine whether the values in our study are consistent with this value. A first step is to read and plot the data.
```{r}
library(tidyverse)
library(lme4) # lme4 package contains sleep study data
data(sleepstudy)
baseline <- sleepstudy %>%
  filter(Days == 2) # Day 2 is baseline, days 0-1 are training
baseline %>%
  ggplot(aes(x = Reaction)) +
  geom_histogram() +
  labs(x = "Reaction Time (ms)") +
  geom_vline(xintercept = 275, col = "red")
```
Analysis of Reaction Time
We can use the `brm()` function from the R package brms to analyze the data. One advantage of this package is that it easily gives us access to more sophisticated sampling algorithms beyond Gibbs sampling (by default, Stan's no-U-turn sampler, NUTS).
Suppose our data model is \(Y \sim N(\mu, \sigma^2),\) and we wish to evaluate whether \(\mu\) is consistent with a reaction time of 275ms. We previously studied the conjugate normal-inverse gamma specification, given by \[\mu \mid \sigma^2 \sim N\bigg(\theta,\frac{\sigma^2}{n_0}\bigg), ~~~\frac{1}{\sigma^2} \sim \text{gamma}\bigg(\frac{\nu_0}{2},\frac{\nu_0\sigma^2_0}{2}\bigg).\] While this prior has a nice interpretation in terms of prior samples, it’s not the only option.
Prior for the Mean
Instead, we consider a relatively non-informative prior for our mean, \(\mu \sim N(300,50^2)\). A normal distribution puts about 95% of its mass within 2 SDs of its mean, so this prior concentrates most of its mass roughly on the range 200-400 ms.
Prior for the Standard Deviation
We will specify a prior directly on the standard deviation \(\sigma\) as \(\sigma \sim \text{HalfCauchy}(0,40)\), a Cauchy distribution centered at 0 with scale parameter 40, called “half” because only the positive values are used. If you haven’t seen the Cauchy distribution, it’s a special case of a t distribution with 1 degree of freedom; its fat tails allow substantial mass far from its center. We plot it here.
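A minimal sketch of that plot: the half-Cauchy density on \([0,\infty)\) is simply twice the Cauchy density, since the negative half is folded away (the helper name `half_cauchy` is ours, not a built-in).

```r
library(ggplot2)

# Half-Cauchy(0, 40) density: twice the Cauchy density, restricted to [0, Inf)
half_cauchy <- function(x, scale = 40) 2 * dcauchy(x, location = 0, scale = scale)

ggplot(data.frame(x = c(0, 300)), aes(x)) +
  stat_function(fun = half_cauchy) +
  labs(x = expression(sigma), y = "Prior density",
       title = "HalfCauchy(0, 40) prior for the standard deviation")
```

Note how slowly the density decays: even values of \(\sigma\) near 300 retain visible prior mass, which is what makes this a forgiving default.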
Code to Fit Model
```{r}
fit <- brm(
  data = baseline, family = gaussian(),
  Reaction ~ 1, # only estimating a mean (Intercept) and variance
  prior = c(prior(normal(300, 50), class = Intercept), # prior for mu has mean 300 and sd 50
            prior(cauchy(0, 40), class = sigma)) # prior for sigma is half of a center 0, scale 40 Cauchy
)
```
A trace plot depicts the MCMC samples by MCMC iteration number. If your MCMC algorithm has converged, we expect to see trace plots that reflect white noise around the target value. Indeed, that appears to be the case here.
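With a brms fit, trace plots can be produced with `plot(fit)` or `mcmc_plot(fit, type = "trace")`. As a standalone sketch (using simulated draws, not the actual fit), a healthy trace looks like white noise around the target:

```r
# Simulated draws mimicking a converged chain for the intercept
set.seed(7)
chain <- 266 + rnorm(1000, sd = 7.4)  # values echo the intercept summary below
plot(chain, type = "l", xlab = "Iteration", ylab = expression(mu),
     main = "Healthy trace: white noise around the target")
abline(h = 266, col = "red")
```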
Bad Trace Plots
Chain A has not yet converged to the right value (see the black line in the density plot), while Chain B is getting stuck and moving slowly around the sample space. In these cases, think carefully about your model to make sure it’s appropriate; if it is, try running the sampler for a larger number of iterations.
Comparing Sampled Values Across Chains
We also can check the \(\hat{R}\) values. These compare the sampled values across chains (we hope all the chains are giving similar results) by taking the square root of the ratio of the variability in the parameter of interest across all chains combined to the typical variability within a given chain. When \(\hat{R} \approx 1\), we have stability across the chains. While there is no universal rule, often \(\hat{R}>1.05\) can be a cause for concern.
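To make the between/within comparison concrete, here is a simplified illustration of the calculation (the basic version, not the folded, rank-normalized split-\(\hat{R}\) that brms actually reports); `rhat_basic` is our own helper name:

```r
# Basic (non-rank-normalized) R-hat for a matrix of draws, one column per chain
rhat_basic <- function(draws) {
  n <- nrow(draws)                       # iterations per chain
  chain_means <- colMeans(draws)
  B <- n * var(chain_means)              # between-chain variance
  W <- mean(apply(draws, 2, var))        # within-chain variance
  var_plus <- (n - 1) / n * W + B / n    # pooled variance across all chains
  sqrt(var_plus / W)
}

set.seed(123)
good_chains <- matrix(rnorm(4000), nrow = 1000, ncol = 4)  # 4 stationary chains
rhat_basic(good_chains)                  # close to 1

bad_chains <- sweep(good_chains, 2, c(0, 1, 2, 3), "+")    # chains with different means
rhat_basic(bad_chains)                   # well above 1
</imports-placeholder>
```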
Checking \(\hat{R}\)
```{r}
fit # written summary of the output
```
Family: gaussian
Links: mu = identity; sigma = identity
Formula: Reaction ~ 1
Data: baseline (Number of observations: 18)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 266.35 7.44 251.95 281.08 1.00 2433 2110
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 31.17 5.63 22.39 44.09 1.00 2116 2069
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Effective Sample Size
Markov chain simulations in general produce samples that are correlated, rather than independent. The degree of autocorrelation in samples is related to the speed of convergence of the chain. The effective sample size (ESS) is the number of independent samples you would need to get a similarly accurate posterior approximation. An effective sample size much smaller than the actual number of samples in a chain is an indication that our sampling is not very efficient. In the prior output, we see values labeled Bulk_ESS and Tail_ESS.
Effective Sample Size
The bulk ESS provides information about the sampling efficiency in the bulk of the distribution and is thus most relevant for providing information about efficiency of mean or median estimates.
Tail ESS is the minimum of the ESS values for the 5% and 95% quantiles and is most relevant for the sampling efficiency of tail and variance estimates.
There are no hard and fast rules, but one recommendation is that both values should be at least 100 per Markov chain.
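To see how autocorrelation shrinks the effective sample size, here is a standalone illustration using a simulated AR(1) chain (the cutoff rule for summing autocorrelations is a crude simplification of what the posterior package actually does):

```r
# An AR(1) chain with lag-1 correlation phi has ESS ≈ N * (1 - phi) / (1 + phi)
set.seed(42)
N <- 10000; phi <- 0.9
x <- numeric(N)
for (t in 2:N) x[t] <- phi * x[t - 1] + rnorm(1)

# Estimate ESS from the empirical autocorrelations: N / (1 + 2 * sum of rho_k)
rho <- acf(x, lag.max = 100, plot = FALSE)$acf[-1]
ess <- N / (1 + 2 * sum(rho[rho > 0.05]))  # crude cutoff for noisy small lags
c(theoretical = N * (1 - phi) / (1 + phi), estimated = ess)
```

With \(\phi = 0.9\), 10,000 correlated draws carry roughly the information of only a few hundred independent ones.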
Generally, we evaluate the quality of our samples based on all of these criteria combined.
Evaluating Our Hypothesis
Examining the posterior distribution for \(\mu\), we see that most of its mass is below the value 275.
Evaluating Our Hypothesis
We can easily calculate the posterior probability that \(\mu<275\) by taking the fraction of posterior samples that are less than 275.
```{r}
mean(as_draws_df(fit)$b_Intercept < 275)
```
[1] 0.878
So the posterior probability that \(\mu<275\) is 0.878.
Evaluating Our Hypothesis
We can also use the 95% credible interval for \(\mu\), given by (251.95,281.08) and note that 275 is inside this credible interval.
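If you want to pull that interval directly from the posterior draws (rather than reading it off the printed summary), one way, reusing the `as_draws_df()` approach from the previous slide, is:

```r
# Quantile-based 95% credible interval for mu from the posterior draws
draws <- as_draws_df(fit)
quantile(draws$b_Intercept, probs = c(0.025, 0.975))
```

Changing `probs` gives intervals at other levels, e.g. `c(0.05, 0.95)` for a 90% interval.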
You Try It!
The priors specified are not very informative. Try making the priors more informative and running the code on your own to see how the results are affected!
For example, you can

- make the variance on the normal prior much smaller
- make the scale on the half Cauchy prior much smaller
Try changing just one aspect of the prior at a time to evaluate robustness.
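For instance, to tighten only the prior on the mean while leaving the prior on \(\sigma\) unchanged (the value 10 below is an illustrative choice, not a recommendation):

```r
# Refit with a much more informative prior on mu; sigma prior unchanged
fit_tight <- brm(
  data = baseline, family = gaussian(),
  Reaction ~ 1,
  prior = c(prior(normal(300, 10), class = Intercept), # sd 50 -> 10
            prior(cauchy(0, 40), class = sigma))
)
fit_tight # compare the intercept summary to the original fit
```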
Changes in Reaction Time with Sleep Deprivation
Now that we’ve explored baseline reaction time, let’s see what happens to the reaction times of study participants after one night of sleep deprivation (only 3 hours of time in bed). First, we will do a little data wrangling and make a plot of the change in reaction time, plotting the sleep-deprived reaction time minus the baseline reaction time for each subject.
```{r}
sleepdep <- sleepstudy %>%
  filter(Days %in% c(2, 3)) %>%
  pivot_wider(names_from = Days, names_prefix = "Day",
              values_from = Reaction, id_cols = Subject) %>%
  mutate(diff32 = Day3 - Day2)
# consider day 2 baseline and day 3, the first value from a sleep-deprived state,
# and calculate the difference
# positive differences indicate longer reaction times when sleep-deprived
sleepdep %>%
  ggplot(aes(x = diff32)) +
  geom_histogram() +
  labs(x = "Difference in Reaction Time (ms) when Sleep Deprived") +
  geom_vline(xintercept = 0, col = "red")
```
You Try It!
Using the previous slides as a guide, evaluate the hypothesis that sleep deprivation increases reaction time.
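One possible starting point, mirroring the baseline analysis (the prior choices here are illustrative, not prescribed by the exercise):

```r
# Model the mean difference in reaction time; positive values favor the hypothesis
fit_diff <- brm(
  data = sleepdep, family = gaussian(),
  diff32 ~ 1,
  prior = c(prior(normal(0, 50), class = Intercept),
            prior(cauchy(0, 40), class = sigma))
)

# Posterior probability that the mean difference is positive
mean(as_draws_df(fit_diff)$b_Intercept > 0)
```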