# 20 Confidence and other intervals

Some quick definitions to begin. Let’s say we have made an estimate from a model. To keep things simple, it could just be the sample mean.

A *confidence interval* describes the precision of our estimate: if we
replicated the study many times, 95% of the intervals constructed this way
would contain the 'true' value.

A *prediction interval* is the range within which we expect 95% of new
observations to fall. If we're considering the prediction interval for a
specific point prediction (i.e. where we set predictors to specific values),
then this interval would be for new observations *with the same predictor
values*.

A Bayesian *credible interval* is the range of values within which we are 95%
sure the true value lies, based on our prior knowledge and the data we have
collected.
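To make the first two definitions concrete, here is a minimal sketch in R using `lm()` and `predict()`; the data are simulated purely for illustration:

```
# Simulated data, purely for illustration
set.seed(42)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)
m <- lm(y ~ x)

# Both intervals for a prediction at x = 0
new <- data.frame(x = 0)
conf_int <- predict(m, new, interval = "confidence")  # uncertainty about the mean at x = 0
pred_int <- predict(m, new, interval = "prediction")  # uncertainty about a *new observation* at x = 0
conf_int
pred_int
```

The prediction interval is always the wider of the two, because it includes the residual variability of individual observations as well as the uncertainty in the estimated mean.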

### The problem with confidence intervals

Confidence intervals are helpful when we want to think about how *precise our
estimate* is. For example, in an RCT we will want to estimate the difference
between treatment groups, and it's conceivable we would want to know the range
within which the true effect would fall 95% of the time if we replicated our
study many times (although in reality, this isn't a question many people would
actually ask).

If we run a study with small N, intuitively we know that we have less information about the difference between our RCT treatments, and so we’d like the CI to expand accordingly.

So, all things being equal, the confidence interval narrows as we collect more data.
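We can sketch this directly (the mean and SD here are arbitrary, chosen only for illustration): compute the width of the 95% CI for a sample mean at two different sample sizes.

```
# Width of the 95% CI for a sample mean at a given N
# (mean and sd are arbitrary, for illustration only)
ci_width <- function(n, mu = 100, sigma = 15) {
  set.seed(1)
  y <- rnorm(n, mu, sigma)
  se <- sd(y) / sqrt(n)
  2 * qt(0.975, df = n - 1) * se
}

ci_width(25)    # wide
ci_width(2500)  # roughly 10x narrower: 100x the data buys sqrt(100) = 10x the precision
```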

The problem with confidence intervals comes about because many researchers and clinicians read them incorrectly. Typically, they either:

- Forget that the CI represents only the *precision of the estimate*. The CI *doesn't* reflect how good our predictions for new observations will be.
- Misinterpret the CI as the range in which we are 95% sure the true value lies.

### Forgetting that the CI depends on sample size

By forgetting that the CI contracts as the sample size increases, researchers can become overconfident about their ability to predict new observations. Imagine that we sample data from two populations with the same mean, but different variability:

```
library(tidyverse)

set.seed(1234)
df <- expand.grid(v = c(1, 3, 3, 3), i = 1:1000) %>%
  as_tibble() %>%
  mutate(y = rnorm(n(), 100, v)) %>%
  mutate(samp = factor(v, labels = c("Low variability", "High variability")))
```

```
df %>%
  ggplot(aes(y)) +
  geom_histogram(bins = 30) +
  facet_grid(~samp)
```

- If we sample 100 individuals from each population, the confidence interval around the sample mean would be wider in the high variability group.
- If we increase our sample size, we would become more confident about the location of the mean, and this confidence interval would shrink.
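As a sketch of the first point (simulating the sampling directly, with SDs of 1 and 3 matching the two populations above):

```
# 95% CI around the sample mean for N = 100 from each population
set.seed(1234)
low  <- rnorm(100, mean = 100, sd = 1)
high <- rnorm(100, mean = 100, sd = 3)

diff(t.test(low)$conf.int)   # narrow
diff(t.test(high)$conf.int)  # roughly 3x wider
```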

But imagine taking a single *new sample* from either population. These samples
would be new data points, which we could place on the histograms above. It does
not matter how much extra data we have collected in the high variability group,
or how sure we are about the mean of that group: *we would always be less
certain making predictions for new observations in the high variability group*.

The important insight here is that *if our data are noisy and highly variable we
can never make firm predictions for new individuals, even if we collect so much
data that we are very certain about the location of the mean*.
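This insight can be sketched with the standard normal-approximation formulas (using sd = 3, matching the high variability group above): as N grows, the CI width shrinks towards zero, but the prediction interval width plateaus at roughly 2 × 1.96 × sd.

```
# Approximate 95% interval widths for a normal mean, using z = 1.96
interval_widths <- function(n, sigma = 3) {
  se <- sigma / sqrt(n)
  c(ci = 2 * 1.96 * se,                    # shrinks towards 0 as n grows
    pi = 2 * 1.96 * sqrt(sigma^2 + se^2))  # plateaus near 2 * 1.96 * sigma
}

interval_widths(100)
interval_widths(10000)  # the CI is now tiny; the PI has barely changed
```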