What are Confidence Intervals?
I don’t find the concept of confidence intervals to be very straightforward and I’ll admit it took me a while to get my head wrapped around them. To try and explain them, let’s look at the following process.
- Sample a population.
- Calculate your statistic from the sample. For the sake of simplicity, we can just go with the sample mean.
- Calculate a confidence interval around the sample statistic. I realize I haven’t explained what a confidence interval is, but bear with me. For now just know it is a range of values.
- Check whether the population parameter (assuming it is known) is inside the confidence interval.
- Repeat a large number of times (let’s call that N for now).
Once this experiment is done, count the samples where the parameter fell inside the interval (step 4) and divide by the total number of repetitions (N). This proportion is your confidence coefficient. So what does that actually mean? When someone says "95% confidence interval", the 95% is the confidence coefficient. It means the following:
- If I take a sample from a population and calculate the statistic and a confidence interval around it, there is a 95% chance that the interval contains the population parameter. The 95% describes the procedure over repeated sampling: about 95% of intervals built this way capture the parameter.
So together, the statistic and the confidence interval answer the following questions:
- What is the estimated value of the population parameter? (i.e. the statistic)
- How precisely did this sample estimate the parameter? (the confidence interval surrounding the statistic)
The first question is pretty straightforward, but the second could use a bit of explanation. Let’s start with how a confidence interval is calculated.
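\(\bar{x} \pm t_{95} \cdot se_{\bar{x}}\)
- \(\bar{x}\) - sample mean
- \(t_{95}\) - critical t-value for a 95% interval, i.e. the t-value at α/2 with n - 1 degrees of freedom
- \(se_{\bar{x}}\) - standard error of the mean
Here α is the significance level, not the critical value; it is 0.05 for a 95% confidence interval. The critical value is the t-value at α/2, which we multiply by the standard error. The standard error is the sample standard deviation s divided by the square root of the sample size n: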
\(se_{\bar{x}} = {s \over{\sqrt{n}}}\)
Here the t-value is not larger than the critical value, so the null hypothesis is not rejected. We can think of this as a lack of evidence that the sample statistic is different from the population parameter. All of the above can be carried out using R’s built-in t.test function.
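As a rough illustration of the formula above, here is a minimal sketch in Python using only the standard library (the post itself relies on R’s t.test). The population (normal, mean 50, standard deviation 10), the sample size of 30, the seed, and the function name `t_confidence_interval` are all assumptions made for the example. The critical value 2.045 is the t-table value for a two-sided 95% interval with 29 degrees of freedom, hard-coded because the standard library has no t-distribution.

```python
import math
import random
import statistics

# Assumption: two-sided 95% critical t-value for n - 1 = 29 degrees of
# freedom, copied from a standard t-table.
T_CRIT_DF29 = 2.045


def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) for mean +/- t_crit * standard error."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    se = s / math.sqrt(n)          # standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin


random.seed(1)  # arbitrary seed so the sketch is repeatable
# Hypothetical population: normal with mean 50 and standard deviation 10.
sample = [random.gauss(50, 10) for _ in range(30)]
lower, upper = t_confidence_interval(sample, T_CRIT_DF29)
print(f"mean = {statistics.mean(sample):.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

For a different sample size the critical value changes, so `T_CRIT_DF29` would need to be replaced with the table value for the matching degrees of freedom.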
In my initial explanation of CIs I said that if we repeated the sampling over and over again, the population mean (parameter) would fall inside 95% of the CIs we calculate. So let’s try this out.
We see the results in the image above. Only 5 of the trials have CIs where the population parameter (mean) was not within the bounds. Note: if I were to run this trial of 100 samples again, I might get a different number of intervals that miss the population mean, but on average it should be only about 5% of samples.
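The repeated-sampling experiment can be sketched in a few lines of Python. This is not the code behind the image, just an illustration under assumed values: a known normal population (mean 50, standard deviation 10), samples of 30 observations, 1000 repetitions, and the t-table critical value 2.045 for 29 degrees of freedom. The helper name `interval_covers` is made up for the example.

```python
import math
import random
import statistics

T_CRIT_DF29 = 2.045             # assumed t-table value: two-sided 95%, 29 df
POP_MEAN, POP_SD = 50.0, 10.0   # hypothetical known population
N_OBS, N_TRIALS = 30, 1000      # observations per sample, number of repetitions


def interval_covers(rng):
    """Draw one sample and report whether its 95% CI contains POP_MEAN."""
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N_OBS)]
    mean = statistics.mean(sample)
    margin = T_CRIT_DF29 * statistics.stdev(sample) / math.sqrt(N_OBS)
    return mean - margin <= POP_MEAN <= mean + margin


rng = random.Random(42)  # arbitrary seed for repeatability
hits = sum(interval_covers(rng) for _ in range(N_TRIALS))
coverage = hits / N_TRIALS
print(f"{hits} of {N_TRIALS} intervals contained the population mean ({coverage:.1%})")
```

The printed proportion is the empirical confidence coefficient. It should land near 95%, though the exact count varies with the seed and the number of repetitions.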
Bootstrap Estimation of Confidence Intervals
In addition to the traditional methods of calculating confidence intervals, we can do the same using bootstrapping. Bootstrap estimation of CI is accomplished by:
- Take a sample of the data of size n.
- Calculate the statistic (in this case the mean).
- Using the sample from step 1, draw a new sample of n observations with replacement.
- Calculate the mean of the new sample, then calculate the difference between the new sample’s mean and the mean calculated in step 2.
- Repeat steps 3 and 4 many times, keeping the mean difference from each repetition.
- For all the mean differences calculated in steps 4 and 5, calculate the quantiles for the upper and lower tails. For example, if you want the 95% confidence interval, calculate the 97.5% and 2.5% quantiles (in the case of a two-tailed interval).
- Subtract the quantiles from the mean calculated in step 2: subtracting the upper quantile gives the lower bound, and subtracting the lower quantile gives the upper bound. These bounds form the confidence interval.
We can see that the widths of our bootstrapped confidence intervals are very close to those calculated using the t.test method. For more info on the justification of bootstrapping and why it won’t improve point estimates (estimating the mean), but is good at estimating the distribution of relative variation (the confidence intervals), see the Bootstrap Confidence Intervals MIT course reading.
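The bootstrap steps listed above can be sketched in Python with the standard library. This is an illustration, not the code that produced the results discussed here. The function name `bootstrap_mean_ci`, the 5000 resamples, the simple index-based quantiles, and the simulated sample (normal, mean 50, standard deviation 10, n = 30) are all assumptions.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=5000, level=0.95, rng=None):
    """Bootstrap CI for the mean from quantiles of resampled mean differences."""
    rng = rng or random.Random()
    n = len(sample)
    sample_mean = statistics.mean(sample)                       # step 2
    diffs = []
    for _ in range(n_resamples):                                # step 5
        resample = rng.choices(sample, k=n)                     # step 3: with replacement
        diffs.append(statistics.mean(resample) - sample_mean)   # step 4
    diffs.sort()
    alpha = 1 - level
    # Step 6: simple empirical quantiles picked by index (no interpolation).
    lo_diff = diffs[int((alpha / 2) * (n_resamples - 1))]
    hi_diff = diffs[int((1 - alpha / 2) * (n_resamples - 1))]
    # Step 7: subtracting the upper quantile gives the lower bound.
    return sample_mean - hi_diff, sample_mean - lo_diff


rng = random.Random(7)  # arbitrary seed for repeatability
sample = [rng.gauss(50, 10) for _ in range(30)]   # step 1 (hypothetical data)
lower, upper = bootstrap_mean_ci(sample, rng=rng)
print(f"mean = {statistics.mean(sample):.2f}, bootstrap 95% CI = ({lower:.2f}, {upper:.2f})")
```

More resamples give smoother quantile estimates at the cost of run time. 5000 is an arbitrary choice here.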