Confidence Intervals + CLT

Lab-Lecture

Dr. Elijah Meyer

Duke University
STA 199 - Spring 2023

March 29th, 2023

Checklist

– Clone ae-19

– HW-4 Due this Friday

– HW-6 (Statistics Experience) due April 28th

– Project Draft Report due April 7th

– TEAMMATES Survey - Due tonight

Project Announcements

– You do not have to make changes to your proposal. Incorporate the feedback into your report, then close the Issues afterward

– Merge conflicts are not always a bad thing… and they are a part of group work

Homework Announcements

Yesterday around 3:00 PM…

– Question 4b wording was changed to be clearer

– Typo in Question 7: students should calculate the median race time in seconds (not age) before making the CI for median race time in seconds.

Warm Up

In the last AE, we calculated a 95% confidence interval for the true mean flipper length of penguins. We found it to be (199.514, 202.486) mm.

Below, discuss whether the center or spread of the confidence interval would change in the following situations:

– Go from a 95% to 90% CI

– Increase the sample size from 341 to 1000

– Find that our sample mean was 150 instead of 201
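After discussing, one way to check your reasoning is a short R sketch. It backs out a standard error from the interval above under the assumption that the interval was built as center ± z* × SE with a normal critical value; that assumption, the implied standard deviation s, and the helper normal_ci are illustrative and not taken from the AE.

```r
# Values taken from the slide
lower <- 199.514
upper <- 202.486
n     <- 341

center     <- (lower + upper) / 2   # midpoint of the interval
half_width <- (upper - lower) / 2   # margin of error

# ASSUMPTION: the interval was center +/- qnorm(0.975) * SE
se <- half_width / qnorm(0.975)
s  <- se * sqrt(n)                  # implied sample SD under that assumption

# Hypothetical helper: normal-based CI from summary statistics
normal_ci <- function(xbar, s, n, level = 0.95) {
  z_star <- qnorm(1 - (1 - level) / 2)
  xbar + c(-1, 1) * z_star * s / sqrt(n)
}

ci_95     <- normal_ci(center, s, n, level = 0.95)  # reproduces the slide's interval
ci_90     <- normal_ci(center, s, n, level = 0.90)  # lower confidence level
ci_n1000  <- normal_ci(center, s, 1000)             # larger sample
ci_xbar   <- normal_ci(150, s, n)                   # different sample mean

rbind(ci_95, ci_90, ci_n1000, ci_xbar)
```

Comparing the midpoint and width of each row with the first row shows which changes move the center and which change the spread.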

Last Time

We used the central limit theorem to calculate a 95% confidence interval for the true mean flipper length: (199.514, 202.486) mm.

  • But this is only proper practice if the following conditions are satisfied:

  • Sample size

  • Independence
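For reference, the CLT result that these conditions support is commonly written as shown here. The symbols \(\sigma\) (population standard deviation) and \(n\) (sample size) are not defined on these slides and are introduced only for this statement; the second argument of \(N\) is taken to be a standard deviation, and other courses may write the variance instead.

\[
\bar{x} \;\sim\; N\!\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right) \quad \text{approximately, when the observations are independent and } n \text{ is large}
\]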

Type of Distribution

– In practice, if our sample size is “large”, we can trust the results from a normal distribution

– However, it is proper practice to use a t-distribution when working with a quantitative response variable

What is a t-distribution?

Student’s t-distribution is any member of a family of continuous probability distributions that arise when estimating the mean of a normally distributed population in situations where the sample size is small and the population’s standard deviation is unknown.

Degrees of Freedom

– Degrees of freedom are a function of the sample size that determines “how spread out” the t-distribution is

– For a single mean, they are calculated as \(n - 1\)

– As the degrees of freedom -> \(\infty\)… the t-distribution approaches a normal distribution
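The approach toward the normal distribution as the degrees of freedom grow can be checked numerically. This is a minimal sketch; the particular degrees of freedom in df_values are arbitrary choices for illustration.

```r
# Arbitrary degrees of freedom, from small to large
df_values <- c(2, 5, 10, 30, 100, 1000)

# 97.5th percentile: the critical value used for a 95% interval
t_crit <- qt(0.975, df = df_values)   # t-distribution
z_crit <- qnorm(0.975)                # standard normal

# The gap between the t and normal critical values shrinks as df grows
data.frame(
  df     = df_values,
  t_crit = round(t_crit, 3),
  gap    = round(t_crit - z_crit, 3)
)
```

Smaller degrees of freedom give a larger critical value, which is how the t-distribution accounts for the extra uncertainty of estimating the standard deviation from a small sample.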

In R

– We use the function qt to find quantiles (critical values) of the t-distribution, e.g. qt(0.975, df = n - 1) for a 95% CI
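As a sketch of how qt fits into a confidence interval for a single mean, the code below computes sample mean ± t* × s / sqrt(n) by hand and compares it with R's built-in t.test. The data are simulated, hypothetical values (not the penguins data from the AE), and the variable names are illustrative.

```r
set.seed(199)

# Hypothetical sample: 40 simulated flipper lengths in mm (NOT the AE data)
x <- rnorm(40, mean = 200, sd = 14)

n    <- length(x)
xbar <- mean(x)   # sample mean
s    <- sd(x)     # sample standard deviation

# Critical value for a 95% interval with n - 1 degrees of freedom
t_star <- qt(0.975, df = n - 1)

# Interval: sample mean +/- critical value * standard error
ci <- xbar + c(-1, 1) * t_star * s / sqrt(n)
ci

# Cross-check against R's built-in one-sample t interval
t.test(x, conf.level = 0.95)$conf.int
```

The two intervals should agree, since t.test uses the same t-based formula for a one-sample mean.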

Notation + Definition Check

\(\mu\) = population mean

\(\bar{x}\) = sample mean

ae-19

– Understand what a t-distribution is and motivate its use

– Use the CLT and t-distribution to calculate confidence intervals

– Use R + a t-distribution to calculate a 95% confidence interval for flipper length

– (If time) - calculate confidence intervals for a difference in means
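For the difference-in-means goal, one possible approach is a Welch (unequal variance) t interval; ae-19 may use a different method. The sketch below uses two simulated, hypothetical groups and computes the interval by hand before comparing it with t.test, whose default for two samples is the Welch interval.

```r
set.seed(2023)

# Hypothetical flipper lengths (mm) for two groups; NOT the ae-19 data
group_a <- rnorm(50, mean = 190, sd = 7)
group_b <- rnorm(60, mean = 217, sd = 7)

n_a <- length(group_a)
n_b <- length(group_b)

# Each group's contribution to the variance of the difference in means
v_a <- var(group_a) / n_a
v_b <- var(group_b) / n_b

se_diff <- sqrt(v_a + v_b)   # standard error of the difference

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_a + v_b)^2 / (v_a^2 / (n_a - 1) + v_b^2 / (n_b - 1))

t_star  <- qt(0.975, df = df_welch)
ci_diff <- (mean(group_a) - mean(group_b)) + c(-1, 1) * t_star * se_diff
ci_diff

# Cross-check: t.test defaults to the Welch interval for two samples
t.test(group_a, group_b, conf.level = 0.95)$conf.int
```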