Hypothesis Testing I

Lecture 21

Dr. Elijah Meyer

Duke University
STA 199 - Spring 2023

March 31st, 2023

Checklist

– Clone ae-20

– HW 4 (Due Friday)

– HW 6 (Statistics Experience) due April 28th

– Project Draft Report due April 7th

– HW 5 Released Wednesday

– Good Conversation in Slack! Check it out

Warm Up: Confidence Intervals

– Why do we make confidence intervals?

– What two ways have we learned to make confidence intervals?

Bootstrap

– Simulation techniques

— Needs independence and n > 10

— Pick quantiles off a bootstrap resample distribution
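As a minimal sketch of the bootstrap steps above, the base R code below resamples a small sample with replacement and reads a 95% interval off the resulting distribution. The `flipper` values, the seed, and the 5000 resamples are made up for illustration and are not course data.

```r
# Hypothetical flipper lengths in mm (made-up values, for illustration only).
set.seed(199)
flipper <- c(181, 186, 195, 193, 190, 181, 195, 193, 190, 186,
             180, 182, 191, 198, 185)

# Resample the data with replacement many times, recording the mean each time.
boot_means <- replicate(
  5000,
  mean(sample(flipper, size = length(flipper), replace = TRUE))
)

# 95% percentile interval: the 2.5th and 97.5th quantiles of the bootstrap means.
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))
boot_ci
```

The interval endpoints change slightly with the seed and the number of resamples.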

Theory Based (Normal Distribution)

– Use the normal distribution if sigma is known, if the response variable is categorical, or if the sample size is really large (typically over 100)

— Use qnorm in R to pull quantiles off standard normal distribution

— Summary Statistic +/- Margin of Error

— Summary Statistic +/- zscore * SD(Summary Statistic)

— SD(Summary Statistic) = \(\frac{\sigma}{\sqrt{n}}\)
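The normal-based recipe above can be sketched in a few lines of R. The sample mean, sigma, and sample size below are hypothetical numbers chosen only to make the arithmetic easy to follow.

```r
# Hypothetical inputs (assumptions for illustration, not course data).
x_bar <- 75    # sample mean (the summary statistic)
sigma <- 12    # population standard deviation, assumed known
n     <- 144   # sample size

# Critical value for 95% confidence: the 97.5th percentile of the standard normal.
z_star <- qnorm(0.975)

# SD of the sample mean, then the margin of error.
sd_mean <- sigma / sqrt(n)
margin  <- z_star * sd_mean

# Summary statistic +/- margin of error.
ci_normal <- c(lower = x_bar - margin, upper = x_bar + margin)
ci_normal
```

For a different confidence level, change the probability passed to `qnorm`; for example, 90% confidence would use `qnorm(0.95)`.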

Theory Based (t Distribution)

– Use the t-distribution if sigma is unknown, the response variable is quantitative, and the sample size is small

— Use qt in R to pull quantiles off t-distribution.

— Set degrees of freedom using n-1

— Summary Statistic +/- tscore * SE(Summary Statistic)

— SE(Summary Statistic) = \(\frac{s}{\sqrt{n}}\)
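Here is a small sketch of the t-based interval, assuming a made-up sample of nightly prices. The `prices` vector is hypothetical. Everything else follows the bullets above: `qt` with n-1 degrees of freedom and the standard error s / sqrt(n).

```r
# Hypothetical small sample of nightly prices (made-up values).
prices <- c(52, 75, 61, 90, 48, 66, 83, 70, 58, 95)
n <- length(prices)

# Critical value for 95% confidence from the t-distribution with n - 1 df.
t_star <- qt(0.975, df = n - 1)

# Standard error of the sample mean uses the sample SD, s.
se_mean <- sd(prices) / sqrt(n)

# Summary statistic +/- tscore * SE(summary statistic).
ci_t <- mean(prices) + c(-1, 1) * t_star * se_mean
ci_t
```

As a check, `t.test(prices)$conf.int` should report the same 95% interval.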

Goals for Today

Hypothesis Testing

– Why

– How

— Single mean case

— Lab 9 will extend this to the difference in means case

Hypothesis Testing

– Is our population parameter different from a number?

— Different from another population parameter?

— Much like confidence intervals, we can do this for categorical and quantitative variables

Examples

Before: We want to estimate the true mean flipper length of penguins

Now: We want to test whether the true mean flipper length of penguins is

– > a given number

– < a given number

– \(\neq\) a given number

Null and Alternative

– Null - We assume that our population parameter is equal to our null value (assume nothing is going on)

– Alternative - Our research question (what we are looking for evidence of)

Single Mean Case

Suppose we want to know whether the true mean Airbnb price in NC is larger than 60. Set up the null and alternative hypotheses below.

Hint: We are at the population level. Use population parameters…

Ho:

Ha:

Single Mean Case

Ho: \(\mu\) = 60

Ha: \(\mu\) > 60

The Process

– Set up a null and alternative hypothesis

– Collect data and calculate summary statistic

– See how unlikely the statistic is under the assumption of the null hypothesis

– Make a decision and conclusion in the context of the null and alternative
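The four steps above can be sketched in base R for Ho: \(\mu\) = 60 versus Ha: \(\mu\) > 60. This is one possible simulation-based approach: a bootstrap on data shifted to match the null. It is not necessarily the method used in ae-20. The `price` values are made up, and the cutoff `alpha <- 0.05` is a conventional choice assumed here.

```r
set.seed(20)

# Step 1: hypotheses. Ho: mu = 60, Ha: mu > 60.
mu_0 <- 60

# Step 2: collect data and compute the summary statistic.
# Hypothetical nightly prices (made-up values, not the ae-20 data).
price <- c(45, 72, 88, 60, 95, 54, 110, 67, 79, 58, 101, 64, 83, 70, 92)
obs_mean <- mean(price)

# Step 3: see how unlikely the statistic is if the null were true.
# Shift the sample so its mean equals mu_0, then bootstrap the shifted data.
price_null <- price - obs_mean + mu_0
null_means <- replicate(5000, mean(sample(price_null, replace = TRUE)))

# p-value: share of simulated means at least as large as the observed mean
# (the direction matches Ha: mu > 60).
p_value <- mean(null_means >= obs_mean)

# Step 4: decide by comparing the p-value with alpha (assumed cutoff).
alpha <- 0.05
decision <- if (p_value < alpha) "reject Ho" else "fail to reject Ho"
c(observed_mean = obs_mean, p_value = p_value)
decision
```

For a two-sided alternative, the p-value would instead count simulated means at least as far from `mu_0` as the observed mean, in either direction.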

ae-20

For Lab

– Difference in means

– Type 1 error (\(\alpha\))

– Multiple Testing Correction
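As a preview of why a correction matters, the sketch below applies the Bonferroni adjustment with base R's `p.adjust`. Bonferroni is one common correction, and the lab may use a different one. The p-values and `alpha <- 0.05` are hypothetical.

```r
# Hypothetical p-values from four separate tests (made up for illustration).
p_values <- c(0.004, 0.020, 0.030, 0.400)
alpha <- 0.05  # assumed Type 1 error rate for a single test

# Without a correction, each test is compared with alpha on its own.
naive_reject <- p_values < alpha

# Bonferroni: multiply each p-value by the number of tests (capped at 1),
# which keeps the chance of any false rejection at or below alpha.
p_bonf <- p.adjust(p_values, method = "bonferroni")
bonf_reject <- p_bonf < alpha

data.frame(p_values, naive_reject, p_bonf, bonf_reject)
```

With these made-up numbers, three tests look significant before the correction and only one remains significant afterwards.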

Null and Alternative

– For a difference in means…