Math 216: Statistical Thinking
Key Question: How can we estimate population proportions with quantified uncertainty? Confidence intervals provide the mathematical framework for making reliable inferences about categorical data!
Real-World Applications:
Statistical Framework:
Key Insight: Confidence intervals quantify the precision of our estimates and provide a range of plausible values for the true population proportion!
Core Statistical Properties
Mean of \(\hat{p}\):
Standard Error of \(\hat{p}\):
Key Insight: These properties enable statistical inference from samples to populations!
Core Statistical Principle
Central Limit Theorem for Proportions: For sufficiently large samples, the sampling distribution of \(\hat{p}\) is approximately normal, even though each individual observation is only a success or a failure.
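Mathematical Formulation (approximate, for large \(n\)): \[\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)\] Here the second argument is the standard deviation of \(\hat{p}\) (its standard error), not the variance.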
Sample Size Requirements:
Statistical Significance: This universal principle enables confidence intervals and hypothesis testing for proportions!
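The normal approximation can be checked by simulation. The sketch below draws many samples and compares the simulated mean and spread of \(\hat{p}\) with the theoretical values. The values of `p`, `n`, and `reps` are illustrative assumptions, not figures from the course.

```r
# Simulate the sampling distribution of p-hat.
# p, n, and reps are illustrative values chosen for this sketch.
set.seed(216)
p    <- 0.30
n    <- 200
reps <- 10000

# Each draw is the number of successes in one sample of size n;
# dividing by n turns the count into a sample proportion.
p_hat <- rbinom(reps, size = n, prob = p) / n

sim_mean  <- mean(p_hat)               # should be close to p
sim_sd    <- sd(p_hat)                 # should be close to the standard error
theory_se <- sqrt(p * (1 - p) / n)

# Share of simulated p-hats that fall within 1.96 standard errors of p
coverage <- mean(abs(p_hat - p) <= qnorm(0.975) * theory_se)

round(c(sim_mean = sim_mean, sim_sd = sim_sd,
        theory_se = theory_se, coverage = coverage), 4)
```

If the approximation holds, the coverage should land near 0.95; because \(\hat{p}\) is discrete, it will not match exactly.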
Confidence Interval Formula
General Formula: \[\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]
Components:
Common Confidence Levels:
Key Insight: This formula provides a range of plausible values for the true population proportion!
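The general formula translates directly into a few lines of R. The function name `wald_ci` and the sample values (120 successes in 300 trials) are hypothetical and used only for illustration.

```r
# A minimal sketch of the interval p-hat +/- z * sqrt(p-hat * (1 - p-hat) / n).
# wald_ci is a hypothetical helper name, not a base R function.
wald_ci <- function(x, n, conf_level = 0.95) {
  stopifnot(n > 0, x >= 0, x <= n, conf_level > 0, conf_level < 1)
  p_hat <- x / n
  alpha <- 1 - conf_level
  z     <- qnorm(1 - alpha / 2)            # critical value z_{alpha/2}
  se    <- sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
  c(lower = p_hat - z * se, upper = p_hat + z * se)
}

# Critical values for three commonly used confidence levels
levels   <- c(0.90, 0.95, 0.99)
z_values <- qnorm(1 - (1 - levels) / 2)
round(setNames(z_values, paste0(100 * levels, "%")), 3)

# Illustrative data: 120 successes in 300 trials
ci <- wald_ci(120, 300)
round(ci, 4)
```

Raising the confidence level increases the critical value and therefore widens the interval.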
Statistical Assumptions
Essential Conditions:
Verification Examples:
Example 1: Political Polling
Example 2: Market Research
Context: National election polling with 1200 voters, observed support = 48%
Statistical Analysis:
We're 95% confident the true support for Candidate A is between 45.2% and 50.8%.
Context: Manufacturing defect rate monitoring with 500 products, observed defect rate = 2%
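Quality Control Analysis: We're 95% confident the true defect rate is between 0.8% and 3.2%. If the target defect rate is 1%, the process may need improvement, although the interval still includes 1%, so the data do not rule out a process that meets the target.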
Context: Clinical trial for new treatment with 800 patients, success rate = 65%
Medical Research Analysis:
We're 95% confident the true treatment success rate is between 61.7% and 68.3%. The interval provides evidence for treatment effectiveness.
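The three intervals quoted above can be reproduced from the stated sample sizes and observed proportions. The sketch below uses only those numbers. The `min_count` column is an added check, on the assumption that a minimum expected count of successes and failures is the kind of sample size requirement the course uses.

```r
# Reproduce the 95% intervals from the three contexts above.
examples <- data.frame(
  study = c("Election poll", "Defect monitoring", "Clinical trial"),
  n     = c(1200, 500, 800),
  p_hat = c(0.48, 0.02, 0.65)
)

z <- qnorm(0.975)
examples$se    <- sqrt(examples$p_hat * (1 - examples$p_hat) / examples$n)
examples$lower <- examples$p_hat - z * examples$se
examples$upper <- examples$p_hat + z * examples$se

# Smaller of the observed success and failure counts (assumed check, see lead-in)
examples$min_count <- pmin(examples$n * examples$p_hat,
                           examples$n * (1 - examples$p_hat))

print(transform(examples,
                lower_pct = round(100 * lower, 1),
                upper_pct = round(100 * upper, 1))[,
      c("study", "lower_pct", "upper_pct", "min_count")])
```

The defect-monitoring sample contains only 10 observed defects, so the normal approximation is weakest there.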
Confidence Interval Calculations
Confidence Interval Formula: \[\text{CI} = \hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]
Margin of Error: \[\text{ME} = z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]
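Because \(n\) sits under a square root, the margin of error shrinks slowly as the sample grows. The sketch below shows this for three illustrative sample sizes at \(\hat{p} = 0.5\). The function name `margin_of_error` and the chosen values are assumptions made for the example.

```r
# Margin of error as a function of sample size.
# margin_of_error is a hypothetical helper; p_hat = 0.5 is an illustrative value.
margin_of_error <- function(p_hat, n, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  z * sqrt(p_hat * (1 - p_hat) / n)
}

n_values <- c(100, 400, 1600)
me <- margin_of_error(0.5, n_values)

data.frame(n = n_values, margin_of_error = round(me, 4))
```

Each fourfold increase in \(n\) cuts the margin of error in half.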
Planning Your Study
Key Question: How large a sample do we need to estimate a proportion with desired precision?
Sample Size Formula: \[n = \left(\frac{z_{\alpha/2}}{ME}\right)^2 \cdot p(1-p)\]
Components:
Practical Application: This formula helps researchers plan studies with appropriate sample sizes to achieve desired precision!
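The sample size formula can be wrapped in a small function that always rounds up. The name `sample_size` and the planning values (a 4% margin of error with an anticipated proportion of 0.3) are hypothetical.

```r
# n = (z / ME)^2 * p * (1 - p), rounded up to the next whole unit.
# sample_size is a hypothetical helper; the planning values are illustrative.
sample_size <- function(me, p, conf_level = 0.95) {
  stopifnot(me > 0, p >= 0, p <= 1)
  z <- qnorm(1 - (1 - conf_level) / 2)
  ceiling((z / me)^2 * p * (1 - p))
}

n_needed <- sample_size(me = 0.04, p = 0.3)

# Check: the margin of error actually achieved with n_needed observations
achieved_me <- qnorm(0.975) * sqrt(0.3 * 0.7 / n_needed)

c(n_needed = n_needed, achieved_me = round(achieved_me, 4))
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.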
Practical Exercises Using R
Exercise 1: Basic Confidence Interval Calculation
Sample data: 180 successes out of 400 trials
Calculate 95% CI:
R code: prop.test(180, 400, conf.level = 0.95)$conf.int
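Note that `prop.test` does not use the formula from these slides: by default it returns a Wilson score interval with a continuity correction. For the Exercise 1 data, the sketch below computes the formula-based (Wald) interval by hand and places it next to the two `prop.test` variants so the differences are visible.

```r
# Exercise 1 data: 180 successes out of 400 trials
x <- 180
n <- 400

# Interval from the formula p-hat +/- z * sqrt(p-hat * (1 - p-hat) / n)
p_hat <- x / n
z     <- qnorm(0.975)
se    <- sqrt(p_hat * (1 - p_hat) / n)
wald  <- c(lower = p_hat - z * se, upper = p_hat + z * se)

# prop.test default: Wilson score interval with continuity correction
wilson_cc <- as.numeric(prop.test(x, n, conf.level = 0.95)$conf.int)

# Same test without the continuity correction
wilson <- as.numeric(prop.test(x, n, conf.level = 0.95, correct = FALSE)$conf.int)

round(rbind(wald = wald, wilson_cc = wilson_cc, wilson = wilson), 4)
```

With a sample this large and \(\hat{p}\) near 0.5, the three intervals agree to about two decimal places.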
Exercise 2: Sample Size Determination
R code: ceiling((qnorm(0.975)/0.03)^2 * 0.5 * 0.5)
When p is Unknown
The Conservative Approach: When we have no prior estimate of \(p\), use \(p = 0.5\)
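Why 0.5? The product \(p(1-p)\) reaches its maximum value, 0.25, at \(p = 0.5\), so this choice gives the largest sample size the formula can require for any true \(p\).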
Conservative Sample Size Formula: \[n = \left(\frac{z_{\alpha/2}}{ME}\right)^2 \cdot 0.25\]
Example: For 95% confidence and 3% margin of error: \[n = \left(\frac{1.96}{0.03}\right)^2 \cdot 0.25 = 1067.11 \rightarrow 1068\]
Key Insight: Conservative approach ensures adequate sample size when prior information is unavailable!
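That \(p = 0.5\) is the worst case can be checked numerically by evaluating \(p(1-p)\) over a grid, and the worked example above can be reproduced in the same sketch. The grid spacing is an arbitrary choice for illustration.

```r
# Evaluate p * (1 - p) over a grid of candidate proportions.
p_grid        <- seq(0.05, 0.95, by = 0.05)
variance_term <- p_grid * (1 - p_grid)

p_at_max  <- p_grid[which.max(variance_term)]
max_value <- max(variance_term)

# Conservative sample size for 95% confidence and a 3% margin of error
n_conservative <- ceiling((qnorm(0.975) / 0.03)^2 * 0.25)

c(p_at_max = p_at_max, max_value = max_value, n_conservative = n_conservative)
```

R's `qnorm(0.975)` is slightly smaller than the rounded 1.96 used in the hand calculation, so the unrounded result differs in the second decimal place, but both round up to the same sample size.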
Practical Applications
Example 1: Political Polling
Example 2: Market Research
Key Insight: A prior estimate of \(p\) away from 0.5 lowers \(p(1-p)\) below 0.25, which can reduce the required sample size significantly!
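To see how much a prior estimate can save, the sketch below computes the required sample size for several prior values of \(p\) at 95% confidence and a 3% margin of error. The prior values are hypothetical and are not taken from the polling or market research examples.

```r
# Required n for different prior estimates of p (95% confidence, ME = 0.03).
# The prior values below are hypothetical.
z       <- qnorm(0.975)
me      <- 0.03
p_prior <- c(0.1, 0.2, 0.3, 0.4, 0.5)

n_required <- ceiling((z / me)^2 * p_prior * (1 - p_prior))

# Savings relative to the conservative choice p = 0.5
savings <- 1 - n_required / n_required[p_prior == 0.5]

data.frame(p_prior, n_required, savings_pct = round(100 * savings, 1))
```

The savings are large only when the prior estimate is far from 0.5; a prior of 0.4 barely changes the requirement.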
Practice Problems
Exercise 5: Basic Sample Size Calculation
Exercise 6: Sample Size with Prior Information
Exercise 7: Impact of Confidence Level
Desired margin of error: 5%
Compare sample sizes for:
Use conservative approach
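One way to set up Exercise 7 in R is sketched below. The confidence levels to compare did not survive in the exercise text, so 90%, 95%, and 99% are assumed here.

```r
# Exercise 7 sketch: conservative sample sizes (p = 0.5) for a 5% margin of error.
# The confidence levels are an assumption; substitute the levels from the exercise.
conf_levels <- c(0.90, 0.95, 0.99)
me          <- 0.05

z_values   <- qnorm(1 - (1 - conf_levels) / 2)
n_required <- ceiling((z_values / me)^2 * 0.25)

data.frame(conf_level = conf_levels,
           z = round(z_values, 3),
           n_required)
```

Because \(n\) grows with the square of the critical value, moving from 90% to 99% confidence more than doubles the required sample size.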
Exercise 8: Real-World Planning
Practical Exercises Using R
Exercise 3: Confidence Level Impact
Exercise 4: Real-World Interpretation
Building on Statistical Foundations
Relationship to Sampling Distributions (Day 16-18):
Relationship to Confidence Intervals for Means (Day 19-20):
Statistical Continuity:
Essential Statistical Concepts:
Statistical Guidelines:
Next Topic: Applying these principles to hypothesis testing for proportions