Asymptotics: The Law of Large Numbers and The Central Limit Theorem

September 10, 2014, 3 min read

Summarized from the Coursera lecture Statistical Inference, section 07 Asymptotics. The new slides omit part of the derivations, so it is best to consult the old slides as well.

0. Asymptotics

Asymptotics [æsɪmp’tɒtɪks] simply refers to the properties that hold as $\text{number of trials} \to + \infty$.

1. The Law of Large Numbers

1.1 Definition

There are many variations on the LLN; we are using a particularly lazy version here.

The law of large numbers states that if $X_{1}, \dots, X_{n}$ are iid from a population with mean $μ$ and variance $σ^{2}$, then $\bar{X}$, the sample average of the $n$ observations, converges in probability to $μ$, i.e.

\begin{aligned} \bar{X} & = \frac{1}{n} (X_{1} + \dots + X_{n}) \\ \bar{X} & \overset{p}{\to} μ \quad \text{as } n \to \infty \end{aligned}

Or more generally, the average of the results obtained from a large number of trials (i.e. n, since we get an observation per trial) should be close to the expected value, and will tend to become closer as more trials are performed.

1.2 Simulation

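n <- 10000
## cumsum() returns running totals, e.g. cumsum(c(1,2,3)) = c(1,3,6),
## so dividing by 1:n gives the mean of the first k draws for every k
means <- cumsum(rnorm(n)) / (1:n)
plot(1:n, means, type = "l", lwd = 2, frame = FALSE, ylab = "cumulative means", xlab = "sample size")
abline(h = 0) ## true mean of the standard normal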
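The same experiment can be repeated with coin flips, which is the Bernoulli setting used later in the CLT exercises. This is only a sketch: the seed, the number of flips, and the variable names `flips` and `running_mean` are arbitrary choices for illustration, not taken from the lecture.

```r
set.seed(1)                                  # arbitrary seed, only for reproducibility
n <- 10000
flips <- rbinom(n, size = 1, prob = 0.5)     # fair coin: 1 = heads, 0 = tails
running_mean <- cumsum(flips) / (1:n)        # proportion of heads after each flip
plot(1:n, running_mean, type = "l", lwd = 2, frame = FALSE,
     ylab = "cumulative proportion of heads", xlab = "number of flips")
abline(h = 0.5)                              # true mean p = 0.5
```

The curve should wander early on and then settle close to the horizontal line at 0.5.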

1.3 Consistency and Bias of an estimator

An estimator is consistent if it converges to what you want to estimate as the sample size grows, i.e. $\hat{X} \to X$ as $n \to \infty$
- Consistency is neither necessary nor sufficient for one estimator to be better than another
- The LLN basically states that the sample mean is consistent
- The sample variance and the sample standard deviation are consistent as well
An estimator is unbiased if its expected value is what it is trying to estimate, i.e. $E [\hat{X}] = X$
- The sample mean is unbiased
- The sample variance (with the $n - 1$ denominator) is unbiased
- The sample standard deviation is biased (the proof is more involved; see "Why is sample standard deviation a biased estimator of σ")
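The bias claims above can be checked empirically. The sketch below assumes normal data with a true standard deviation of 2 and a deliberately small sample size of 5 so that the bias of the standard deviation is visible; all names and parameter values are illustrative choices.

```r
set.seed(42)        # arbitrary seed, only for reproducibility
sigma <- 2          # true standard deviation (assumed for this demo)
n <- 5              # small samples make the bias of sd easier to see
reps <- 20000       # number of simulated samples

# each row of the matrix is one sample of size n
samples <- matrix(rnorm(n * reps, mean = 0, sd = sigma), nrow = reps)
vars <- apply(samples, 1, var)   # var() uses the n - 1 denominator
sds <- sqrt(vars)                # sample standard deviations

mean(vars)   # should be close to sigma^2 = 4 (unbiased)
mean(sds)    # should fall below sigma = 2 (biased downward)
```

Averaging over many samples approximates the expected value of each estimator, which is what unbiasedness is a statement about.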

2. The Central Limit Theorem

2.1 Definition

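The CLT says that, for large $n$, the sample mean of iid observations is approximately normally distributed:

\bar{X} \overset{\text{approx}}{\sim} N (μ, \frac{σ^{2}}{n}) \quad \text{for large } n

In other words, the standardized sample mean converges in distribution to a standard normal:

\begin{aligned} \frac{\bar{X} - μ}{σ / \sqrt{n}} & = \frac{\text{Estimate} - \text{Mean of estimate}}{\text{Std. Err. of estimate}} \\ & \overset{d}{\to} N (0, 1) \quad \text{as } n \to \infty \end{aligned}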
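A quick simulation can make the statement concrete. The sketch below assumes a skewed population, Exp(1), whose mean and standard deviation are both 1, and standardizes the means of many samples of size 40; the distribution, sample size, and variable names are illustrative choices.

```r
set.seed(123)           # arbitrary seed, only for reproducibility
n <- 40                 # observations per sample
reps <- 5000            # number of simulated samples
mu <- 1; sigma <- 1     # Exp(rate = 1) has mean 1 and standard deviation 1

xbar <- replicate(reps, mean(rexp(n, rate = 1)))   # one sample mean per replicate
z <- (xbar - mu) / (sigma / sqrt(n))               # (estimate - mean) / std. err.

hist(z, breaks = 40, freq = FALSE,
     main = "Standardized means of Exp(1), n = 40")
curve(dnorm(x), add = TRUE, lwd = 2)               # standard normal density
c(mean = mean(z), sd = sd(z))                      # should be near 0 and 1
```

Even though the individual observations are far from normal, the histogram of `z` should sit close to the standard normal curve.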

2.2 Confidence intervals

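Confidence intervals are used only in frequentist statistics. The corresponding concept in Bayesian statistics is the credible interval.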

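For example, suppose a candidate polls at 55% in an election and the confidence interval at the 0.95 confidence level is (50%, 60%). Intervals constructed this way contain the true support rate in about 95% of repeated polls. Informally, this is often read as a 95% chance that his true support lies between 50% and 60%, and therefore a chance of less than 2.5% that his true support is below one half (assuming the distribution is symmetric).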

Since about 95% of the mass of $N (0, 1)$ lies within 2 (more precisely 1.96) standard deviations of 0, $[\bar{X} - \frac{2 σ}{\sqrt{n}}, \bar{X} + \frac{2 σ}{\sqrt{n}}]$ is called a 95% interval for $μ$.

For more, see Stat Trek: What is a Confidence Interval?
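The "95%" refers to how often the interval-building procedure captures $μ$ over repeated samples, which can be checked by simulation. The sketch below assumes normal data with a known $σ$; the values of `mu`, `sigma`, `n`, and `reps` are arbitrary illustrative choices.

```r
set.seed(2014)          # arbitrary seed, only for reproducibility
mu <- 10; sigma <- 3    # true mean and (known) standard deviation, assumed
n <- 50                 # observations per sample
reps <- 10000           # number of repeated samples

xbar <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))
lower <- xbar - 2 * sigma / sqrt(n)
upper <- xbar + 2 * sigma / sqrt(n)

coverage <- mean(lower <= mu & mu <= upper)   # share of intervals containing mu
coverage                                      # should be roughly 0.95
```

Because the multiplier is 2 rather than 1.96, the long-run coverage is slightly above 95%.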

2.3 Apply CLT to Bernoulli estimators

\begin{aligned} ∵ σ^{2} & = p (1 - p) \\ ∴ \frac{2 σ}{\sqrt{n}} & = 2 \sqrt{\frac{p (1 - p)}{n}} \\ ∵ p (1 - p) & \leq \frac{1}{4} \quad \text{for } 0 \leq p \leq 1 \\ ∴ \frac{2 σ}{\sqrt{n}} & = 2 \sqrt{\frac{p (1 - p)}{n}} \leq 2 \sqrt{\frac{1}{4 n}} = \frac{1}{\sqrt{n}} \end{aligned}

$∴ \bar{X} \pm \frac{1}{\sqrt{n}}$ is a quick, conservative 95% CI estimate for $p$ (since $μ = p$ for a Bernoulli variable)
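To see how conservative the quick interval is, it can be compared with the interval obtained by plugging the observed proportion into $2 \sqrt{p (1 - p) / n}$. The helper `bernoulli_ci` below is a hypothetical function written for this comparison, and substituting the sample proportion for the unknown $p$ is an assumption of the sketch.

```r
# Hypothetical helper: returns both intervals for `successes` out of `n` trials
bernoulli_ci <- function(successes, n) {
  p_hat <- successes / n
  # plug-in interval: estimate sigma by sqrt(p_hat * (1 - p_hat))
  plug_in <- p_hat + c(-1, 1) * 2 * sqrt(p_hat * (1 - p_hat) / n)
  # quick interval: uses the bound p * (1 - p) <= 1/4
  quick <- p_hat + c(-1, 1) / sqrt(n)
  rbind(plug_in = plug_in, quick = quick)
}

res <- bernoulli_ci(30, 100)   # e.g. 30 successes in 100 trials
res
```

The quick interval is never narrower than the plug-in one, and the two coincide when the observed proportion is exactly 0.5.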

Exercise I

What is the probability of getting 45 or fewer heads out of 100 flips of a fair coin? (Use the CLT, not the exact binomial calculation)

$μ = p = 0.5$
$σ^{2} = p (1 - p) = 0.25, \frac{σ}{\sqrt{100}} = 0.05$
$\bar{X} = \frac{45}{100} = 0.45$

By the CLT, $\bar{X}$ is approximately $N (0.5, 0.05^{2})$, so the answer is $P (\bar{X} \leq 0.45)$:

pnorm(0.45, mean=0.5, sd=0.05)
## [1] 0.1586553
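Although the exercise asks for the CLT answer, it is instructive to compare it with the exact binomial probability. The sketch below also includes a continuity-corrected normal approximation, which is an addition not covered in the text above; the variable names are illustrative.

```r
# Plain CLT approximation, as in the exercise
clt_approx <- pnorm(0.45, mean = 0.5, sd = 0.05)

# Exact probability of 45 or fewer heads in 100 fair flips
exact <- pbinom(45, size = 100, prob = 0.5)

# CLT on the count scale with a continuity correction (45 -> 45.5);
# the count has mean 100 * 0.5 = 50 and sd sqrt(100 * 0.25) = 5
clt_corrected <- pnorm(45.5, mean = 50, sd = 5)

c(clt = clt_approx, clt_corrected = clt_corrected, exact = exact)
```

The plain approximation underestimates the exact value somewhat, while the continuity-corrected version lands much closer to it.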

Exercise II

Your campaign advisor told you that in a random sample of 100 likely voters, 56 intend to vote for you. Can you relax? Do you have this race in the bag?

$\bar{X} = \frac{56}{100} = 0.56$
$\frac{1}{\sqrt{100}} = 0.1$
An approximate 95% interval for $p$ is $[0.46, 0.66]$.
The interval includes values below 0.5, so this is not enough for you to relax. Better go do more campaigning!
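The same conclusion can be checked in R. The sketch below computes the quick interval, a plug-in interval based on `qnorm(0.975)`, and the exact interval returned by `binom.test`; the last two are additions for comparison and the variable names are illustrative.

```r
n <- 100
p_hat <- 56 / n

# Quick interval from the 1 / sqrt(n) bound
quick <- p_hat + c(-1, 1) / sqrt(n)

# Plug-in interval: 1.96 standard errors, with p estimated by p_hat
plug_in <- p_hat + c(-1, 1) * qnorm(0.975) * sqrt(p_hat * (1 - p_hat) / n)

# Exact binomial interval
exact <- as.numeric(binom.test(56, n)$conf.int)

rbind(quick = quick, plug_in = plug_in, exact = exact)
```

All three lower bounds fall below 0.5, so none of the intervals rules out losing the race.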

2.4 Calculate Poisson interval with R

A nuclear pump failed 5 times in 94.32 days. Give a 95% confidence interval for the failure rate per day (i.e. $λ$).

x <- 5 ## observed number of failures
poisson.test(x, T = 94.32)$conf.int
## [1] 0.01721 0.12371
## attr(,"conf.level")
## [1] 0.95
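For comparison, a CLT-based interval can be computed by hand. The sketch below assumes the estimate $\hat{λ} = x / t$ with estimated variance $\hat{λ} / t$, which follows from the Poisson variance equalling its mean; the names `t_days`, `lambda_hat`, `clt_ci`, and `exact_ci` are illustrative.

```r
x <- 5                    # observed number of failures
t_days <- 94.32           # monitoring time in days
lambda_hat <- x / t_days  # estimated failure rate per day

# CLT-based interval: estimate +/- 1.96 standard errors,
# where Var(lambda_hat) is estimated by lambda_hat / t_days
clt_ci <- lambda_hat + c(-1, 1) * qnorm(0.975) * sqrt(lambda_hat / t_days)

# Exact interval from poisson.test, for reference
exact_ci <- as.numeric(poisson.test(x, T = t_days)$conf.int)

round(rbind(clt = clt_ci, exact = exact_ci), 3)
```

With only 5 events the two intervals differ noticeably, which suggests the normal approximation is rough at this sample size.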
