Probability

September 7, 2014, 4 min read

Summarized from the Coursera lecture Statistical Inference, section 02: Probability.

1. Notation

| Symbol | Definition | Example |
| --- | --- | --- |
| $Ω$ | The sample space, the collection of possible outcomes of an experiment | Die roll, $Ω = \{1, 2, 3, 4, 5, 6\}$ |
| $E$ | An event, a subset of the sample space | Die roll is even, $E = \{2, 4, 6\}$ |
| $ω$ | An elementary or simple event, a particular result of an experiment | Die roll is a four, $ω = 4$ |
| $ϕ$ | The null event or the empty set | |

2. Interpretation of set operations

1. $ω \in E$ implies that $E$ occurs when $ω$ occurs
2. $ω \notin E$ implies that $E$ does not occur when $ω$ occurs
3. $E \subset F$ implies that the occurrence of $E$ implies the occurrence of $F$
4. $E \cap F$ is the event that both $E$ and $F$ occur
5. $E \cup F$ is the event that at least one of $E$ or $F$ occurs
6. $E \cap F = ϕ$ means that $E$ and $F$ are mutually exclusive, i.e. they cannot both occur
7. $E^{c}$ or $\bar{E}$ is the event that $E$ does not occur
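The set operations above map directly onto base R's set functions. Here is a minimal sketch using the die roll from the notation table; the event `F_event` ("the roll is at least four") is a made-up example, not something from the lecture.

```r
# Sample space for one roll of a six-sided die
omega <- 1:6

E <- c(2, 4, 6)         # the roll is even
F_event <- c(4, 5, 6)   # hypothetical second event: the roll is at least four
w <- 4                  # an elementary outcome

w_in_E   <- w %in% E               # does E occur when w occurs?
both     <- intersect(E, F_event)  # outcomes where both E and F occur
either   <- union(E, F_event)      # outcomes where at least one of E, F occurs
E_c      <- setdiff(omega, E)      # complement of E
disjoint <- length(intersect(E, E_c)) == 0  # E and its complement are mutually exclusive

print(list(w_in_E = w_in_E, both = both, either = either, E_c = E_c, disjoint = disjoint))
```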

3. Probability

A probability measure, $P$, is a function defined on the collection of possible events such that the following hold:

1. For an event $E \subset Ω$, $0 \leq P (E) \leq 1$
2. $P (Ω) = 1$
3. If $E_{1}$ and $E_{2}$ are mutually exclusive events, $P (E_{1} \cup E_{2}) = P (E_{1}) + P (E_{2})$.

Part 3 of the definition implies finite additivity: $P (⋃_{i = 1}^{n} A_{i}) = \sum_{i = 1}^{n} P (A_{i})$, where the $A_{i}$ are mutually exclusive.

4. Consequences

$P (ϕ) = 0$
$P (E) = 1 - P (E^{c})$
If $A \subset B$ , then $P (A) \leq P (B)$
$P (A \cup B) = P (A) + P (B) - P (A \cap B) = 1 - P (A^{c} \cap B^{c})$
$P (A \cap B^{c}) = P (A) - P (A \cap B)$
$P (⋃_{i = 1}^{n} E_{i}) \leq \sum_{i = 1}^{n} P (E_{i})$
$P (⋃_{i = 1}^{n} E_{i}) \geq \max_{i} P (E_{i})$
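These identities are easy to sanity-check numerically. The sketch below assumes a fair six-sided die, so that $P (E)$ is just the fraction of outcomes in $E$; the events `A` and `B` are arbitrary choices for illustration.

```r
omega <- 1:6

# Assumption: a fair die, so every outcome has probability 1/6
P <- function(event) length(event) / length(omega)

A <- c(1, 2, 3)
B <- c(2, 4, 6)
A_c <- setdiff(omega, A)
B_c <- setdiff(omega, B)

# P(A u B) = P(A) + P(B) - P(A n B) = 1 - P(A^c n B^c)
p_union   <- P(union(A, B))
incl_excl <- P(A) + P(B) - P(intersect(A, B))
de_morgan <- 1 - P(intersect(A_c, B_c))

# P(A n B^c) = P(A) - P(A n B)
p_a_not_b <- P(intersect(A, B_c))
p_diff    <- P(A) - P(intersect(A, B))

# Bounds on the probability of a union
upper_ok <- p_union <= P(A) + P(B)
lower_ok <- p_union >= max(P(A), P(B))

print(c(p_union = p_union, incl_excl = incl_excl, de_morgan = de_morgan))
```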

5. Random variables

A random variable is a numerical outcome of an experiment. Random variables come in two varieties:
- Discrete
- Continuous

A discrete random variable takes on only a countable number of possibilities.
- Probabilities take the form $P (X = k)$
- Described by a PMF

A continuous random variable can take any value on the real line or some subset of the real line.
- Probabilities take the form $P (X \in A)$
- Described by a PDF and a CDF

6. PMF: Probability Mass Function

A PMF evaluated at a value corresponds to the probability that a random variable takes that value (in other words, $p (x) = P (X = x)$).

To be a valid PMF, $p$ must satisfy:

1. $p (x) \geq 0$ for all $x$
2. $\sum_{x} p (x) = 1$ (the sum is taken over all of the possible values for $x$)

E.g., let $X$ be the result of a coin flip where $X = 0$ represents tails and $X = 1$ represents heads. Suppose that we do not know whether or not the coin is fair. Let $θ$ be the probability of a head expressed as a proportion (between 0 and 1). Then we get:

$$p (x) = θ^{x} (1 - θ)^{1 - x}, \quad \text{for } x = 0, 1$$

In short, a PMF is the probability distribution of a discrete random variable.
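To make the coin-flip PMF concrete, here is a small sketch that codes the formula directly and checks both validity conditions. The value `theta = 0.3` is an arbitrary choice, and `bernoulli_pmf` is a helper name made up for this example; the comparison against `dbinom` with `size = 1` is just a cross-check with R's built-in binomial PMF.

```r
# PMF of a single coin flip with P(heads) = theta
bernoulli_pmf <- function(x, theta) theta^x * (1 - theta)^(1 - x)

theta <- 0.3                          # hypothetical value, any number in [0, 1] works
p <- bernoulli_pmf(c(0, 1), theta)    # p(0) for tails, p(1) for heads

nonnegative <- all(p >= 0)            # condition 1
total       <- sum(p)                 # condition 2: should be 1

# Cross-check against the built-in binomial PMF with a single trial
builtin <- dbinom(c(0, 1), size = 1, prob = theta)

print(rbind(formula = p, builtin = builtin))
```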

7. PDF: Probability Density Function

A PDF is a function associated with a continuous random variable.

Areas under the PDF correspond to probabilities for that random variable.

To be a valid PDF, $f$ must satisfy:

1. $f (x) \geq 0$ for all $x$
2. The area under $f (x)$ is 1

Concretely: $P [a \leq X \leq b] = \int_{a}^{b} f (x) \, d x$
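The "area under the PDF" idea can be checked with numerical integration. This sketch uses the standard normal density `dnorm` purely as a convenient example PDF (the lecture does not single it out here), and the endpoints `a` and `b` are arbitrary.

```r
# Example PDF: the standard normal density (an arbitrary choice for illustration)
f <- dnorm

# Condition 2: the total area under f should be 1
total_area <- integrate(f, lower = -Inf, upper = Inf)$value

# P[a <= X <= b] as the area under f between a and b
a <- -1
b <- 2
area_ab <- integrate(f, lower = a, upper = b)$value

# The same probability from the built-in normal CDF
cdf_ab <- pnorm(b) - pnorm(a)

print(c(total_area = total_area, area_ab = area_ab, cdf_ab = cdf_ab))
```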

8. CDF: Cumulative Distribution Function & SF: Survival Function

The CDF of a random variable $X$ is defined as the function:

$$F (x) = P (X \leq x)$$

This definition applies regardless of whether $X$ is discrete or continuous.

The SF of a random variable $X$ is defined as:

$$S (x) = P (X > x)$$

Notice that $S (x) = 1 - F (x)$

For continuous random variables, the PDF is the derivative of the CDF, i.e. $f (x) = F^{'} (x)$
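Both relationships can be checked numerically. The sketch below again uses the standard normal as a stand-in distribution: `pnorm` gives $F (x)$, `pnorm(..., lower.tail = FALSE)` gives $S (x)$, and a central difference approximates $F' (x)$. The evaluation point `x` and step size `h` are arbitrary.

```r
x <- 1.2   # arbitrary evaluation point

F_x <- pnorm(x)                       # F(x) = P(X <= x)
S_x <- pnorm(x, lower.tail = FALSE)   # S(x) = P(X > x)
total <- F_x + S_x                    # should be 1, since S(x) = 1 - F(x)

# Central-difference approximation to F'(x), to compare with the density f(x)
h <- 1e-5
dF <- (pnorm(x + h) - pnorm(x - h)) / (2 * h)

print(c(F = F_x, S = S_x, numerical_derivative = dF, density = dnorm(x)))
```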

9. Example: Beta Distribution

More details are available on the Wikipedia page for the Beta distribution.

If $x \in [0, 1]$ and $X \sim \mathrm{Beta} (α, β)$, we say that $X$ follows a beta distribution governed by $α$ and $β$. Both $α, β > 0$ are real numbers, a.k.a. "shape parameters".

The PDF is defined as $f (x | α, β) = \frac{x^{α - 1} (1 - x)^{β - 1}}{B (α, β)}$

where the constant $B (α, β) = \int_{0}^{1} t^{α - 1} (1 - t)^{β - 1} \, d t$

or with the Gamma function $Γ (n) = \begin{cases} (n - 1)! & \text{if } n \text{ is a positive integer} \\ \int_{0}^{\infty} t^{n - 1} e^{- t} \, d t & \text{if } n \text{ is real or complex with positive real part} \end{cases}$

we can rewrite $B (α, β) = \frac{Γ (α) Γ (β)}{Γ (α + β)}$

If $α = 2$ and $β = 1$, then $B (α, β) = \frac{Γ (2) Γ (1)}{Γ (3)} = \frac{1! \times 0!}{2!} = 0.5$, or equivalently $B (α, β) = \int_{0}^{1} t \, d t = \frac{1}{2} t^{2} \big|_{0}^{1} = 0.5$. Therefore $f (x) = 2 x$ and $F (x) = \int_{0}^{x} 2 t \, d t = x^{2}$ for $x \in [0, 1]$.
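The three routes to $B (2, 1)$ can be compared in R. This is a rough check rather than anything from the lecture: it uses the built-in `beta` and `gamma` functions plus `integrate`, and the name `beta_par` is chosen only to avoid shadowing `beta`.

```r
alpha <- 2
beta_par <- 1   # named beta_par so the built-in beta() is not masked

# Three ways to get the normalizing constant B(alpha, beta)
B_builtin  <- beta(alpha, beta_par)
B_gamma    <- gamma(alpha) * gamma(beta_par) / gamma(alpha + beta_par)
B_integral <- integrate(function(t) t^(alpha - 1) * (1 - t)^(beta_par - 1),
                        lower = 0, upper = 1)$value

# With B = 0.5 the density should reduce to f(x) = 2x on [0, 1]
x <- c(0.1, 0.5, 0.75)
density_builtin <- dbeta(x, alpha, beta_par)
density_formula <- 2 * x

print(c(B_builtin = B_builtin, B_gamma = B_gamma, B_integral = B_integral))
print(rbind(density_builtin, density_formula))
```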

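Running pbeta(0.75, 2, 1) in R returns 0.5625. Following the rule summarized in R Generating Random Numbers and Random Sampling, functions whose names start with p compute the CDF, i.e. $F (x)$, so pbeta(0.75, 2, 1) means: with $α = 2$ and $β = 1$, compute $F (0.75)$. The result is $F (0.75) = 0.5625$, which agrees with the value we get from the area under $f (x)$: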

> x <- c(0, 1, 1, 1.5)
> y <- c(0, 2, 0, 0)
> plot(x, y, lwd = 3, frame = FALSE, type = "l", ylab="f(x)")
> abline(h=1.5, v=0.75)

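The region under $f (x) = 2 x$ to the left of $x = 0.75$ is a triangle with base $0.75$ and height $f (0.75) = 1.5$, so

$$F (0.75) = \frac{1.5 \times 0.75}{2} = 0.5625$$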
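As a final cross-check, the same number can be reached four ways: the built-in CDF, the closed form $F (x) = x^{2}$, numerical integration of $f (x) = 2 x$, and the triangle area. A minimal sketch, with variable names chosen only for this example:

```r
q <- 0.75

from_pbeta    <- pbeta(q, 2, 1)                               # built-in Beta CDF
from_formula  <- q^2                                          # F(x) = x^2
from_integral <- integrate(function(x) 2 * x, 0, q)$value     # area under f(x) = 2x
from_triangle <- (2 * q) * q / 2                              # height * base / 2

print(c(pbeta = from_pbeta, formula = from_formula,
        integral = from_integral, triangle = from_triangle))
```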
