Probability
Summarized from the Coursera lecture Statistical Inference, section 02: Probability.
1. Notation
Symbol | Definition | Example |
---|---|---|
$\Omega$ | The sample space, the collection of possible outcomes of an experiment | Die roll, $\Omega = \{1, 2, 3, 4, 5, 6\}$ |
$E$ | An event, a subset of the sample space | Die roll is even, $E = \{2, 4, 6\}$ |
$\omega$ | An elementary or simple event, a particular result of an experiment | Die roll is a four, $\omega = 4$ |
$\emptyset$ | The null event or the empty set | |
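2. Interpretation of set operations
1. $\omega \in E$ implies that $E$ occurs when $\omega$ occurs
2. $\omega \notin E$ implies that $E$ does not occur when $\omega$ occurs
3. $E \subset F$ implies that the occurrence of $E$ implies the occurrence of $F$
4. $E \cap F$ is the event that both $E$ and $F$ occur
5. $E \cup F$ is the event that at least one of $E$ or $F$ occurs
6. $E \cap F = \emptyset$ means that $E$ and $F$ are mutually exclusive, i.e. they cannot both occur
7. $E^c$ is the event that $E$ does not occur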
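Set operations on events map fairly directly onto base R's `%in%`, `intersect`, `union`, and `setdiff`. The sketch below assumes a single roll of a six-sided die; the names `omega`, `A`, and `B` are illustrative only.

```r
# Sample space for one roll of a six-sided die (an assumed example)
omega <- 1:6
A <- c(2, 4, 6)   # event: the roll is even
B <- c(4, 5, 6)   # event: the roll is at least four

occurs <- 4 %in% A              # TRUE: A occurs when the outcome is 4
both   <- intersect(A, B)       # outcomes where A and B both occur
either <- sort(union(A, B))     # outcomes where at least one occurs
not_A  <- setdiff(omega, A)     # complement of A within omega

# A and its complement are mutually exclusive: their intersection is empty
disjoint <- length(intersect(A, not_A)) == 0
```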
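3. Probability
A probability measure, $P$, is a function from the collection of possible events such that the following hold:
- For an event $E \subset \Omega$, $0 \le P(E) \le 1$
- $P(\Omega) = 1$
- If $E_1$ and $E_2$ are mutually exclusive events, $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.
Part 3 of the definition (the last property above) implies finite additivity: if $A_1, \dots, A_n$ are mutually exclusive, then

$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$$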
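To make the definition concrete, here is a minimal sketch that checks the three properties and finite additivity for a fair six-sided die. The helper `P()` is hypothetical (not from the lecture) and assumes equally likely outcomes.

```r
# Equally likely outcomes on a fair six-sided die (assumed example)
omega <- 1:6

# P() is a hypothetical helper: probability of an event under the uniform measure
P <- function(event) length(intersect(event, omega)) / length(omega)

A <- c(1, 2)   # A and B share no outcomes, so they are mutually exclusive
B <- c(5, 6)

in_range <- P(A) >= 0 && P(A) <= 1                          # property 1
total    <- P(omega)                                        # property 2: equals 1
additive <- isTRUE(all.equal(P(union(A, B)), P(A) + P(B)))  # property 3

# Finite additivity over three pairwise disjoint events
events <- list(1, c(2, 3), c(4, 5, 6))
finite_additive <- isTRUE(all.equal(P(Reduce(union, events)),
                                    sum(sapply(events, P))))
```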
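4. Consequences
- $P(\emptyset) = 0$
- $P(E) = 1 - P(E^c)$
- If $A \subset B$, then $P(A) \le P(B)$
- $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- $P(A \cup B) = 1 - P(A^c \cap B^c)$
- $P(A \cap B^c) = P(A) - P(A \cap B)$
- $P\left(\bigcup_{i=1}^{n} E_i\right) \le \sum_{i=1}^{n} P(E_i)$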
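The standard consequences of the probability axioms can be spot-checked numerically. The sketch below does so for a fair die; `P()` and `comp()` are hypothetical helpers, and a numerical check on one example is of course not a proof.

```r
omega <- 1:6
P    <- function(event) length(intersect(event, omega)) / length(omega)
comp <- function(event) setdiff(omega, event)   # complement within omega

A <- c(2, 4, 6)   # even roll
B <- c(4, 5, 6)   # roll of at least four
same <- function(x, y) isTRUE(all.equal(x, y))  # tolerant numeric comparison

checks <- c(
  null_event = P(integer(0)) == 0,
  complement = same(P(A), 1 - P(comp(A))),
  monotone   = P(c(4, 6)) <= P(A),   # {4, 6} is a subset of A
  incl_excl  = same(P(union(A, B)), P(A) + P(B) - P(intersect(A, B))),
  de_morgan  = same(P(union(A, B)), 1 - P(intersect(comp(A), comp(B)))),
  difference = same(P(setdiff(A, B)), P(A) - P(intersect(A, B))),
  boole      = P(union(A, B)) <= P(A) + P(B)
)
print(checks)
```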
5. Random variables
- A random variable is a numerical outcome of an experiment.
- 2 varieties
  - Discrete
  - Continuous
- Discrete random variables take on only a countable number of possibilities.
  - Probabilities take the form $P(X = k)$
  - Described by a PMF
- Continuous random variables can take any value on the real line or some subset of the real line.
  - Probabilities take the form $P(X \in A)$
  - Described by a PDF and CDF
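As a small illustration of the two varieties, the sketch below draws from one discrete and one continuous distribution in R. The binomial and uniform distributions are assumptions chosen for the example; they are not singled out in the lecture notes above.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Discrete: the number of heads in 10 fair coin flips takes values in 0, 1, ..., 10
heads <- rbinom(5, size = 10, prob = 0.5)

# Continuous: a uniform draw can be any real number in [0, 1]
u <- runif(5)

# Discrete probabilities are point masses: P(X = 3) = choose(10, 3) / 2^10
p_three <- dbinom(3, size = 10, prob = 0.5)

# Continuous probabilities are attached to sets: P(0.2 <= U <= 0.7)
p_interval <- punif(0.7) - punif(0.2)
```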
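6. PMF: Probability Mass Function
A PMF evaluated at a value corresponds to the probability that a random variable takes that value (I find it helpful to read it as $p(x) = P(X = x)$).
To be a valid PMF, a function $p$ must satisfy:
1. $p(x) \ge 0$ for all $x$
2. $\sum_{x} p(x) = 1$, where the sum is taken over all possible values of $x$
E.g., let $X$ be the result of a coin flip, where $X = 0$ represents tails and $X = 1$ represents heads. If the probability of heads is $\theta$, then $p(x) = \theta^{x} (1 - \theta)^{1 - x}$ for $x = 0, 1$.
The PMF is simply the probability distribution law used to describe a discrete random variable.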
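A quick way to see what "valid PMF" means is to check a coin-flip PMF numerically. This is only a sketch: `bernoulli_pmf` is a hypothetical helper, and `theta = 0.3` is an arbitrary value assumed for illustration.

```r
# Hypothetical helper: PMF of a coin flip with P(heads) = theta,
# where x = 1 means heads and x = 0 means tails
bernoulli_pmf <- function(x, theta) theta^x * (1 - theta)^(1 - x)

theta   <- 0.3        # assumed value, for illustration only
support <- c(0, 1)    # all values the random variable can take
p <- bernoulli_pmf(support, theta)

non_negative <- all(p >= 0)   # every PMF value is non-negative
total        <- sum(p)        # the PMF values sum to one

# Agrees with R's built-in binomial PMF with a single trial
matches_builtin <- isTRUE(all.equal(p, dbinom(support, size = 1, prob = theta)))
```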
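7. PDF: Probability Density Function
A PDF is a function associated with a continuous random variable.
Areas under the PDF correspond to probabilities for that random variable.
To be a valid PDF, a function $f$ must satisfy:
1. $f(x) \ge 0$ for all $x$
2. The area under $f(x)$ is 1
In practice this means:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1, \qquad P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$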
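The "area under the curve" reading of a PDF can be checked with base R's `integrate()`. The sketch below uses the standard normal density `dnorm` purely as an assumed example of a valid PDF.

```r
# Standard normal density as an example PDF (an assumption for this sketch)
f <- dnorm

# Non-negativity on a grid of points (a spot check, not a proof)
non_negative <- all(f(seq(-5, 5, by = 0.1)) >= 0)

# Total area under the curve, computed numerically
total <- integrate(f, lower = -Inf, upper = Inf)$value

# P(-1 <= X <= 1) as the area under f between -1 and 1
area <- integrate(f, lower = -1, upper = 1)$value

# The same probability from the built-in CDF, for comparison
from_cdf <- pnorm(1) - pnorm(-1)
```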
8. CDF: Cumulative Distribution Function & SF: Survival Function
The CDF of a random variable $X$ is defined as the function $F(x) = P(X \le x)$.
This definition applies regardless of whether $X$ is discrete or continuous.
The SF of a random variable $X$ is defined as $S(x) = P(X > x)$.
Notice that $S(x) = 1 - F(x)$.
For continuous random variables, the PDF is the derivative of the CDF, i.e. $f(x) = F'(x)$.
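Both relationships can be verified numerically. The sketch below assumes a standard normal random variable and an arbitrary evaluation point; the derivative is approximated with a central finite difference, so the match is only up to numerical error.

```r
x <- 1.2   # arbitrary evaluation point

cdf <- pnorm(x)                       # F(x) = P(X <= x)
sf  <- pnorm(x, lower.tail = FALSE)   # S(x) = P(X > x)
sums_to_one <- isTRUE(all.equal(cdf + sf, 1))   # S(x) = 1 - F(x)

# Central finite difference of the CDF approximates the PDF at x
h <- 1e-5
approx_pdf <- (pnorm(x + h) - pnorm(x - h)) / (2 * h)
derivative_matches <- isTRUE(all.equal(approx_pdf, dnorm(x), tolerance = 1e-6))
```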
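9. Example: Beta Distribution
More details are in the Wikipedia article on the Beta distribution.
If $X \sim \mathrm{Beta}(\alpha, \beta)$, its PDF is defined as

$$f(x; \alpha, \beta) = C\, x^{\alpha - 1} (1 - x)^{\beta - 1}, \quad 0 \le x \le 1$$

where the constant is

$$C = \frac{1}{B(\alpha, \beta)} = \frac{1}{\int_{0}^{1} t^{\alpha - 1} (1 - t)^{\beta - 1}\,dt}$$

or, with the Gamma function

$$\Gamma(z) = \int_{0}^{\infty} t^{z - 1} e^{-t}\,dt$$

we can rewrite

$$f(x; \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1}$$

If $\alpha = 2$ and $\beta = 1$, the PDF reduces to $f(x) = 2x$ on $[0, 1]$.
Running `pbeta(0.75, 2, 1)` in R returns 0.5625. Following the pattern summarized in R Generating Random Numbers and Random Sampling, functions whose names start with p compute the CDF, i.e. `pbeta(0.75, 2, 1)` means: when $X \sim \mathrm{Beta}(2, 1)$, $P(X \le 0.75) = F(0.75) = 0.5625$.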
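> # Density of Beta(2, 1): f(x) = 2x on [0, 1], zero elsewhere
> x <- c(0, 1, 1, 1.5)
> y <- c(0, 2, 0, 0)
> plot(x, y, lwd = 3, frame = FALSE, type = "l", ylab = "f(x)")
> # Mark x = 0.75, where f(0.75) = 1.5; the triangle to the left has area 0.75 * 1.5 / 2 = 0.5625
> abline(h = 1.5, v = 0.75)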
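The value returned by `pbeta(0.75, 2, 1)` can be reproduced from the definitions. The sketch below is one way to do it, assuming base R's `beta()`, `gamma()`, and `integrate()`; the names `shape1` and `shape2` simply mirror `pbeta`'s argument names.

```r
shape1 <- 2   # alpha
shape2 <- 1   # beta

# Normalising constant two ways: base R's beta() and the Gamma-function identity
B_direct <- beta(shape1, shape2)
B_gamma  <- gamma(shape1) * gamma(shape2) / gamma(shape1 + shape2)

# Hand-written density; for Beta(2, 1) it reduces to f(x) = 2x on [0, 1]
f <- function(x) x^(shape1 - 1) * (1 - x)^(shape2 - 1) / B_direct

# CDF at 0.75 by numerical integration of the density
area <- integrate(f, lower = 0, upper = 0.75)$value

# The built-in CDF gives the same probability
builtin <- pbeta(0.75, shape1, shape2)
```

Both `area` and `builtin` come out to 0.5625, the area of the triangle left of the vertical line in the plot.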