Summarized from the Coursera lecture Statistical Inference, section 02: Probability.


1. Notation

| Symbol | Definition | Example |
| --- | --- | --- |
| $\Omega$ | The sample space, the collection of possible outcomes of an experiment | Die roll, $\Omega = \{1, 2, 3, 4, 5, 6\}$ |
| $E$ | An event, a subset of the sample space | Die roll is even, $E = \{2, 4, 6\}$ |
| $\omega$ | An elementary or simple event, a particular result of an experiment | Die roll is a four, $\omega = 4$ |
| $\phi$ | The null event or the empty set | |

2. Interpretation of set operations

  1. $\omega \in E$ implies that $E$ occurs when $\omega$ occurs
  2. $\omega \notin E$ implies that $E$ does not occur when $\omega$ occurs
  3. $E \subset F$ implies that the occurrence of $E$ implies the occurrence of $F$
  4. $E \cap F$ is the event that both $E$ and $F$ occur
  5. $E \cup F$ is the event that at least one of $E$ or $F$ occurs
  6. $E \cap F = \phi$ means that $E$ and $F$ are mutually exclusive, or cannot both occur
  7. $E^c$ or $\bar{E}$ is the event that $E$ does not occur
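The set operations above map directly onto base R's set functions. The following is a minimal sketch for the die-roll example; the second event `ev_F` ("the roll is at least four") and all variable names are assumptions introduced for illustration.

```r
# Sample space and events for a single die roll
sample_space <- 1:6
ev_E <- c(2, 4, 6)   # the roll is even (from the notation table)
ev_F <- c(4, 5, 6)   # hypothetical second event: the roll is at least four
omega <- 4           # the elementary event "the roll is a four"

# omega in E: E occurs when omega occurs
is_in_E <- omega %in% ev_E

# Intersection: both E and F occur
both <- intersect(ev_E, ev_F)

# Union: at least one of E or F occurs
either <- union(ev_E, ev_F)

# Complement: E does not occur
E_comp <- setdiff(sample_space, ev_E)

# E and its complement are mutually exclusive (empty intersection)
mutually_exclusive <- length(intersect(ev_E, E_comp)) == 0

# Subset: {4, 6} is contained in E, so its occurrence implies E
is_subset <- all(c(4, 6) %in% ev_E)
```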

3. Probability

A probability measure, P, is a function from the collection of possible events so that the following hold:

  1. For an event $E \subset \Omega$, $0 \le P(E) \le 1$
  2. $P(\Omega) = 1$
  3. If $E_1$ and $E_2$ are mutually exclusive events, $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.

Part 3 of the definition implies finite additivity: $P(\cup_{i=1}^{n} A_i) = \sum_{i=1}^{n} P(A_i)$, where the $A_i$ are mutually exclusive.
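As a concrete check of the three conditions, here is a small sketch in R. It assumes a fair die, so that the probability of an event is the number of outcomes in it divided by six; the function name `prob` and the events `E1`, `E2`, `E3` are hypothetical.

```r
sample_space <- 1:6

# Assumed probability measure for a fair die: P(E) = |E| / |Omega|
prob <- function(event) {
  length(intersect(event, sample_space)) / length(sample_space)
}

# Three mutually exclusive events
E1 <- c(1, 2)
E2 <- c(3)
E3 <- c(5, 6)

# Condition 1: every probability lies in [0, 1]
in_unit_interval <- prob(E1) >= 0 && prob(E1) <= 1

# Condition 2: the whole sample space has probability 1
total <- prob(sample_space)

# Condition 3 and finite additivity: P(E1 u E2 u E3) = P(E1) + P(E2) + P(E3)
lhs <- prob(union(union(E1, E2), E3))
rhs <- prob(E1) + prob(E2) + prob(E3)
additive <- isTRUE(all.equal(lhs, rhs))
```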

4. Consequences

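  • $P(\phi) = 0$
  • $P(E) = 1 - P(E^c)$
  • If $A \subset B$, then $P(A) \le P(B)$
  • $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 1 - P(A^c \cap B^c)$
  • $P(A \cap B^c) = P(A) - P(A \cap B)$
  • $P(\cup_{i=1}^{n} E_i) \le \sum_{i=1}^{n} P(E_i)$
  • $P(\cup_{i=1}^{n} E_i) \ge \max_i P(E_i)$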
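These identities can be checked numerically on the die example. The sketch below again assumes a fair die; the events `A` and `B` and the helper `prob` are hypothetical names chosen for illustration.

```r
sample_space <- 1:6

# Assumed probability measure for a fair die
prob <- function(event) length(unique(event)) / length(sample_space)

A <- c(1, 2, 3)
B <- c(3, 4)
A_comp <- setdiff(sample_space, A)
B_comp <- setdiff(sample_space, B)

# Complement rule: P(A) = 1 - P(A^c)
complement_ok <- isTRUE(all.equal(prob(A), 1 - prob(A_comp)))

# Inclusion-exclusion and the De Morgan form
p_union <- prob(union(A, B))
incl_excl <- prob(A) + prob(B) - prob(intersect(A, B))
de_morgan <- 1 - prob(intersect(A_comp, B_comp))

# P(A n B^c) = P(A) - P(A n B)
diff_ok <- isTRUE(all.equal(prob(intersect(A, B_comp)),
                            prob(A) - prob(intersect(A, B))))

# Union bound and lower bound by the largest single probability
upper_ok <- p_union <= prob(A) + prob(B)
lower_ok <- p_union >= max(prob(A), prob(B))
```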

5. Random variables

  • A random variable is a numerical outcome of an experiment.
  • 2 varieties
    • Discrete
    • Continuous
  • A discrete random variable takes on only a countable number of possible values.
    • $P(X = k)$
    • described by a PMF
  • A continuous random variable can take any value on the real line or some subset of the real line.
    • $P(X \in A)$
    • described by a PDF and a CDF
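To make the two kinds of probability statement concrete, here is a small sketch. The fair die and the Uniform(0, 1) variable are assumptions chosen for illustration and do not come from the lecture.

```r
# Discrete: a fair die, where P(X = k) = 1/6 for k = 1, ..., 6
die_pmf <- rep(1 / 6, 6)
names(die_pmf) <- 1:6
p_x_equals_4 <- die_pmf[["4"]]

# Continuous: X ~ Uniform(0, 1), and A = [0.2, 0.5]
# P(X in A) is a difference of CDF values
p_x_in_A <- punif(0.5) - punif(0.2)

# For a continuous variable a single point carries no probability
p_single_point <- punif(0.3) - punif(0.3)
```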

6. PMF: Probability Mass Function

A PMF evaluated at a value corresponds to the probability that a random variable takes that value (I find it helpful to read this as $p(x) = P(X = x)$).

To be a valid PMF, p must satisfy:

  1. $p(x) \ge 0$ for all $x$
  2. $\sum_{x} p(x) = 1$ (the sum is taken over all possible values of $x$)

E.g., let X be the result of a coin flip where X=0 represents tails and X=1 represents heads. Suppose that we do not know whether or not the coin is fair. Let θ be the probability of a head expressed as a proportion (between 0 and 1). Then we get:

$p(x) = \theta^{x} (1 - \theta)^{1 - x}$, for $x = 0, 1$

In other words, the PMF is simply the probability distribution law (分布律) of a discrete random variable.
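The coin-flip PMF can be written out and checked against both validity conditions. In this sketch the value `theta = 0.3` and the function name `bernoulli_pmf` are hypothetical; `dbinom` with `size = 1` is used only as a cross-check.

```r
# PMF of a possibly unfair coin: p(x) = theta^x * (1 - theta)^(1 - x)
bernoulli_pmf <- function(x, theta) theta^x * (1 - theta)^(1 - x)

theta <- 0.3                        # hypothetical probability of a head
p <- bernoulli_pmf(c(0, 1), theta)  # p(0) = 1 - theta, p(1) = theta

# Validity conditions: non-negative and summing to one
non_negative <- all(p >= 0)
sums_to_one <- isTRUE(all.equal(sum(p), 1))

# Cross-check against R's binomial PMF with a single trial
matches_dbinom <- isTRUE(all.equal(p, dbinom(c(0, 1), size = 1, prob = theta)))
```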

7. PDF: Probability Density Function

A PDF is a function associated with a continuous random variable.

Areas under the PDF correspond to probabilities for that random variable.

To be a valid PDF, f must satisfy:

  1. $f(x) \ge 0$ for all $x$
  2. The area under $f(x)$ is 1

In practice: $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
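The area interpretation can be illustrated with numerical integration. This sketch assumes the density $f(x) = 2x$ on $[0, 1]$ (the same density that appears in the Beta example in section 9); the interval endpoints 0.25 and 0.5 are arbitrary.

```r
# Assumed density: f(x) = 2x on [0, 1], zero elsewhere
f <- function(x) ifelse(x >= 0 & x <= 1, 2 * x, 0)

# Validity: the total area under f is 1
total_area <- integrate(f, lower = 0, upper = 1)$value

# P(0.25 <= X <= 0.5) is the area under f between 0.25 and 0.5
p_ab <- integrate(f, lower = 0.25, upper = 0.5)$value

# Closed form for comparison: the antiderivative of 2x is x^2
p_ab_exact <- 0.5^2 - 0.25^2
```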

8. CDF: Cumulative Distribution Function & SF: Survival Function

The CDF of a random variable X is defined as the function:

$F(x) = P(X \le x)$

This definition applies regardless of whether $X$ is discrete or continuous.

The SF of a random variable X is defined as:

S(x)=P(X>x)

Notice that $S(x) = 1 - F(x)$

For continuous random variables, the PDF is the derivative of the CDF, i.e. $f(x) = F'(x)$
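Both relationships can be checked numerically. The sketch below uses a standard normal variable, which is an assumption made for illustration; the evaluation point `x0` and step size `h` are arbitrary.

```r
x0 <- 1.2   # arbitrary evaluation point

# CDF and survival function of a standard normal
cdf_value <- pnorm(x0)                      # F(x0) = P(X <= x0)
sf_value <- pnorm(x0, lower.tail = FALSE)   # S(x0) = P(X > x0)

# S(x) = 1 - F(x)
sf_ok <- isTRUE(all.equal(sf_value, 1 - cdf_value))

# f(x) = F'(x): approximate the derivative with a central difference
h <- 1e-5
numeric_derivative <- (pnorm(x0 + h) - pnorm(x0 - h)) / (2 * h)
pdf_value <- dnorm(x0)
```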

9. Example: Beta Distribution

See the Wikipedia article on the Beta distribution for more details.

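If $x \in [0, 1]$ and $X \sim \mathrm{Beta}(\alpha, \beta)$, we say that $X$ follows a Beta distribution governed by $\alpha$ and $\beta$. Both $\alpha, \beta > 0$ are real numbers, a.k.a. "shape parameters".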

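The PDF is defined as $f(x \mid \alpha, \beta) = \dfrac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}$

where the constant $B(\alpha, \beta) = \int_{0}^{1} t^{\alpha - 1} (1 - t)^{\beta - 1}\,dt$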

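Alternatively, with the Gamma function

$\Gamma(n) = \begin{cases} (n - 1)! & n \text{ is a positive integer} \\ \int_{0}^{\infty} t^{n - 1} e^{-t}\,dt & n \text{ is real or complex with positive real part} \end{cases}$

we can rewrite $B(\alpha, \beta) = \dfrac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)}$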

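If $\alpha = 2$ and $\beta = 1$, then $B(\alpha, \beta) = \dfrac{1! \times 0!}{2!} = 0.5$ or $B(\alpha, \beta) = \int_{0}^{1} t\,dt = \left. \tfrac{1}{2} t^{2} \right|_{0}^{1} = 0.5$, therefore $f(x) = 2x$ and $F(x) = x^{2}$.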
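The normalizing constant and the resulting density can be confirmed in R with the built-in `beta`, `gamma`, and `dbeta` functions. This is a small sketch; the evaluation points in `x` are arbitrary.

```r
# B(2, 1) computed three ways
b_direct <- beta(2, 1)
b_gamma <- gamma(2) * gamma(1) / gamma(2 + 1)
b_integral <- integrate(function(t) t^(2 - 1) * (1 - t)^(1 - 1),
                        lower = 0, upper = 1)$value

# With B(2, 1) = 0.5 the density reduces to f(x) = 2x
x <- c(0.1, 0.5, 0.75)
density_matches <- isTRUE(all.equal(dbeta(x, shape1 = 2, shape2 = 1), 2 * x))
```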

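Running pbeta(0.75, 2, 1) in R returns 0.5625. Following the naming convention summarized in "R Generating Random Numbers and Random Sampling", functions whose names start with p compute the CDF, i.e. $F(x)$, so pbeta(0.75, 2, 1) computes $F(0.75)$ for $\alpha = 2$ and $\beta = 1$. The result is $F(0.75) = 0.5625$, which agrees with the value obtained from the area under $f(x)$: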

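> # Vertices of the density f(x) = 2x on [0, 1], which is zero beyond 1
> x <- c(0, 1, 1, 1.5)
> y <- c(0, 2, 0, 0)
> plot(x, y, lwd = 3, frame = FALSE, type = "l", ylab = "f(x)")
> # Mark x = 0.75 and the height f(0.75) = 1.5
> abline(h = 1.5, v = 0.75)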

$F(0.75) = \dfrac{1.5 \times 0.75}{2} = 0.5625$
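The same number can be reached in several ways in R. The following sketch compares `pbeta` with numerical integration of the density, the triangle area, and the closed form $F(x) = x^2$; the variable names are chosen for illustration.

```r
# CDF of Beta(2, 1) at 0.75
cdf_value <- pbeta(0.75, shape1 = 2, shape2 = 1)

# Area under the density from 0 to 0.75 by numerical integration
area_integral <- integrate(function(x) dbeta(x, shape1 = 2, shape2 = 1),
                           lower = 0, upper = 0.75)$value

# Triangle with base 0.75 and height f(0.75) = 1.5
area_triangle <- 0.75 * 1.5 / 2

# Closed form F(x) = x^2
closed_form <- 0.75^2
```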
