Applied Nonparametric and Modern Statistics
1. Introduction
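A common scenario in applied statistics is that one has a dependent variable or outcome $Y$ and one or more independent variables or covariates $X_1, \dots, X_p$.
Statisticians usually assume that $Y$ and $X_1, \dots, X_p$ are random variables and study the regression function $f(x_1, \dots, x_p) = E[Y \mid X_1 = x_1, \dots, X_p = x_p]$.
It should be noted that for some designed experiments it does not make sense to assume the $X_j$ are random variables.
How do we learn about $f$?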
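- Linear Regression: $E[Y \mid X_1, \dots, X_p] = \beta_0 + \sum_{j=1}^{p} \beta_j X_j$
- Generalized Linear Model (GLM): $g\left(E[Y \mid X_1, \dots, X_p]\right) = \beta_0 + \sum_{j=1}^{p} \beta_j X_j$, where $g$ is called a link function. We can also write $E[Y \mid X_1, \dots, X_p] = g^{-1}\left(\beta_0 + \sum_{j=1}^{p} \beta_j X_j\right)$.
- It is typical to assume the conditional distribution of $Y$ given $X_1, \dots, X_p$ is part of an exponential family, e.g. binomial, Poisson, gamma, etc.
- Many times the link function is chosen for mathematical convenience.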
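As a small illustration of how a link function connects the linear predictor to the conditional mean, the sketch below uses the logit link, a common choice for binomial outcomes. The coefficient and covariate values are made up for illustration, and the function names are not from the course notes.

```python
import math


def logit(mu):
    """Logit link g(mu) = log(mu / (1 - mu)), defined for 0 < mu < 1."""
    return math.log(mu / (1.0 - mu))


def inv_logit(eta):
    """Inverse link g^{-1}(eta) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def linear_predictor(beta0, betas, xs):
    """Compute beta0 + sum_j beta_j * x_j."""
    return beta0 + sum(b * x for b, x in zip(betas, xs))


# Hypothetical coefficients and covariate values (assumptions, not course data).
beta0, betas = -1.0, [0.5, 2.0]
xs = [1.0, 0.25]

eta = linear_predictor(beta0, betas, xs)  # value on the link scale
mu = inv_logit(eta)                       # conditional mean E[Y | X], in (0, 1)
print(eta, mu, logit(mu))
```

The linear predictor can take any real value, while the inverse link maps it back into the range that is valid for the mean, here the interval (0, 1).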
Linear Models Pros:
- The parameters $\beta_j$ usually have a direct interpretation with scientific meaning.
- Once an appropriate model is in place, the estimates have many desirable properties.
Linear Models Cons:
- These models are quite restrictive. Linearity and additivity are two very strong assumptions. This may have practical consequences.
- E.g., by assuming linearity one may never notice that a covariate has an effect that increases and then decreases.
- By relaxing assumptions we lose some of the nice properties of estimates. There is an ongoing debate about specification vs. estimation.
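The risk of assuming linearity can be seen in a toy example. The sketch below fits a least-squares line to noise-free data from a made-up function, $y = 1 - x^2$, that increases and then decreases. This function and the grid are assumptions chosen for illustration only.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the simple least-squares line."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Covariate on a symmetric grid; the effect rises up to x = 0 and then falls.
x = [i / 10 for i in range(-10, 11)]
y = [1 - xi ** 2 for xi in x]

intercept, slope = least_squares_line(x, y)
print("fitted slope:", slope)               # essentially zero
print("range of y:", max(y) - min(y))       # yet y clearly depends on x
```

The fitted slope is essentially zero, so a purely linear analysis would suggest no effect even though the outcome depends strongly on the covariate.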
In this class we will:
- Start by introducing various smoothers useful for smoothing scatter plots where both $X$ and $Y$ are continuous variables.
- Set down precise models and outline the proofs of asymptotic results.
- Introduce local regression (loess).
- Examine spline models and some of the theory behind splines.
- Some smoothers are more flexible than others. However, with flexibility comes variance. We will talk about the bias-variance trade-off and how one can use resampling methods to estimate bias and variance.
- After explaining all these smoothers we will make a connection between them. We will also make connections to other statistical procedures.
- We will examine the case where one has many covariates. One can relax the linearity assumption, assume additivity and use additive models. One can also forget the additivity assumption and use regression trees.
- After all this we will be ready to consider the case where $Y$ is not necessarily continuous. We will generalize to this case and look at Generalized Additive Models and Local Likelihood.
- While examining all these subjects we will be considering various models for one data set. We will briefly discuss techniques that can be used to aid in the choice of such models.
- Finally, we will look at a brief introduction to time series analysis.
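The bias-variance trade-off mentioned in the outline can be previewed with a very simple smoother. The sketch below is a Monte Carlo simulation from a known curve (not a resampling method applied to real data). It estimates the bias and variance of a running mean at one point for a narrow and a wide window. The true curve $\sin(2\pi x)$, the noise level, and the window sizes are all assumptions chosen for illustration.

```python
import math
import random


def running_mean(y, center, half_width):
    """Average of y over indices center - half_width .. center + half_width."""
    window = y[center - half_width: center + half_width + 1]
    return sum(window) / len(window)


def simulate(half_width, n_rep=500, sigma=0.3, seed=0):
    """Estimate the bias and variance of the running mean at x = 0.25."""
    rng = random.Random(seed)
    x = [i / 100 for i in range(101)]
    f = [math.sin(2 * math.pi * xi) for xi in x]  # assumed true curve
    center = 25                                   # index of x = 0.25, a peak of f
    estimates = []
    for _ in range(n_rep):
        # Generate a fresh noisy data set around the same true curve.
        y = [fi + rng.gauss(0.0, sigma) for fi in f]
        estimates.append(running_mean(y, center, half_width))
    mean_est = sum(estimates) / n_rep
    bias = mean_est - f[center]
    variance = sum((e - mean_est) ** 2 for e in estimates) / (n_rep - 1)
    return bias, variance


bias_narrow, var_narrow = simulate(half_width=1)   # flexible: 3-point window
bias_wide, var_wide = simulate(half_width=20)      # smooth: 41-point window
print("narrow window: bias", bias_narrow, "variance", var_narrow)
print("wide window:   bias", bias_wide, "variance", var_wide)
```

The narrow window tracks the peak closely but averages few observations, so its variance is larger. The wide window averages many observations but flattens the peak, so its bias is larger.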
We will begin the class talking about the case where the regression function $f$ depends on a single, real-valued predictor $X$, so that $f(x) = E[Y \mid X = x]$.
The data to support such investigations are typically a set of $n$ paired observations $(x_1, y_1), \dots, (x_n, y_n)$.
So once we have the data what do we do?
If we are going to “model” $f$, we may want to:
- identify the width and height of peaks
- explore the overall shape of $f(x)$ in some neighborhood
- find areas of sharp increase or regions exhibiting little curvature.
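To make these goals concrete, the sketch below reads such features off a curve evaluated on a grid. It assumes the curve is known exactly (a made-up Gaussian bump); in practice it would be replaced by an estimate obtained from the data. All names and values here are illustrative assumptions.

```python
import math

# Assumed curve on a grid: a Gaussian bump centered at 0.5 with spread 0.1.
x = [i / 100 for i in range(101)]
f = [math.exp(-((xi - 0.5) ** 2) / (2 * 0.1 ** 2)) for xi in x]

# Height and location of the peak.
peak_index = max(range(len(f)), key=lambda i: f[i])
peak_height = f[peak_index]
peak_location = x[peak_index]

# Width of the peak, measured where the curve is at least half its height.
above_half = [xi for xi, fi in zip(x, f) if fi >= peak_height / 2]
peak_width = above_half[-1] - above_half[0]

# Area of sharpest increase: largest finite-difference slope between grid points.
slopes = [(f[i + 1] - f[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]
steepest_index = max(range(len(slopes)), key=lambda i: slopes[i])
steepest_location = (x[steepest_index] + x[steepest_index + 1]) / 2

print(peak_location, peak_height, peak_width, steepest_location)
```

When the curve is an estimate rather than the truth, each of these summaries inherits the bias and variance of the smoother used to produce it.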
We will then move on to the case where we have many covariates, then to cases where the expectation needs to be transformed, and to various other generalizations.