Chi-Square Test for Independence
- Chi: [kaɪ]
There are also the chi-square test for variance in a normal population and the chi-squared distribution itself; neither is covered here.
What is a chi-square test?
A chi-square test is also referred to as a $\chi^2$ test.
The test is applied when you have two categorical variables from a single population. It is used to determine whether these two categorical variables are independent.
Digression: What is a categorical variable?
Variables can be classified as categorical (a.k.a. qualitative) or quantitative (a.k.a. numerical).
- Categorical variables take on values that are names or labels. E.g.
  - the color of a ball (red, green, blue, etc.)
  - the breed of a dog (collie, shepherd, terrier, etc.)
- Quantitative variables represent a measurable quantity. E.g.
  - the population of a city
When to Use Chi-Square Test for Independence
The test procedure is appropriate when the following conditions are met:
- The sampling method is simple random sampling.
- The variables under study are each categorical.
- If sample data are displayed in a contingency table, the expected frequency count for each cell of the table is at least 5.
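The expected-count condition can be checked before running the test. The sketch below assumes the contingency table is a list of rows of observed counts and uses the usual expected count under independence (row total times column total, divided by the sample size). The function names `expected_counts` and `meets_expected_count_rule`, and both sample tables, are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def meets_expected_count_rule(table, minimum=5):
    """Return True if every expected cell count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


# Hypothetical 2x2 tables of observed counts (made-up numbers).
balanced = [[20, 30], [25, 25]]
sparse = [[1, 9], [2, 8]]

print(meets_expected_count_rule(balanced))
print(meets_expected_count_rule(sparse))
```

The second table fails the check because its first column holds only 3 of the 20 observations, so both expected counts in that column fall below 5.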
State the Hypotheses
Given variable $A$ with $r$ levels and variable $B$ with $c$ levels:
- $H_0$: variable $A$ and variable $B$ are independent.
- $H_a$: variable $A$ and variable $B$ are not independent.
Analyze Sample Data
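- Degrees of freedom: $DF = (r - 1)(c - 1)$, where $r$ and $c$ are the numbers of levels of variable $A$ and variable $B$
- Expected frequencies: $E_{i,j} = \frac{n_{i\cdot} \times n_{\cdot j}}{n}$
  - $E_{i,j}$ is the expected frequency count for level $i$ of variable $A$ and level $j$ of variable $B$
  - $n_{i\cdot}$ is the total number of sample observations at level $i$ of variable $A$
  - $n_{\cdot j}$ is the total number of sample observations at level $j$ of variable $B$
  - $n$ is the total sample size
- Test statistic: $\chi^2 = \sum_{i,j} \frac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}$
  - $O_{i,j}$ is the observed frequency count for level $i$ of variable $A$ and level $j$ of variable $B$
- p-value: computing it requires two values, $DF$ and $\chi^2$. You can use the Chi-Square Calculator: Online Statistical Table.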
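The analysis steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the names `chi2_sf` and `chi_square_independence` and the sample table are hypothetical, and the p-value is computed from the closed forms for 1 and 2 degrees of freedom plus the standard recurrence for the regularized upper incomplete gamma function, rather than from a lookup table.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    # Step up with Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, p-value) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for cell (i, j)
            statistic += (o - e) ** 2 / e
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical 2x2 table of observed counts (made-up numbers).
statistic, df, p_value = chi_square_independence([[10, 20], [20, 10]])
print(f"chi2 = {statistic:.4f}, DF = {df}, p = {p_value:.4f}")
```

If SciPy is available, `scipy.stats.chi2_contingency` performs the same computation; note that it applies a continuity correction to 2x2 tables by default, so its result for such tables can differ from this sketch.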
Example
Question: Is there a gender gap? Do the men’s voting preferences differ significantly from the women’s preferences?
- $H_0$: “Gender” and “Voting Preference” are independent.
- $H_a$: “Gender” and “Voting Preference” are not independent.
Looking up the table gives a p-value of 0.0003.
Since the p-value (0.0003) is less than the significance level (0.05), we reject the null hypothesis. Thus, we conclude that there is a relationship between gender and voting preference.
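To make the arithmetic concrete, here is a small sketch of the whole calculation for a 2x3 gender-by-preference table. The observed counts are hypothetical, picked so that the resulting p-value rounds to the 0.0003 quoted above. With 2 degrees of freedom the upper-tail probability of the chi-square distribution has the closed form $e^{-\chi^2/2}$, so no lookup table is needed.

```python
import math

# Hypothetical observed counts (rows: men, women; columns: three voting
# preferences). These numbers are assumptions for illustration only.
observed = [
    [200, 150, 50],
    [250, 300, 50],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Sum of (observed - expected)^2 / expected over all cells.
statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        statistic += (o - e) ** 2 / e

# Closed-form upper tail, valid only because df == 2 here.
p_value = math.exp(-statistic / 2)

print(f"DF = {df}, chi2 = {statistic:.1f}, p = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

With these assumed counts the statistic is about 16.2 on 2 degrees of freedom, which leads to the same decision as above: reject the null hypothesis at the 0.05 level.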