What is a Chi-Square Test? Types, Formula & Examples

The chi-square test (χ², pronounced “kai-square”) is specifically designed for categorical data: data that can be sorted into distinct groups or categories rather than measured on a numerical scale. Whether you want to check if your sample matches an expected distribution or investigate whether two categorical variables are related, the chi-square test provides a straightforward and robust method to answer these questions.

In this article, we cover the definition, types, formula, assumptions, step-by-step examples, and reporting guidelines for chi-square tests.

Contents

  1. What is a Chi-Square Test?
  2. Types of Chi-Square Tests
  3. The Chi-Square Formula
  4. Assumptions of the Chi-Square Test
  5. Step-by-Step Examples
  6. The Chi-Square Distribution
  7. Chi-Square Critical Value Table (Selected Values)
  8. How to Perform a Chi-Square Test: Step-by-Step
  9. Chi-Square Test vs. Other Statistical Tests
  10. The Null Hypothesis in Chi-Square Tests
  11. Interpreting Chi-Square Results
  12. Chi-Square Test in Medical and Biomedical Research
  13. Yates’ Correction for Continuity
  14. Limitations of the Chi-Square Test
  15. How to Report Chi-Square Results
  16. Frequently Asked Questions
  17. Summary

What is a Chi-Square Test?

A chi-square test is a non-parametric hypothesis test that uses the chi-square (χ²) statistic to evaluate whether observed frequencies in categorical data differ significantly from expected frequencies. In other words, it tells you whether the gap between what you observed and what you expected is larger than chance alone would plausibly produce.

The symbol χ² is the Greek letter chi, squared. Unlike parametric tests such as the t-test or ANOVA, the chi-square test does not require normally distributed data and works with count or frequency data.

Key Characteristics

  • Works with categorical (nominal or ordinal) variables
  • Based on the comparison of observed versus expected frequencies
  • Does not require the assumption of normality
  • Results in a test statistic (χ²) that follows a chi-square distribution
  • The p-value derived from this statistic is compared against a chosen significance level (e.g., α = 0.05)

Types of Chi-Square Tests

There are three main types of chi-square tests used in research. The appropriate test depends on your study design and the number of variables you are examining.

| Test Type | Number of Variables | Purpose | Example |
| --- | --- | --- | --- |
| Goodness of Fit | One categorical variable | Does observed distribution match expected distribution? | Are M&M colors equally distributed in a bag? |
| Test of Independence | Two categorical variables | Are two variables related/associated in a population? | Is smoking status related to lung disease diagnosis? |
| Test of Homogeneity | Two or more groups | Do two or more populations have the same distribution? | Do patients in two hospitals have the same blood type distribution? |

1. Chi-Square Goodness of Fit Test

The goodness of fit test is used when you have one categorical variable and want to determine whether the observed distribution of that variable matches a hypothesized (expected) distribution. You are essentially asking: “Does my sample reflect what I would expect?”

When to use:

When comparing a single categorical variable against a theoretical or known distribution.

Example:

A die is rolled 60 times. You want to test whether each face appears equally (expected: 10 times each). The goodness of fit test tells you if the observed outcomes significantly deviate from the expected equal frequencies.

2. Chi-Square Test of Independence

This is the most commonly used chi-square test. It examines whether two categorical variables are independent of each other (i.e., not associated) in a single population. Data is typically arranged in a contingency table (cross-tabulation).

When to use:

When examining the relationship between two categorical variables measured on the same subjects.

Example:

Is there an association between gender (male/female) and preference for a political party (Party A/Party B/Party C)? This test determines if gender and party preference are independent or related.

3. Chi-Square Test of Homogeneity

The test of homogeneity is used to compare the distribution of a categorical variable across two or more independent groups. While it uses the same formula as the test of independence, the key difference lies in the study design: in the test of homogeneity, you sample separately from each group.

When to use:

When comparing the distribution of a categorical outcome across multiple predefined groups or populations.

Example:

Do patients from three different hospitals have the same distribution of blood types (A, B, AB, O)?

Independence vs. Homogeneity: Key Difference

| Feature | Test of Independence | Test of Homogeneity |
| --- | --- | --- |
| Sampling | One sample from one population | Separate samples from multiple populations |
| Question | Are two variables associated? | Do groups have the same distribution? |
| Formula | Same χ² formula | Same χ² formula |

The Chi-Square Formula

The chi-square statistic is calculated using the following formula:

χ² = Σ [ (O − E)² / E ]

| Symbol | Meaning |
| --- | --- |
| χ² | Chi-square test statistic |
| Σ | Summation across all categories or cells |
| O | Observed frequency, the actual count recorded in each category |
| E | Expected frequency, the count expected under the null hypothesis |

The larger the difference between observed and expected frequencies, the larger the χ² statistic, and the more likely the result is statistically significant. This value is then compared to a critical value from the chi-square distribution based on degrees of freedom (df) and the chosen significance level (α).
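As a minimal sketch of the formula above, the following Python function sums (O − E)² / E over all categories. The function name `chi_square_statistic`, the counts, and the hypothesized proportions are hypothetical and chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over paired observed and expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustration only).
observed = [50, 30, 20]

# Expected counts under hypothesized proportions 0.4, 0.4, 0.2.
n = sum(observed)
expected = [n * p for p in (0.4, 0.4, 0.2)]

chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 2))  # 5.0
```

Here the first two categories each contribute 10² / 40 = 2.5 and the third contributes nothing, so the statistic is 5.0.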

Degrees of Freedom

Degrees of freedom determine which chi-square distribution to use when looking up critical values.

| Test Type | Degrees of Freedom Formula | Notation |
| --- | --- | --- |
| Goodness of Fit | Number of categories − 1 | df = k − 1 |
| Independence / Homogeneity | (Rows − 1) × (Columns − 1) | df = (r−1)(c−1) |

Calculating Expected Frequencies

For the test of independence, expected frequencies for each cell in a contingency table are calculated as:

E = (Row Total × Column Total) / Grand Total

For the goodness of fit test, expected frequencies are derived from the hypothesized proportions multiplied by the total sample size.
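The expected-frequency rule for contingency tables, together with the degrees of freedom formula, can be sketched in a few lines of Python. The function names and the 2×3 table of counts below are hypothetical, used only to show the arithmetic.

```python
def expected_frequencies(table):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """df = (rows - 1) * (columns - 1) for a contingency table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2x3 table of observed counts (illustration only).
observed = [
    [30, 20, 10],
    [20, 30, 40],
]

expected = expected_frequencies(observed)
df = degrees_of_freedom(observed)
print(expected)  # [[20.0, 20.0, 20.0], [30.0, 30.0, 30.0]]
print(df)        # 2
```

The row totals are 60 and 90 and every column total is 50, so each cell in the first row expects 60 × 50 / 150 = 20 and each cell in the second row expects 90 × 50 / 150 = 30.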

Assumptions of the Chi-Square Test

To ensure valid results, the following assumptions must be met before applying a chi-square test. Violations can lead to misleading conclusions, so checking these in your methods section is important:

| Assumption | Details |
| --- | --- |
| Categorical data | Both variables must be categorical (nominal or ordinal). The chi-square test is not appropriate for continuous numerical data. |
| Independent observations | Each observation must be independent: one subject should contribute to only one cell in the table. |
| Mutually exclusive categories | Each observation must belong to only one category. Categories cannot overlap. |
| Expected frequency ≥ 5 | At least 80% of cells should have expected frequencies of 5 or more, and no cell should have an expected frequency below 1. If this is violated, consider combining categories or using Fisher’s Exact Test. |
| Adequate sample size | A sufficiently large sample is needed. Very small samples make the test unreliable. See guidance on sample size for details. |
| Random sampling | Data should be collected through random or representative sampling to allow valid inference to the population. |

Note: When expected cell counts are less than 5, Yates’ continuity correction (for 2×2 tables) or Fisher’s Exact Test may be more appropriate.

Step-by-Step Examples

Example 1: Chi-Square Goodness of Fit Test

Research question: Is a six-sided die fair? After 60 rolls, do the observed frequencies match the expected equal distribution?

Null hypothesis (H₀): Each face has an equal probability of appearing (p = 1/6 for each face).

Alternative hypothesis (H₁): The die is not fair; at least one face appears with a different frequency.

Significance level: α = 0.05

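| Face | Observed (O) | Expected (E) | (O−E)² | (O−E)²/E |
| --- | --- | --- | --- | --- |
| 1 | 8 | 10 | 4 | 0.40 |
| 2 | 9 | 10 | 1 | 0.10 |
| 3 | 11 | 10 | 1 | 0.10 |
| 4 | 14 | 10 | 16 | 1.60 |
| 5 | 9 | 10 | 1 | 0.10 |
| 6 | 9 | 10 | 1 | 0.10 |
| Total | 60 | 60 | | χ² = 2.40 |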

Degrees of freedom: df = 6 − 1 = 5

Critical value (χ² at df=5, α=0.05): 11.07

Result: χ² = 2.40 < 11.07 → Fail to reject H₀. There is no significant evidence that the die is unfair.

Example 2: Chi-Square Test of Independence

Research question: Is there an association between a new drug treatment and recovery outcome in 200 patients?

Null hypothesis (H₀): Treatment type and recovery outcome are independent.

Alternative hypothesis (H₁): Treatment type and recovery outcome are associated.

Observed Contingency Table:

| | Recovered | Not Recovered | Row Total |
| --- | --- | --- | --- |
| Drug Treatment | 90 | 10 | 100 |
| Placebo | 60 | 40 | 100 |
| Column Total | 150 | 50 | 200 |

Expected Frequencies (E = Row Total × Column Total / Grand Total):

| | Recovered | Not Recovered | (O−E)²/E |
| --- | --- | --- | --- |
| Drug Treatment | E = 75 | E = 25 | 3.0 + 9.0 = 12.0 |
| Placebo | E = 75 | E = 25 | 3.0 + 9.0 = 12.0 |
| χ² Total | | | χ² = 24.0 |

Degrees of freedom: df = (2−1)(2−1) = 1

Critical value (χ² at df=1, α=0.05): 3.84

Result: χ² = 24.0 >> 3.84 → Reject H₀. There is a significant association between treatment type and recovery outcome (p < 0.05).

The Chi-Square Distribution

The chi-square distribution is a continuous probability distribution that arises when independent standard normal variables are squared and summed. It is right-skewed and takes only non-negative values. Key properties:

  • The shape depends on the degrees of freedom (df)
  • With small df, the distribution is very right-skewed; it becomes more symmetric as df increases
  • The mean of a chi-square distribution equals its degrees of freedom
  • As df increases, the distribution approaches a normal distribution

A larger χ² statistic indicates a greater discrepancy between observed and expected values. To determine statistical significance, the computed χ² value is compared against the critical value from a chi-square table at the relevant df and α level.
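The definition above (a sum of squared independent standard normal variables) can be checked with a small simulation. This is a sketch using only Python's standard `random` module; the function name, the seed, and the number of draws are arbitrary choices for illustration.

```python
import random


def simulate_chi_square(df, n_draws, seed=0):
    """Draw values by summing the squares of `df` independent standard normals."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


draws = simulate_chi_square(df=5, n_draws=20000)
sample_mean = sum(draws) / len(draws)
sample_median = sorted(draws)[len(draws) // 2]

print(min(draws) >= 0)           # every simulated value is non-negative
print(round(sample_mean, 2))     # should land close to df = 5
print(sample_median < sample_mean)  # right skew pulls the mean above the median
```

With 20,000 draws the sample mean should sit very near 5, in line with the property that the mean equals the degrees of freedom.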

Chi-Square Critical Value Table (Selected Values)

The table below shows commonly used critical values for the chi-square distribution at α = 0.10, α = 0.05, and α = 0.01:

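| Degrees of Freedom (df) | α = 0.10 | α = 0.05 | α = 0.01 |
| --- | --- | --- | --- |
| 1 | 2.706 | 3.841 | 6.635 |
| 2 | 4.605 | 5.991 | 9.210 |
| 3 | 6.251 | 7.815 | 11.345 |
| 4 | 7.779 | 9.488 | 13.277 |
| 5 | 9.236 | 11.070 | 15.086 |
| 6 | 10.645 | 12.592 | 16.812 |
| 8 | 13.362 | 15.507 | 20.090 |
| 10 | 15.987 | 18.307 | 23.209 |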

How to Perform a Chi-Square Test: Step-by-Step

Whether you are using a chi-square goodness of fit test or a test of independence, the general process is the same. It is advisable to consult a biostatistician if you are unsure which test applies to your data. The steps are:

  1. Define your hypotheses: State the null hypothesis (H₀) and alternative hypothesis (H₁) clearly before collecting data.
  2. Choose your significance level (α): Typically α = 0.05, though α = 0.01 or 0.10 may be used in specific contexts.
  3. Collect and organize your data: Create a frequency table or contingency table for your categorical variables.
  4. Calculate expected frequencies: Use the formula E = (Row Total × Column Total) / Grand Total for the test of independence, or E = n × p for goodness of fit.
  5. Check assumptions: Confirm all expected cell frequencies are ≥ 5 and observations are independent.
  6. Compute the χ² statistic: Apply χ² = Σ [(O − E)² / E] across all categories or cells.
  7. Determine degrees of freedom: df = k−1 (goodness of fit) or df = (r−1)(c−1) (independence/homogeneity).
  8. Find the critical value or p-value: Compare your χ² statistic to the critical value from the chi-square table, or obtain a p-value using statistical software.
  9. Draw your conclusion: If χ² > critical value (or p < α), reject H₀. Otherwise, fail to reject H₀.
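The steps above can be sketched as a single Python function for the test of independence. The function name and the returned dictionary keys are hypothetical; the counts are those of the drug/placebo example in this article, and the critical value is passed in by the caller rather than computed.

```python
def chi_square_independence(table, critical_value):
    """Run steps 4 to 9 on an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Step 4: expected frequencies under independence.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Step 5: count cells whose expected frequency is below 5.
    small_cells = sum(e < 5 for row in expected for e in row)

    # Step 6: chi-square statistic summed over all cells.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Step 7: degrees of freedom.
    df = (len(table) - 1) * (len(table[0]) - 1)

    # Steps 8 and 9: compare with the critical value supplied by the caller.
    return {
        "statistic": statistic,
        "df": df,
        "small_expected_cells": small_cells,
        "reject_h0": statistic > critical_value,
    }


# Drug/placebo counts from this article; 3.841 is the tabulated
# critical value for df = 1 at alpha = 0.05.
result = chi_square_independence([[90, 10], [60, 40]], critical_value=3.841)
print(result["statistic"], result["df"], result["reject_h0"])  # 24.0 1 True
```

Steps 1 to 3 (hypotheses, significance level, data collection) happen before any computation, so they appear here only as the inputs to the function.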

Chi-Square Test vs. Other Statistical Tests

Choosing the right test is important. For further guidance, see the overview of choosing the right statistical test. The following comparison distinguishes the chi-square test from other commonly used tests:

| Test | Data Type | Groups | Purpose | Key Assumption |
| --- | --- | --- | --- | --- |
| Chi-square | Categorical | 2+ | Association / distribution | Expected freq ≥ 5 |
| T-test | Continuous | 1 or 2 | Compare means | Normality, equal variance |
| ANOVA | Continuous | 3+ | Compare means | Normality, homogeneity of variance |
| Fisher’s Exact | Categorical | 2 | Association (small samples) | Small samples (2×2 table) |
| McNemar’s | Categorical | 2 | Paired categorical data | Matched pairs |

For continuous data with two groups, consider a t-test. For three or more groups, ANOVA is preferred. For correlation between continuous variables, see correlation analysis or regression.

The Null Hypothesis in Chi-Square Tests

Every chi-square test begins with a clearly stated null hypothesis. Understanding what the null hypothesis means is crucial to interpreting your results correctly.

| Test Type | Null Hypothesis (H₀) | Alternative Hypothesis (H₁) |
| --- | --- | --- |
| Goodness of Fit | The observed distribution matches the expected distribution | The observed distribution does not match the expected distribution |
| Test of Independence | The two variables are independent (no association) | The two variables are not independent (there is an association) |
| Test of Homogeneity | The distribution is the same across all groups | The distribution differs across at least one group |

Rejecting H₀ does not tell you the magnitude or direction of the association — only that a statistically significant difference or relationship exists. Consider calculating effect sizes (such as Cramér’s V or phi) to quantify the practical significance of your findings.

Interpreting Chi-Square Results

The p-Value

The p-value is the probability of obtaining a test statistic at least as extreme as the one you observed, assuming the null hypothesis is true.

  • p < α (typically 0.05): Reject H₀  (statistically significant result)
  • p ≥ α: Fail to reject H₀ (not statistically significant)
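In practice the p-value comes from statistical software, but for whole-number degrees of freedom it can also be computed with the standard library. The sketch below uses the finite closed-form series for the upper tail of the chi-square distribution (one form for even df, one for odd df). The function name is hypothetical, and the code assumes df is a positive integer.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for integer df >= 1."""
    if df < 1 or statistic < 0:
        raise ValueError("df must be >= 1 and statistic must be >= 0")
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus sqrt(2/pi) * exp(-x/2) times the sum of
    # chi^(2r-1) / (1*3*...*(2r-1)) for r = 1 .. (df-1)/2, with chi = sqrt(x).
    chi = math.sqrt(statistic)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= statistic / (2 * r + 1)
    return math.erfc(chi / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


# The two worked examples in this article.
p_die = chi_square_p_value(2.40, 5)    # goodness of fit, df = 5
p_drug = chi_square_p_value(24.0, 1)   # independence, df = 1

print(p_die >= 0.05)   # True: fail to reject H0
print(p_drug < 0.001)  # True: reject H0
```

As a sanity check, feeding a tabulated critical value such as 3.841 with df = 1 into the function should return a value very close to 0.05.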

Effect Size: Cramér’s V

Statistical significance alone does not convey the strength of an association. Cramér’s V is the most commonly used effect size measure for chi-square tests:

V = √[ χ² / (n × (min(r,c) − 1)) ]

Where n is the total sample size, r is the number of rows, and c is the number of columns.
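A short Python sketch of this formula follows. The function name `cramers_v` is hypothetical; the inputs are the statistic and sample size from the drug/placebo example in this article.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    smaller_dim = min(n_rows, n_cols)
    if smaller_dim < 2:
        raise ValueError("the table needs at least two rows and two columns")
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))


# chi2 = 24.0, N = 200, 2x2 table.
v = cramers_v(24.0, 200, 2, 2)
print(round(v, 2))  # 0.35
```

For a 2×2 table the term min(r,c) − 1 equals 1, so V reduces to √(χ² / n), which is the phi coefficient.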

| Cramér’s V Value | Interpretation |
| --- | --- |
| 0.10 – 0.19 | Small effect |
| 0.20 – 0.29 | Medium effect |
| ≥ 0.30 | Large effect |

Confidence Intervals

While the chi-square test provides a p-value, reporting a confidence interval (e.g., for proportions or odds ratios derived from the contingency table) provides additional context about the precision of the estimate.
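As one possible illustration, the sketch below computes an odds ratio for a 2×2 table with an approximate 95% confidence interval. It uses the Wald interval on the log-odds scale, which is a common choice but an assumption here, since the article does not prescribe a method. The function name is hypothetical, and the counts are from the drug/placebo example.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio (a*d)/(b*c) for the table [[a, b], [c, d]] with a Wald interval.

    z = 1.96 gives an approximate 95% interval. All four counts must be
    non-zero, otherwise the standard error is undefined.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Drug: 90 recovered, 10 not; placebo: 60 recovered, 40 not.
odds_ratio, lower, upper = odds_ratio_with_ci(90, 10, 60, 40)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

An interval that excludes 1 points the same way as a significant chi-square result, and its width shows how precisely the association is estimated.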

Chi-Square Test in Medical and Biomedical Research

The chi-square test is among the most frequently reported statistical tests in clinical trials and biomedical publications. Common applications include:

  • Comparing treatment outcomes (recovered vs. not recovered) across two patient groups
  • Assessing whether adverse events are equally distributed among drug dosage groups
  • Examining the association between a risk factor (e.g., smoking) and a disease outcome (e.g., lung cancer)
  • Analyzing baseline characteristics in randomized controlled trials to confirm comparability of groups
  • Evaluating screening test performance in terms of sensitivity and specificity categories

 Important note: In cross-sectional studies and case-control studies, chi-square tests are commonly used to examine associations. However, they cannot establish causality, only association.

Yates’ Correction for Continuity

For 2×2 contingency tables, particularly when sample sizes are small, Yates’ continuity correction is applied to reduce the overestimation of statistical significance:

χ²(Yates) = Σ [ (|O − E| − 0.5)² / E ]

Yates’ correction makes the test more conservative, reducing the risk of Type I errors (false positives). When the expected cell count in any cell of a 2×2 table is less than 5, Fisher’s Exact Test is generally preferred over chi-square with or without correction.
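The corrected formula can be sketched in Python as follows. The function name is hypothetical, the code applies the formula exactly as written above, and the counts are from the drug/placebo example in this article.

```python
def yates_chi_square(table):
    """Chi-square statistic for a 2x2 table with Yates' continuity correction."""
    if len(table) != 2 or any(len(row) != 2 for row in table):
        raise ValueError("this sketch handles 2x2 tables only")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            # Shrink each absolute deviation by 0.5 before squaring.
            statistic += (abs(table[i][j] - expected) - 0.5) ** 2 / expected
    return statistic


corrected = yates_chi_square([[90, 10], [60, 40]])
print(round(corrected, 2))  # 22.43
```

The corrected value of about 22.43 is smaller than the uncorrected 24.0, which is the conservative shift described above. With a sample this large the conclusion does not change.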

Limitations of the Chi-Square Test

While the chi-square test is widely applicable, it has limitations researchers should be aware of:

  • Cannot establish causality; it only tests association or distribution fit
  • Sensitive to sample size: very large samples can produce statistically significant chi-square values even for trivial associations
  • Requires adequate expected cell frequencies: cells with E < 5 can produce unreliable results
  • Does not indicate the direction or strength of an association without supplementary measures (e.g., Cramér’s V, odds ratio)
  • Suitable only for categorical variables, not for continuous data
  • Sensitive to individual cells: one cell with an unusually large gap between observed and expected counts can dominate the overall chi-square value
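The sample-size sensitivity noted in the list above can be illustrated with a small Python sketch. Both tables are hypothetical: they have identical proportions, but the second has every count multiplied by 100. The function name is also hypothetical.

```python
import math


def chi_square_and_v(table):
    """Return (chi-square statistic, Cramér's V) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    v = math.sqrt(chi2 / (n * (min(len(table), len(table[0])) - 1)))
    return chi2, v


# Weak association: 52% vs 48% in the first column.
small = [[52, 48], [48, 52]]
# Same proportions, 100 times the sample size.
large = [[5200, 4800], [4800, 5200]]

chi2_small, v_small = chi_square_and_v(small)
chi2_large, v_large = chi_square_and_v(large)

print(round(chi2_small, 2), round(v_small, 2))  # 0.32 0.04
print(round(chi2_large, 2), round(v_large, 2))  # 32.0 0.04
```

The small table falls well below the df = 1 critical value of 3.841 while the large one far exceeds it, yet Cramér’s V is 0.04 in both cases. This is why an effect size should accompany the test result.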

 When small expected frequencies are a concern, consider Fisher’s Exact Test (2×2 tables) or collecting more data. For count data more generally, reviewing statistical tests for count data may be helpful.

How to Report Chi-Square Results

In academic manuscripts, chi-square results should be reported in the results section with all relevant statistics. Standard reporting format:

χ²(df) = [value], p = [value], N = [sample size]

Example: “There was a significant association between treatment type and recovery outcome, χ²(1) = 24.0, p < .001, N = 200, Cramér’s V = 0.35.”

Elements to include in your report:

  • The chi-square statistic (χ²)
  • Degrees of freedom in parentheses
  • The exact p-value (use p < .001 when appropriate)
  • Sample size (N)
  • Effect size measure (e.g., Cramér’s V, phi coefficient)
  • A frequency or contingency table in the results
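For consistency across a manuscript, the reporting format can be generated by a small helper. This is a sketch only: the function names are hypothetical, the choice of two decimal places is an assumption, and the leading zero of the p-value is dropped to match the example above.

```python
def format_p_value(p):
    """Format p without a leading zero; report very small values as '< .001'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_chi_square_report(chi2, df, p, n, cramers_v=None):
    """Build a string of the form: chi2(df) = value, p = value, N = size."""
    parts = [f"χ²({df}) = {chi2:.2f}", format_p_value(p), f"N = {n}"]
    if cramers_v is not None:
        parts.append(f"Cramér’s V = {cramers_v:.2f}")
    return ", ".join(parts)


# Drug/placebo example; the p-value is passed in, not computed here, and any
# value below .001 is formatted identically.
report = format_chi_square_report(24.0, 1, 1e-6, 200, cramers_v=0.35)
print(report)
```

Journals differ on decimal places and on whether N sits inside the parentheses, so the format strings would need adjusting to the target style.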

 When reporting, follow the guidelines of your target journal (APA, Vancouver, etc.). For guidance on reporting p-values correctly and using descriptive statistics to contextualize your findings, refer to established resources.

Frequently Asked Questions

When should I use a chi-square test instead of a t-test?

Use a chi-square test when your outcome variable is categorical (e.g., yes/no, blood type, disease status). Use a t-test when your outcome variable is continuous (e.g., height, blood pressure) and you are comparing means between two groups.

Can I use a chi-square test for ordinal data?

Technically yes, but the chi-square test ignores the ordering of ordinal categories. Other tests (such as the Cochran-Armitage trend test or Spearman’s rank correlation) may be more appropriate for ordered categorical data.

What if my expected cell count is less than 5?

If any expected frequency falls below 5, the chi-square approximation may not be valid. Consider: (1) merging categories to increase expected counts, (2) using Fisher’s Exact Test for 2×2 tables, or (3) collecting more data.

What is the difference between the chi-square test and a z-test for proportions?

For a 2×2 contingency table, the chi-square test of independence and the two-proportion z-test are mathematically equivalent (χ² = z²). The chi-square test is more general and applicable to tables larger than 2×2.
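The equivalence can be checked numerically. The sketch below assumes the pooled two-proportion z statistic without continuity correction and the uncorrected chi-square statistic; the function names are hypothetical and the counts are from the drug/placebo example in this article.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic (no continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi_square_2x2(table):
    """Uncorrected chi-square statistic for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(2)
        for j in range(2)
    )


# 90 of 100 recovered on the drug, 60 of 100 on placebo.
z = two_proportion_z(90, 100, 60, 100)
chi2 = chi_square_2x2([[90, 10], [60, 40]])

print(round(z ** 2, 4), round(chi2, 4))  # 24.0 24.0
```

The match holds only when neither statistic uses a continuity correction and the z-test uses the pooled proportion for its standard error.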

Does a significant chi-square result prove causation?

No. A chi-square test can identify a statistically significant association but cannot establish causality. Causal inference requires careful study design, control for confounders, and often prospective or experimental methods.

How do I report chi-square in APA style?

In APA style, report as: χ²(df, N = sample size) = chi-square value, p = p-value. Example: χ²(2, N = 150) = 8.45, p = .015.

Summary

The chi-square test is an essential tool for researchers working with categorical data. Here is a quick reference summary:

| Feature | Details |
| --- | --- |
| Type of data | Categorical (nominal or ordinal) |
| Main tests | Goodness of fit, Test of independence, Test of homogeneity |
| Formula | χ² = Σ [(O − E)² / E] |
| Degrees of freedom | k−1 (goodness of fit); (r−1)(c−1) (independence/homogeneity) |
| Key assumption | Expected frequency ≥ 5 in at least 80% of cells |
| Effect size | Cramér’s V, phi coefficient |
| Alternative (small samples) | Fisher’s Exact Test (2×2 tables) |
| Software | SPSS, R, SAS, Stata, Python (scipy.stats.chi2_contingency) |

For additional support with your statistical analysis or manuscript preparation, consider working with expert statistical consultants who can help you choose the right test, interpret results accurately, and present findings clearly in your paper.

This article was originally published on March 3, 2023, and updated on June 4, 2026.
