What is an ANOVA? Types, Assumptions, and Uses



Analyzing and interpreting statistical data can feel overwhelming, and tests like the analysis of variance (ANOVA) often come in handy. Here’s a succinct guide on using ANOVAs for your research data analysis.


What is an ANOVA?

ANOVA, short for ANalysis Of VAriance, is used to examine the difference between the mean values of multiple groups of data. ANOVA testing evaluates the overall variance within and between groups rather than testing individual differences. This test is commonly used in the fields of biomedical sciences, education, and market research, among others.

When do you use an ANOVA?

You run an ANOVA when there are three or more groups of data to be analyzed.

For instance, if you want to compare students’ exam scores across three teaching methods (recorded video courses, classroom teaching, and online classes), an ANOVA is the appropriate test.

 

How do you interpret the results of an ANOVA?

When you’re interpreting the results of an ANOVA, you mainly focus on the F statistic, degrees of freedom, and the p value, and you should also consider effect size.

What is the F statistic in an ANOVA?

The F statistic is basically a score that answers the question: “Are these groups actually different, or just randomly varying?”

F is a simple ratio.

F = (How spread apart the group averages are) ÷ (How spread out the data is within each group)

Think of it like a signal-to-noise ratio:

  • Top (signal): How different are the group averages from each other?
  • Bottom (noise): How much natural random variation exists within each group?

Example

You can understand the F statistic from the following example:

You are testing whether three study methods affect test scores. You have groups A, B, and C.

  • If Groups A, B, and C have a mean score of 73, 74, and 75 respectively → the groups look the same → small F
  • If group A has a mean score of 60, B has a mean score of 75, C has a mean score of 90 → the groups look very different → large F

BUT you also have to account for the fact that scores within each group aren’t identical. Some students in A have scored more than 60 while some in C have scored below 90. That’s the “noise” in the denominator.
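
The signal-to-noise idea above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for statistical software: the function name one_way_f and the individual scores are made up for this example, chosen only so that the group means match the two scenarios described above.

```python
from statistics import mean

def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of scores."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total number of scores
    grand_mean = mean(x for g in groups for x in g)

    # Signal: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Noise: how far each score sits from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)            # divide by between-groups df
    ms_within = ss_within / (n_total - k)        # divide by within-groups df
    return ms_between / ms_within

# Hypothetical scores with the same spread inside every group
similar = [[68, 73, 78], [69, 74, 79], [70, 75, 80]]    # means 73, 74, 75
different = [[55, 60, 65], [70, 75, 80], [85, 90, 95]]  # means 60, 75, 90

f_similar = one_way_f(similar)
f_different = one_way_f(different)
print(f"similar means:   F = {f_similar:.2f}")    # F = 0.12
print(f"different means: F = {f_different:.2f}")  # F = 27.00
```

The noise (the denominator) is identical in both scenarios, so the jump in F comes entirely from how far apart the group averages are.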

 

What the F statistic tells you

  • If F is close to 1 → The differences between groups are about what you’d expect from random chance alone. Nothing interesting going on.
  • F is a lot more than 1 → The groups are more different than random chance would predict. Something real might be going on.

But remember, a big F just tells you that at least one group is different. It doesn’t tell you which group was different. You’d need follow-up tests to figure that out.

 

What are degrees of freedom in an ANOVA?

Degrees of freedom (df) is the number of values in your data that are free to vary once certain constraints are in place.

In ANOVA, there are two types:

  • Between-groups df: this is the number of groups minus 1. For 3 groups: df = 2
  • Within-groups df: this is the total number of participants minus the number of groups

They appear in your F statistic as F(between df, within df), for example, F(2, 34). Essentially, they tell the reader how much data was behind your result.
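
As a quick sketch of where a pair like F(2, 34) comes from, the snippet below derives both values from the group sizes. The helper name anova_df and the group sizes (37 participants split across three groups) are assumptions made for illustration.

```python
def anova_df(group_sizes):
    """Return (between-groups df, within-groups df) for a one-way ANOVA."""
    k = len(group_sizes)        # number of groups
    n_total = sum(group_sizes)  # total number of participants
    return k - 1, n_total - k

# Hypothetical study: 37 participants in 3 groups
df_between, df_within = anova_df([12, 12, 13])
print(f"F({df_between}, {df_within})")  # prints F(2, 34)
```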

 

 

How to interpret the p value in an ANOVA

The p-value answers one specific question:

“If there were truly no difference between the groups, how likely would I be to see results this extreme just by random chance?”

P is a probability, and its value ranges from 0 to 1.

  • A p value of 0.5 means that, if there were truly no difference between the groups, you’d have a 50% chance of getting results at least this extreme.
  • A p value of 0.05 means that, under the same assumption, you’d have only a 5% chance of getting results at least this extreme.

Researchers usually use a cut off of .05, called the significance threshold. If p < .05, the results are called “statistically significant.”
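
To make the link between F and p concrete, here is a small sketch for the three-group case. It relies on a shortcut that holds only when the between-groups df is exactly 2, where the p value has a simple closed form; for other designs you would normally let statistical software do this step (for example, SciPy’s scipy.stats.f.sf). The function name p_value_df2 and the F values are hypothetical.

```python
def p_value_df2(f_stat, df_within):
    """p value for an F statistic when the between-groups df is exactly 2.

    Uses the closed form P(F > f) = (1 + 2f / df_within) ** (-df_within / 2),
    which is valid only for a between-groups df of 2 (i.e., three groups).
    """
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)

alpha = 0.05  # the usual significance threshold
for f_stat in (0.5, 3.5, 7.0):  # hypothetical results, each reported as F(2, 34)
    p = p_value_df2(f_stat, 34)
    verdict = "statistically significant" if p < alpha else "not significant"
    print(f"F(2, 34) = {f_stat:.2f}, p = {p:.3f} -> {verdict}")
```

With the degrees of freedom held fixed, a larger F always gives a smaller p value.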

 

Are the p value and F statistic linked?

Yes. For the same degrees of freedom, the bigger your F statistic, the smaller your p value. But if p is below .05, it doesn’t mean that your difference is big enough to be important. A tiny, unimportant difference can still be statistically significant if you have a large enough sample size.

 

What is effect size in an ANOVA? Why is effect size important?

Statistical significance without a meaningful effect size is like winning a race by one millimeter; you have come first but the person who came second isn’t that much slower than you.

In other words, the p-value tells you whether a difference exists but effect size tells you how big that difference actually is. A study with thousands of participants can get p < 0.05 for a trivially small difference. So if you want your results to be convincing, you must also calculate and report effect size.

What measures of effect size are there for an ANOVA?

Eta Squared (η²)

Imagine 100 students’ test scores vary all over the place. Eta squared asks: how much of that variation is because of which study method they used, versus just random differences between students?

  • η² = 0.01 means that group membership explains 1% of the variation (small effect)
  • η² = 0.06 means that group membership explains 6% of the variation (medium effect)
  • η² = 0.14 means that group membership explains 14% of the variation (large effect)

Partial Eta-squared (η²p)

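This is the measure of effect size that SPSS and most software report by default, so you’ll see it constantly in research papers. It’s interpreted similarly to η². In a one-way ANOVA the two are identical; in designs with more than one independent variable, η²p is equal to or larger than η².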

Omega Squared (ω²)

η² tends to be slightly inflated (optimistic), especially with small samples. ω² adjusts for that bias, giving a more accurate picture of the true population effect. Researchers who want to be rigorous prefer this one. Just like eta-squared, omega-squared can be interpreted as follows:

  • .01 → Small
  • .06 → Medium
  • .14 → Large
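
Both measures can be computed from the sums of squares in a one-way ANOVA table. The sketch below assumes the standard one-way formulas, η² = SS between ÷ SS total and ω² = (SS between − df between × MS within) ÷ (SS total + MS within). The function name effect_sizes and the input numbers are made up for illustration.

```python
def effect_sizes(ss_between, ss_within, df_between, df_within):
    """Return (eta squared, omega squared) for a one-way ANOVA."""
    ss_total = ss_between + ss_within
    ms_within = ss_within / df_within  # the "noise" per degree of freedom

    eta_sq = ss_between / ss_total
    # Omega squared subtracts the share of SS between that noise alone would produce
    omega_sq = (ss_between - df_between * ms_within) / (ss_total + ms_within)
    return eta_sq, omega_sq

# Hypothetical ANOVA table with F(2, 34)
eta_sq, omega_sq = effect_sizes(ss_between=60.0, ss_within=340.0,
                                df_between=2, df_within=34)
print(f"eta squared   = {eta_sq:.3f}")    # 0.150
print(f"omega squared = {omega_sq:.3f}")  # 0.098
```

With these numbers, ω² comes out noticeably smaller than η², which is the downward adjustment described above.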

 

Why do you need to run post hoc tests after an ANOVA?

When ANOVA gives you a significant result, all it tells you is that somewhere among your groups, something is different. It doesn’t tell you where.

What kind of post hoc tests should you run after an ANOVA?

The most commonly used post hoc tests after an ANOVA are

  • Tukey’s test: the most common, good for comparing all pairs of groups
  • Bonferroni correction: very conservative, best when you have few comparisons
  • Games-Howell: used when your groups have unequal sizes or unequal variances

 

What are the different types of ANOVA?

One-way ANOVA

Here, the means of three or more groups are compared based on a single independent variable. You can use this test to determine if there are any significant differences between the averages of the groups.

Example

Let “teaching method” be the independent variable categorized into recorded video courses, classroom teaching, and online classes. Your research objective is to determine if there is a significant difference in the dependent variable (exam scores).

  • Independent variable: Teaching method (recorded video courses/classroom teaching/online classes).
  • Dependent variable: Exam scores.
  • Analysis: The one-way ANOVA testing helps determine if the exam scores vary significantly from each other when students learn through three different modes. The results can aid in identifying the best teaching method to be adopted in schools.

Two-way ANOVA

A two-way ANOVA test is used to compare the means of three or more groups by considering two independent variables.

Example

You can use “gender” and “age” as independent variables to investigate the effect of a new weight loss drug on the amount of weight lost (dependent variable).

  • Independent variables: Gender (male/female) and age (young/middle-aged/elderly).
  • Dependent variable: Amount of weight lost (pounds or kilograms) after using the drug.
  • Analysis: The two-way ANOVA tests whether a significant difference exists in the amount of weight lost considering the various levels of gender and age independently. Furthermore, you can evaluate if the two independent variables together influence the outcome: for instance, the drug may be less effective in elderly women than in younger men.

Three-way ANOVA

Here, the means of groups are compared based on three independent variables simultaneously. In this way, you can examine not only the individual effect of each variable on the dependent variable, but also the interaction effects between variables. In other words, you can tell whether the effect of one independent variable changes depending on the levels of the other two. This makes the three-way ANOVA a considerably more complex but analytically powerful extension of the one-way and two-way designs.

Example

We shall use teaching method, study environment, and sleep duration as the three independent variables. Teaching method is categorized into recorded video courses, classroom teaching, and online classes. Study environment is categorized into studying at home, in a library, or in a study group. Sleep duration is categorized into less than 6 hours, 6–8 hours, and more than 8 hours per night. Your research objective is to determine if there is a significant difference in the dependent variable (exam scores), and whether the effect of any one variable depends on the levels of the others.

  • Independent variable 1: Teaching method (recorded video courses / classroom teaching / online classes).
  • Independent variable 2: Study environment (at home / in a library / in a study group).
  • Independent variable 3: Sleep duration (less than 6 hours / 6-8 hours / more than 8 hours).
  • Dependent variable: Exam scores.
  • Analysis: The three-way ANOVA tests for three main effects (the individual influence of teaching method, study environment, and sleep duration on exam scores), three two-way interaction effects (teaching method × study environment; teaching method × sleep duration; study environment × sleep duration), and one three-way interaction effect (teaching method × study environment × sleep duration). For instance, it may reveal that classroom teaching produces the highest exam scores, but only among students who sleep more than 8 hours and study in a library. This is an insight that neither a one-way nor a two-way ANOVA would be capable of detecting. Such findings can inform holistic, evidence-based recommendations for optimizing student academic performance across multiple dimensions simultaneously.
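
The seven effects listed in the analysis above can be enumerated mechanically, which is a handy check when planning a factorial design. This is only a bookkeeping sketch; the variable names are arbitrary.

```python
from itertools import combinations

factors = {
    "teaching method": ["recorded video", "classroom", "online"],
    "study environment": ["home", "library", "study group"],
    "sleep duration": ["<6 h", "6-8 h", ">8 h"],
}

# Every non-empty combination of factors is one effect the ANOVA tests:
# single factors are main effects, larger combinations are interactions.
names = list(factors)
effects = [
    " × ".join(combo)
    for size in range(1, len(names) + 1)
    for combo in combinations(names, size)
]
for effect in effects:
    print(effect)

# Number of distinct conditions (cells) students can fall into
n_cells = 1
for levels in factors.values():
    n_cells *= len(levels)
print(f"{len(effects)} effects across {n_cells} cells")  # 7 effects across 27 cells
```

With 27 cells to fill, a three-way design needs far more participants than a one-way design to keep a reasonable number of students in each condition.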

 

 

Repeated measures ANOVA

Unlike the other types of ANOVA, which compare different groups against each other, a repeated measures ANOVA compares the same people measured multiple times.

Example

You can use “time points” as your repeated measure to investigate the effect of a new anti-inflammatory drug on joint pain (dependent variable) in patients with rheumatoid arthritis.

  • Independent variable (repeated measure): Time, i.e., pain levels measured at baseline (before treatment), at 4 weeks, and at 8 weeks after starting the drug.
  • Dependent variable: Self-reported joint pain score (on a scale of 0 to 10, where 0 = no pain and 10 = worst possible pain).
  • Analysis: The repeated measures ANOVA tests whether a significant difference exists in joint pain scores across the three time points in the same group of patients. Because the same patients are measured at each time point, the test removes natural variation between patients, such as differences in disease severity or pain tolerance. The repeated measures ANOVA instead focuses purely on whether pain levels changed over the course of treatment.
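
The sketch below shows, on made-up pain scores for four patients, how removing between-patient variation changes the result. It assumes the standard one-factor repeated measures partition (total variation = time + patients + error) and compares it with what you would get by wrongly treating the three time points as independent groups.

```python
from statistics import mean

# Hypothetical pain scores (0-10): one row per patient,
# columns = baseline, 4 weeks, 8 weeks
scores = [
    [8, 6, 5],
    [7, 6, 4],
    [9, 7, 6],
    [6, 5, 3],
]
n_patients = len(scores)
n_times = len(scores[0])
grand_mean = mean(x for row in scores for x in row)

time_means = [mean(row[t] for row in scores) for t in range(n_times)]
patient_means = [mean(row) for row in scores]

ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_time = n_patients * sum((m - grand_mean) ** 2 for m in time_means)
ss_patients = n_times * sum((m - grand_mean) ** 2 for m in patient_means)
ss_error = ss_total - ss_time - ss_patients  # what is left after removing patients

df_time = n_times - 1
df_error = (n_patients - 1) * (n_times - 1)
f_repeated = (ss_time / df_time) / (ss_error / df_error)

# For comparison: the same data treated as three independent groups,
# so patient-to-patient differences stay in the noise term
df_within = n_patients * n_times - n_times
f_between = (ss_time / df_time) / ((ss_total - ss_time) / df_within)

print(f"repeated measures: F({df_time}, {df_error}) = {f_repeated:.2f}")
print(f"independent groups: F({df_time}, {df_within}) = {f_between:.2f}")
```

In this toy data set every patient improves by a similar amount, so once patient-to-patient differences are removed very little noise remains and the repeated measures F is much larger.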

 

What are the assumptions of an ANOVA?

The assumptions of ANOVA are as follows:

  • Normality: The scores within each group should follow a bell curve (normal distribution). This means most people score somewhere in the middle, with only a few scoring very high or very low. For example, if you’re measuring test scores, most students should cluster around the average, not all bunch up at the extremes.
  • Homogeneity of variance: Each group should have a similar spread of scores. In other words, one group shouldn’t have wildly inconsistent scores while another group is very consistent. Imagine comparing three classes. If Class A’s scores range from 55 to 95, Class B’s from 60 to 90, and Class C’s from 20 to 100, that last group is way more spread out, which can throw off the analysis.
  • Independence: The observations within each group should be independent. One person’s score should not influence another person’s score. If students are copying off each other, or patients in the same household are affecting each other’s results, the scores are no longer independent.
  • Random sampling: Participants should be randomly selected from the population, not handpicked. If you only recruit the healthiest patients for a drug trial, your results will be biased and your conclusions won’t generalize to the real world.
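
A quick, informal way to look at the homogeneity of variance assumption is to compare the sample variances of your groups. The sketch below does this for three made-up classes whose score ranges match the example above (55 to 95, 60 to 90, and 20 to 100). It is only a first look; a formal test such as Levene’s test is commonly used for the actual check.

```python
from statistics import variance

# Hypothetical exam scores for three classes
classes = {
    "A": [55, 70, 75, 80, 95],
    "B": [60, 70, 75, 80, 90],
    "C": [20, 50, 75, 90, 100],
}

variances = {name: variance(scores) for name, scores in classes.items()}
for name, var in variances.items():
    print(f"Class {name}: variance = {var:.1f}")

widest = max(variances, key=variances.get)
ratio = max(variances.values()) / min(variances.values())
print(f"Class {widest} is the most spread out; "
      f"largest/smallest variance ratio = {ratio:.2f}")
```

A ratio far above 1, as for Class C here, is a sign that the groups do not share a similar spread.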

 

How do I report an ANOVA in my research paper?

When describing your ANOVA in your research paper, you should always include:

  • Number of groups being compared
  • Sample size for each group
  • Mean and standard deviation for each group
  • F statistic and p-value: the core results of your ANOVA
  • Which post-hoc tests you used (e.g. Tukey’s test)
  • Effect size, such as eta-squared (η²), partial eta-squared (η²p), or omega-squared (ω²)
  • Degrees of freedom: many journals require this, so check your target journal’s guidelines

Example

F(2, 34) = 6.93, p = .003, η² = .29

  • F(2, 34): the F statistic, with 2 and 34 being the degrees of freedom
  • p = .003: the result is statistically significant
  • η² = .29: a large effect size

F and p may need to be italicized depending on the guidelines of your target journal.
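
If you report many ANOVAs, a small helper can keep the format consistent. The sketch below is one possible formatter, not a journal requirement: the function names and input values are hypothetical, and it assumes the common convention of dropping the leading zero for values that cannot exceed 1 and writing very small p values as p < .001. Check your target journal’s guidelines before relying on it.

```python
def format_stat(value, decimals):
    """Format a value between 0 and 1 without its leading zero (e.g. .021)."""
    return f"{value:.{decimals}f}".lstrip("0")

def format_anova(f_stat, df_between, df_within, p, eta_sq):
    """Build a one-line ANOVA report string."""
    p_text = "p < .001" if p < 0.001 else f"p = {format_stat(p, 3)}"
    return (f"F({df_between}, {df_within}) = {f_stat:.2f}, "
            f"{p_text}, η² = {format_stat(eta_sq, 2)}")

# Hypothetical results
print(format_anova(4.5, 2, 27, 0.0206, 0.25))
# F(2, 27) = 4.50, p = .021, η² = .25
print(format_anova(12.3, 2, 34, 0.0001, 0.42))
# F(2, 34) = 12.30, p < .001, η² = .42
```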

What are the limitations of an ANOVA?

An ANOVA is a powerful statistical tool, but it comes with certain limitations too:

  • It only tells you that a difference exists, not where it is. If you compare three groups and get a significant result, ANOVA just says “something is different somewhere” but it won’t tell you which groups differ from each other. You need post-hoc tests (like Tukey’s test) to figure that out.
  • It can’t handle data that isn’t roughly bell-shaped. ANOVA assumes your data follows a normal distribution. If your data is heavily skewed (e.g., most people scoring very low with a few outliers scoring extremely high), the results may not be trustworthy.
  • It assumes all groups have a similar spread of scores. If one group’s scores are all over the place while another group’s are tightly clustered, your ANOVA can produce unreliable results.
  • Outliers can seriously mess up your results. A single extreme value (e.g., one patient reporting 10x the pain of everyone else) can distort the entire analysis, making a real effect harder or easier to detect.
  • It only works with one dependent variable at a time. If you want to measure both pain levels and mobility scores in arthritis patients at the same time, a regular ANOVA can’t do that. You’d need a more advanced test called a MANOVA.
  • It tells you about averages, not individuals. ANOVA compares group means, so it can miss important patterns in how individuals respond. For example, a drug might substantially reduce VLDL cholesterol in the men in your sample but only slightly in the women; if the overall average still looks decent, ANOVA might show a “significant” effect that hides this difference.
  • It requires a reasonably large sample size. With very few participants, ANOVA lacks the statistical power to detect real differences. In other words, you might miss a genuine effect simply because your group sizes were too small.
  • It doesn’t tell you how meaningful the difference is. A statistically significant result doesn’t automatically mean the difference matters in real life. That’s why you always need to report an effect size alongside your ANOVA result.

 

What are alternatives for an ANOVA if my data is non-parametric?

If your data doesn’t meet ANOVA’s assumptions (like normality), use these instead:

Non-Parametric Test | Replaces | Use When
Kruskal-Wallis Test | One-way ANOVA | Comparing 3+ independent groups
Friedman Test | Repeated Measures ANOVA | Same participants measured multiple times
Mann-Whitney U Test | Independent samples t-test | Comparing only 2 groups

 

All three tests rank the data instead of using raw scores, making them more robust when your data is skewed or has outliers.
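
To see why ranking makes these tests robust, the sketch below computes the Kruskal-Wallis H statistic by hand on made-up data containing one extreme outlier. It uses the standard formula H = 12 ÷ (N(N + 1)) × Σ(rank sum² ÷ group size) − 3(N + 1) without the correction for ties, so treat it as an illustration rather than a substitute for statistical software. The function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions in the run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, offset = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        total += rank_sum ** 2 / len(g)
        offset += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical data: the last score is an extreme outlier
with_outlier = [[1, 2, 3], [4, 5, 6], [7, 8, 500]]
without_outlier = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

h_with = kruskal_wallis_h(with_outlier)
h_without = kruskal_wallis_h(without_outlier)
print(f"H with outlier:    {h_with:.2f}")     # 7.20
print(f"H without outlier: {h_without:.2f}")  # 7.20
```

The score of 500 simply receives the highest rank, so it affects H no more than a score of 9 would.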

 

What’s the difference between an ANOVA and a t-test?

The main difference between an ANOVA and a t-test is that an ANOVA is used for 3 or more groups whereas a t-test is used for exactly 2 groups.

What’s the difference between ANOVA and MANOVA?

An ANOVA measures the effect of your groups on one outcome variable whereas a MANOVA (multivariate analysis of variance) measures the effect of your groups on multiple outcome variables at the same time. Take a look at the table below:

Aspect | ANOVA | MANOVA
How many outcome variables? | One at a time | Multiple simultaneously
Example | Does a new drug reduce joint pain scores? | Does a new drug reduce joint pain scores, improve mobility, and lower inflammation markers (all in the same study)?
Why use it? | You are focusing on one outcome | You want to study several outcomes together without increasing the risk of false positives
Complexity | Simpler to run and interpret | More complex, needs larger sample sizes

 

 

What’s the difference between ANOVA and ANCOVA?

An ANOVA compares group differences in an outcome variable, whereas an ANCOVA (analysis of covariance) does the same thing, but statistically controls for an extra variable that might be influencing your results. That extra variable is called a covariate. A covariate is something you’re not directly interested in, but you know it could be affecting your outcome and want to account for it.

Imagine you’re comparing three groups of atopic dermatitis patients receiving different creams. But one group happens to have much milder disease to begin with. So their skin naturally looks better regardless of which cream they use. That difference in baseline severity could make one cream look more effective than it really is.

ANCOVA lets you level the playing field by statistically removing the effect of baseline severity, so that you can see if the cream alone is making a difference.

 

This article was originally published on July 9, 2025, and updated on May 12, 2026.

 
