Using a Within-Subjects Design in Research: Steps, Tips, Examples

Introduction

In an experiment, a different treatment or manipulation of the independent variable is applied in each condition to test for a cause-and-effect relationship with a dependent variable. In a within-subjects design (also called a within-groups, repeated measures, or dependent-groups design), every participant takes part in every condition. Researchers compare related measurements taken from the same people across conditions or across time.

This design contrasts with a between-subjects design, in which each participant experiences only one condition. The word within signals that conditions are compared within the same group or individual, whereas between signals comparison between separate groups. Because the same people appear in every condition, each participant effectively serves as their own control. All longitudinal studies, which track the same individuals over time, use a within-subjects design.

This guide defines the design, illustrates it with examples from the social sciences and biomedical sciences, explains counterbalancing, and weighs the design's considerable statistical advantages against its threats to internal validity.

Glossary of Key Terms

Term | Definition
Within-subjects design | An experimental design in which all participants are exposed to every condition, and related measures from the same people are compared across conditions.
Repeated measures | Multiple measurements of the dependent variable taken from the same participant, across conditions or time points.
Independent variable (IV) | The variable that is manipulated (e.g., message style, drug dose) or that defines the comparison (e.g., time).
Dependent variable (DV) | The outcome that is measured (e.g., willingness to donate, blood pressure, test scores).
Own control (baseline) | Each participant’s scores in one condition (often a pretest) act as the comparison point for their scores in other conditions; no separate control group is needed.
Carryover effects | Effects of an earlier condition that spill over and alter responses in a later condition.
Practice (learning) effect | Improvement in later conditions caused by familiarity gained in earlier conditions.
Order effect | A change in outcomes caused by a condition’s position in the sequence (e.g., poorer attention in the final condition due to fatigue or boredom).
Sequence effect | An interaction between conditions based on their order (e.g., rating later items by comparing them with earlier ones).
Counterbalancing | Presenting conditions in a limited set of fixed sequences distributed evenly across participants to balance out order effects.
Randomization (of order) | Presenting conditions in randomly generated sequences that vary across participants.
Washout period | In crossover trials, a treatment-free interval between conditions that lets the first treatment’s effects dissipate.
Attrition | Loss of participants over the course of a study, which can bias longitudinal results.
Statistical power | The probability of detecting a true effect; within-subjects designs gain power by removing between-person variability.
Longitudinal study | A study that repeatedly measures the same individuals over time, treating time as an independent variable.

What Is a Within-Subjects Design?

In a within-subjects design, the entire sample is exposed to all treatments or conditions, and the analysis compares each person’s responses across those conditions. The defining features are:

  • Every participant, every condition. No one is left out of any level of the independent variable.
  • Related (dependent) measurements. The scores being compared come from the same individuals, so they are statistically paired.
  • No separate control group. Participants are compared with themselves; a pretest functions like a control condition taken before any treatment, and a posttest is taken afterward.
  • Two typical goals. Measuring change produced by different treatments (e.g., five message styles) or change over time (e.g., attitudes tracked across a year), for outcomes such as attitudes, learning, or performance.

How a Within-Subjects Design Works

Comparing different treatments

When the conditions are different treatments, every participant completes all of them, usually in a single session or a short series of sessions, and the order of conditions is varied across participants. Filler tasks or unrelated questions are often added so participants cannot guess the aim of the study.

Comparing measurements over time

In longitudinal research, time itself is the independent variable. Because researchers cannot manipulate or prevent the passage of time, longitudinal studies usually examine correlations between time and the outcome rather than strict causal effects. For example, a social researcher might survey the same panel of respondents every two to three months, asking them to rate their fear of infection during a pandemic on a 7-point Likert scale, and then track how those ratings shift over the months.
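
As a rough illustration of tracking shifts within individuals, the sketch below computes each respondent's change from one survey wave to the next and then averages those changes across the panel. The respondent IDs and ratings are invented for illustration, and the sketch assumes every respondent answered every wave.

```python
from statistics import mean

# Invented 7-point fear-of-infection ratings: one list per respondent,
# one entry per survey wave (assumes no missing waves).
ratings = {
    "R1": [6, 5, 4, 3],
    "R2": [7, 6, 6, 4],
    "R3": [4, 4, 3, 3],
}

def wave_changes(series):
    """Return the change from each wave to the next for one respondent."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Within-person changes: each respondent is compared only with themselves.
changes = {rid: wave_changes(series) for rid, series in ratings.items()}

# Average change across respondents for each interval between waves.
n_waves = len(next(iter(ratings.values())))
mean_change_per_interval = [
    mean(changes[rid][i] for rid in ratings) for i in range(n_waves - 1)
]

print(changes)
print(mean_change_per_interval)
```

Because the changes are computed person by person, stable differences between respondents (some people are simply more fearful than others) do not enter the comparison.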

Counterbalancing vs. randomizing condition order

Whenever multiple treatments are compared within subjects, the order of conditions should be randomized or counterbalanced so that order and carryover effects are spread evenly across treatments instead of systematically favoring or penalizing whichever treatments come later.

Approach | How it works | Key feature
Counterbalancing | The researcher selects a limited number of fixed sequences (e.g., A-B-C-D-E, B-E-A-C-D, D-A-B-E-C) and assigns an equal share of participants to each sequence. | Controlled: ideally each treatment appears equally often in each serial position, balancing order effects across the sample.
Randomization | A computer generates the order of conditions afresh for each participant, so any possible sequence may occur. | Uncontrolled frequencies: the researcher cannot guarantee how often each sequence appears across the group.

Counterbalancing is often more convenient because the researcher knows exactly which sequences occur and how often, whereas full randomization leaves sequence frequencies to chance.
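
The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names and participant IDs are hypothetical, and the counterbalanced sequences come from a simple cyclic Latin square, which is only one of several ways to build a balanced set of orders.

```python
import random
from collections import Counter

def latin_square_orders(conditions):
    """Return k cyclic orders in which every condition fills every position once."""
    k = len(conditions)
    return [[conditions[(start + pos) % k] for pos in range(k)] for start in range(k)]

def counterbalanced_assignment(participant_ids, conditions):
    """Cycle through the fixed orders so each order is used equally often."""
    orders = latin_square_orders(conditions)
    return {pid: orders[i % len(orders)] for i, pid in enumerate(participant_ids)}

def randomized_assignment(participant_ids, conditions, seed=None):
    """Draw a fresh random order for every participant."""
    rng = random.Random(seed)
    return {pid: rng.sample(conditions, k=len(conditions)) for pid in participant_ids}

conditions = ["A", "B", "C", "D", "E"]
participants = [f"P{n:02d}" for n in range(1, 11)]  # 10 hypothetical participants

balanced = counterbalanced_assignment(participants, conditions)
shuffled = randomized_assignment(participants, conditions, seed=1)

# Count how often each condition lands in the first position under each scheme.
first_balanced = Counter(order[0] for order in balanced.values())
first_shuffled = Counter(order[0] for order in shuffled.values())
print("counterbalanced first positions:", dict(first_balanced))
print("randomized first positions:", dict(first_shuffled))
```

With ten participants and five orders, the counterbalanced scheme puts each condition first exactly twice, whereas the randomized counts depend on the draw. One caveat: a cyclic square balances serial position but not which condition precedes which (B always follows A here), so designs that also balance immediate predecessors may be preferable when carryover is a concern.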

Examples of Within-Subjects Designs

Example from the social sciences: message styles and generosity

Suppose you are studying how different messaging styles (independent variable) affect generosity (dependent variable). Every participant reads five short stories about climate change, each written in a different tone and style. After each story, participants report how they feel and how willing they are to donate to a related cause; unrelated filler questions disguise the purpose of the study. You then compare willingness to donate across the five conditions within each participant, after counterbalancing or randomizing the order of the stories.

Example from the social sciences: tracking attitudes over time

A panel study recruits a large sample early in a public-health crisis and re-surveys the same people every few months. Because each respondent provides repeated ratings, changes in perception can be assessed within individuals over time, which is something no single-snapshot between-subjects survey could reveal.

Example from the biomedical sciences: a crossover trial

The classic biomedical application of a within-subjects design is the crossover trial. To compare two inhalers for asthma:

  • Each patient uses Inhaler A for four weeks, completes a washout period during which the first drug’s effects wear off, and then uses Inhaler B for four weeks (or in the reverse order).
  • Patients are randomized to the A-then-B or B-then-A sequence, which counterbalances order.
  • Lung function is measured at the end of each treatment period and compared within each patient.

Because each patient’s physiology, age, disease severity, and lifestyle stay constant across both treatments, the comparison is far less noisy than a parallel-group trial, and far fewer patients are needed.

Note: Crossover trials are only appropriate for stable, chronic conditions and reversible treatments; they are unsuitable when a treatment cures the condition or when carryover cannot be eliminated by a washout.
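
A common way to analyze a two-treatment, two-period crossover is to take each patient's period 1 minus period 2 difference and then compare the two sequence groups. The sketch below shows that arithmetic with invented lung-function values; it assumes the washout has removed any carryover, and the variable and function names are hypothetical.

```python
from statistics import mean

# Invented lung-function scores as (period 1, period 2) pairs, one pair per patient.
ab_group = [(3.1, 2.8), (2.9, 2.7), (3.4, 3.0)]  # Inhaler A first, then Inhaler B
ba_group = [(2.6, 3.0), (2.8, 3.1), (2.5, 3.0)]  # Inhaler B first, then Inhaler A

def period_differences(group):
    """Within-patient difference: period 1 score minus period 2 score."""
    return [p1 - p2 for p1, p2 in group]

def crossover_treatment_effect(ab, ba):
    """Estimate A minus B; a constant period effect cancels between sequences.

    Assumes no carryover from the first treatment into the second period.
    """
    return (mean(period_differences(ab)) - mean(period_differences(ba))) / 2

def crossover_period_effect(ab, ba):
    """Estimate period 1 minus period 2, averaged over both sequences."""
    return (mean(period_differences(ab)) + mean(period_differences(ba))) / 2

print("treatment effect (A - B):", round(crossover_treatment_effect(ab_group, ba_group), 3))
print("period effect (1 - 2):", round(crossover_period_effect(ab_group, ba_group), 3))
```

Randomizing patients to the two sequences is what allows the period effect to cancel: any general drift between the first and second period shows up equally in both groups and drops out of the treatment estimate.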

Within-Subjects vs. Between-Subjects Design

In a between-subjects design, each participant experiences only one condition, and separate groups, typically one or more experimental groups plus a control group, are compared with each other. The table below contrasts the two designs.

Feature | Within-subjects design | Between-subjects design
Conditions per participant | All | One
Other names | Repeated measures, dependent groups | Independent measures, between-groups
Control | Participants serve as their own control (baseline/pretest) | Separate control group (no treatment, usual care, or placebo)
Sample size needed | Smaller; each person yields a data point for every condition | Often double or more for the same statistical power
Time per participant | Longer (multiple conditions or repeated visits) | Shorter (a single session)
Individual differences | Controlled; the same people appear in every condition | A threat; groups may differ in ability, motivation, or health
Main validity threats | Carryover, practice, order, fatigue, and time-related effects | Group non-equivalence and selection effects
Typical analyses | Paired t test, repeated measures ANOVA | Independent-samples t test, one-way ANOVA
Biomedical archetype | Crossover trial | Parallel-group randomized controlled trial

Worked contrast:

To test whether learning environment (on campus vs. online) affects test scores, a between-subjects design would randomize students to take a course either on campus or online and compare the groups’ test scores. A within-subjects design would have every student take half the course on campus and half online, with the order randomized across students, and would compare each student’s scores under the two formats.
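
To see why the two designs call for different analyses, the sketch below computes a paired t statistic and a pooled-variance independent-samples t statistic on the same invented test scores. The numbers are made up, and treating them as two separate groups is purely for contrast: in a real between-subjects study the two lists would come from different students.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Invented test scores; position i in both lists belongs to the same student.
on_campus = [72, 85, 64, 90, 78, 69]
online = [70, 82, 63, 86, 75, 68]

def paired_t(x, y):
    """t statistic for paired data: mean difference over its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def independent_t(x, y):
    """Pooled-variance t statistic, as if x and y came from separate groups."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))

t_within = paired_t(on_campus, online)
t_between = independent_t(on_campus, online)
print(f"paired t = {t_within:.2f}, independent t = {t_between:.2f}")
```

Both statistics share the same numerator (the mean difference between formats). The paired version divides by the variability of the within-student differences, which is small here because strong students score high in both formats, while the independent version divides by the much larger spread between students.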

Benefits of a Within-Subjects Design

Within-subjects designs can detect causal or correlational relationships with relatively small samples, for three connected reasons:

  • Smaller samples and lower cost. Each participant supplies repeated measures, so far fewer people are needed. Recruitment is easier and the study is more cost-effective per data point.
  • Individual differences are removed from the comparison. In a between-subjects study, characteristics such as intelligence, memory capacity, or disease severity vary between the groups and can masquerade as treatment effects. In a within-subjects study, the same individuals appear in every condition, so those characteristics are held constant.
  • Greater statistical power. Removing between-person variation shrinks the error term. A between-subjects design typically needs about twice as many participants (or more) to achieve the power of an equivalent within-subjects design. Simulation research on mediation analysis likewise finds that within-subjects designs need roughly half the sample to detect indirect effects of the same size, though the exact advantage depends on effect sizes and the correlation between repeated measures.
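
The "about twice as many, or more" figure can be made concrete under simplifying assumptions: two conditions, equal variance in each, a correlation rho between a person's two scores, and a large-sample approximation. In that case the mean paired difference has variance 2σ²(1 − rho)/n with n participants, while the difference between two independent group means has variance 2σ²/m with m participants per group (2m in total). Setting these equal gives a total-sample ratio of 2 / (1 − rho). The function name below is hypothetical.

```python
def between_to_within_ratio(rho):
    """Total between-subjects N divided by within-subjects N for equal precision.

    Assumes two conditions, equal variances, and correlation rho between the
    repeated measures; ignores degrees-of-freedom differences in small samples.
    """
    if not -1 <= rho < 1:
        raise ValueError("rho must satisfy -1 <= rho < 1")
    return 2 / (1 - rho)

# Show how the advantage grows as repeated measures become more correlated.
for rho in (0.0, 0.3, 0.5, 0.8):
    print(f"rho = {rho:.1f}: between-subjects needs about "
          f"{between_to_within_ratio(rho):.1f}x the participants")
```

With uncorrelated repeated measures the ratio is exactly 2, which reflects only the fact that each person supplies two data points. Any positive correlation pushes the ratio higher, which is why the exact advantage depends on the correlation between repeated measures.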

Problems with a Within-Subjects Design

The biggest drawbacks are threats to internal validity. Repeated testing, sometimes over long periods, opens the door to alternative explanations of the results.

Time-related effects

Several threats apply specifically to designs that test the same people across time:

  • History: an unrelated external event (e.g., a lockdown, a policy change, a news story) occurs between measurements and influences the outcome.
  • Maturation: natural physical or psychological change, such as growth, aging, healing, or developing skills, produces the observed differences rather than the treatment.
  • Subject attrition: participants drop out at each successive wave, leaving a biased final sample in which only the most motivated (or healthiest) individuals remain.

Carryover effects

Carryover effects are a broad family of threats that arise when an earlier condition alters responses in a later one:

  • Practice (learning) effects: familiarity with the task gained in early conditions improves performance in later conditions.
  • Order effects: the serial position of a condition changes outcomes, for instance, participants grow bored or fatigued and pay less attention in the final condition.
  • Sequence effects: conditions interact based on their order, as when raters judge later advertisements by comparing them with the ones they saw earlier.
  • Participant fatigue: a special case worth singling out. People who complete several treatments in a row can become tired, bored, or unmotivated, which degrades data quality in the later conditions.

How to minimize these threats

  • Randomize or counterbalance the order of conditions across participants.
  • Insert washout periods (in drug studies) or breaks and filler tasks between conditions.
  • Use parallel forms of tests so that the identical items are never repeated.
  • Keep sessions short and offer incentives to limit fatigue and attrition.
  • Track and report dropout, and compare completers with non-completers to gauge attrition bias.
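
The last point, tracking dropout and comparing completers with non-completers, might look like the following sketch. The panel records, field names, and number of waves are all invented for illustration; a real analysis would compare the groups on several baseline characteristics and test the differences formally.

```python
from statistics import mean

# Invented panel: baseline score and the last wave each person completed.
TOTAL_WAVES = 3
panel = [
    {"id": "P1", "baseline": 5.0, "last_wave": 3},
    {"id": "P2", "baseline": 6.0, "last_wave": 3},
    {"id": "P3", "baseline": 3.0, "last_wave": 1},
    {"id": "P4", "baseline": 4.0, "last_wave": 2},
    {"id": "P5", "baseline": 6.0, "last_wave": 3},
    {"id": "P6", "baseline": 2.0, "last_wave": 1},
]

def retention_by_wave(rows, total_waves):
    """Share of the original sample still participating at each wave."""
    n = len(rows)
    return {
        wave: sum(r["last_wave"] >= wave for r in rows) / n
        for wave in range(1, total_waves + 1)
    }

def baseline_gap(rows, total_waves):
    """Mean baseline of completers minus mean baseline of dropouts."""
    completers = [r["baseline"] for r in rows if r["last_wave"] == total_waves]
    dropouts = [r["baseline"] for r in rows if r["last_wave"] < total_waves]
    return mean(completers) - mean(dropouts)

retention = retention_by_wave(panel, TOTAL_WAVES)
gap = baseline_gap(panel, TOTAL_WAVES)
print("retention:", retention)
print("baseline gap (completers - dropouts):", round(gap, 2))
```

A large baseline gap suggests that the people who stayed differ systematically from those who left, so changes observed in the final sample may not generalize to the sample that started the study.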

Combining Within- and Between-Subjects Designs

With two or more independent variables, the two approaches can be combined in a factorial design, in which every level of each independent variable is crossed with every level of the others. In a mixed factorial design, one variable is manipulated between subjects and another within subjects.

Example:

An education researcher investigates whether teaching method affects second-language learning in 8th-grade students. Students are randomly assigned to standard or experimental teaching methods (the between-subjects factor), and every student is tested before, midway through, and after the course (the within-subjects factor). Scores are then analyzed for differences across time, between groups, and for their interaction. Longitudinal studies of this kind can be genuinely experimental when the researcher directly manipulates one independent variable and controls assignment to its conditions.
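
The data layout for such a mixed design can be sketched as one row per student per time point. The example below uses invented scores and hypothetical labels, and it reports descriptive cell means only; a real analysis would use a mixed ANOVA or a mixed-effects model to test the effects.

```python
from collections import defaultdict
from statistics import mean

# Invented long-format data: (student, teaching method, time point, score).
rows = [
    ("S1", "standard", "pre", 50), ("S1", "standard", "mid", 55), ("S1", "standard", "post", 58),
    ("S2", "standard", "pre", 60), ("S2", "standard", "mid", 63), ("S2", "standard", "post", 66),
    ("S3", "experimental", "pre", 52), ("S3", "experimental", "mid", 62), ("S3", "experimental", "post", 70),
    ("S4", "experimental", "pre", 58), ("S4", "experimental", "mid", 66), ("S4", "experimental", "post", 74),
]

def cell_means(data):
    """Average score for every method-by-time cell of the design."""
    cells = defaultdict(list)
    for _student, method, time, score in data:
        cells[(method, time)].append(score)
    return {cell: mean(scores) for cell, scores in cells.items()}

means = cell_means(rows)

# Within-subjects effect: pre-to-post gain inside each group.
gain = {m: means[(m, "post")] - means[(m, "pre")] for m in ("standard", "experimental")}

# Interaction contrast: does the gain differ between teaching methods?
interaction = gain["experimental"] - gain["standard"]

print("gains:", gain)
print("difference in gains:", interaction)
```

Each student appears under only one teaching method (between subjects) but at all three time points (within subjects), which is what makes the design mixed.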

When choosing between the two designs, methodologists advise weighing three factors together: validity (within-subjects designs face carryover, participant awareness, and measurement artifacts), causality (between-subjects designs require no-confounder assumptions, while within-subjects designs require the assumption of no carryover), and statistical power (which favors within-subjects designs). Higher power alone is not a sufficient reason to choose a within-subjects design.

Frequently Asked Questions

What is the difference between a within-subjects design and a between-subjects design?

In a within-subjects design, each participant experiences all conditions, and the same people are tested repeatedly so conditions can be compared within individuals. In a between-subjects design, every participant experiences only one condition, and outcomes are compared between separate groups. “Within” means comparing conditions within the same group; “between” means comparing conditions between groups.

Why do within-subjects designs need fewer participants than between-subjects designs?

Because every participant contributes a data point to every condition, and because comparing people with themselves removes between-person variability from the error term. The result is higher statistical power per participant: a between-subjects design typically needs roughly twice as many people to detect the same effect, and simulations of mediation analysis show a similar two-to-one sample-size advantage for within-subjects designs.

What is a 2×2 within-subjects design?

A 2×2 within-subjects design is a factorial design with two independent variables, each having two levels, in which every participant experiences all four combinations of conditions. It lets researchers estimate the effect of each variable and their interaction on a single dependent variable using one sample of participants.
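
As a small sketch of that structure, the code below lists the four cells and computes each participant's main-effect and interaction contrasts. The factor labels (a1/a2, b1/b2), participant IDs, and scores are placeholders invented for illustration.

```python
from itertools import product
from statistics import mean

levels_a = ("a1", "a2")  # two levels of the first independent variable
levels_b = ("b1", "b2")  # two levels of the second independent variable
cells = list(product(levels_a, levels_b))  # the four combinations

# Invented scores: every participant has one score in each of the four cells.
scores = {
    "P1": dict(zip(cells, (10, 12, 14, 20))),
    "P2": dict(zip(cells, (8, 9, 13, 16))),
    "P3": dict(zip(cells, (11, 14, 15, 21))),
}

def contrasts(s):
    """Return (main effect of A, main effect of B, interaction) for one person."""
    main_a = mean([s["a2", "b1"], s["a2", "b2"]]) - mean([s["a1", "b1"], s["a1", "b2"]])
    main_b = mean([s["a1", "b2"], s["a2", "b2"]]) - mean([s["a1", "b1"], s["a2", "b1"]])
    interaction = (s["a2", "b2"] - s["a2", "b1"]) - (s["a1", "b2"] - s["a1", "b1"])
    return main_a, main_b, interaction

per_person = {pid: contrasts(s) for pid, s in scores.items()}
mean_interaction = mean(c[2] for c in per_person.values())

print(cells)
print(per_person)
print("mean interaction contrast:", mean_interaction)
```

Because every contrast is computed inside a single participant, all three effects are estimated from one sample, as the answer above describes.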

How do you control order effects in a within-subjects design?

The two main tools are counterbalancing, assigning equal portions of the sample to a limited set of fixed condition sequences so each treatment appears equally often in each position, and randomization, generating a fresh random order for each participant. Washout periods, breaks, filler tasks, and parallel test forms further reduce carryover, practice, and fatigue effects.

When should a within-subjects design not be used?

Avoid a within-subjects design when practice or carryover effects cannot be eliminated, when you need to observe treatment effects under minimum practice (repeated testing inevitably gives later conditions extra practice), when a treatment is irreversible or curative (e.g., surgery), when contrasting conditions would reveal the hypothesis to participants, or when long repeated testing would cause heavy fatigue or attrition. In those situations, a between-subjects design protects internal validity better despite needing a larger sample.
