2026.06.15
2026.07.26

What Is an Umbrella Review? Definition, Methods, Tools, Examples

Glossary of Key Terms

Before diving in, here are the key terms you will encounter throughout this guide, defined concisely for quick reference.

UMBRELLA REVIEW (UR) A systematic review whose unit of analysis is other systematic reviews or meta-analyses. Also called ‘review of reviews’ or ‘overview of reviews.’
SYSTEMATIC REVIEW (SR) A comprehensive, reproducible synthesis of all available primary research on a defined question, using explicit pre-specified methods.
META-ANALYSIS (MA) A statistical technique used within a systematic review to pool results from multiple studies into a single summary effect estimate.
SRMA Systematic Review and/or Meta-Analysis; the primary unit of inclusion in an umbrella review.
AMSTAR-2 A Measurement Tool to Assess Systematic Reviews (version 2). A 16-item validated checklist used to critically appraise the methodological quality of included SRMAs.
GRADE Grading of Recommendations, Assessment, Development and Evaluations. A system for rating the certainty of evidence and the strength of clinical recommendations.
ROBIS Risk Of Bias In Systematic reviews. A tool for assessing the risk of bias in systematic reviews, complementary to AMSTAR-2.
JBI Joanna Briggs Institute. Publishes the primary methodological handbook for umbrella reviews (Chapter 10 of the JBI Manual for Evidence Synthesis).
PICO(S) Population, Intervention, Comparison, Outcome (Study design). The framework used to structure research questions in intervention-based reviews.
PRISMA Preferred Reporting Items for Systematic Reviews and Meta-Analyses. The dominant reporting guideline; for overviews of reviews, the separate PRIOR statement (Preferred Reporting Items for Overviews of Reviews) applies.
HETEROGENEITY (I²) The percentage of variability in effect estimates across the studies in a meta-analysis that is due to real differences between studies rather than chance. High I² (>75%) signals that results differ substantially beyond chance.
CORRECTED COVERED AREA (CCA) A metric used to quantify the degree of overlap of primary studies across multiple meta-analyses included in an umbrella review.
PROSPERO International Prospective Register of Systematic Reviews. The standard registry for pre-registering review protocols to increase transparency.
PREDICTION INTERVAL (PI) A range capturing where the true effect size would be expected in a new study. Wider PIs indicate higher uncertainty about the generalizability of an effect.
SMALL STUDY EFFECTS The tendency for smaller studies to report larger effect sizes, potentially indicating publication bias. Assessed via Egger’s test and funnel plot asymmetry.
EXCESS SIGNIFICANCE BIAS When the number of statistically significant results in a meta-analysis exceeds the number expected given the statistical power of the included studies; a marker of potential selective reporting.

What Is an Umbrella Review?

An umbrella review is a systematic review of previously published systematic reviews and/or meta-analyses. Where a typical systematic review synthesizes primary studies (randomized controlled trials, cohort studies, etc.), an umbrella review sits one level higher: its building blocks are themselves the products of evidence synthesis.

The name is apt: an umbrella review holds multiple systematic reviews under a single canopy, offering a panoramic view of an entire research landscape rather than a narrow cross-section of it.

CORE DEFINITION

An umbrella review is a systematic review whose unit of analysis is other systematic reviews or meta-analyses, aggregating findings from several reviews that address specific questions under a shared topic. Each umbrella review focuses on a broad condition or problem for which there are two or more potential interventions, exposures, or outcomes of interest.

It is also known by several other names in the literature:

Overview of reviews (the preferred Cochrane terminology)
Review of reviews
Meta-review
Summaries of systematic reviews
Syntheses of reviews

These terms are used interchangeably across different institutions and journals, though methodological purists sometimes make fine distinctions between them.

Why Umbrella Reviews Exist

The exponential growth of biomedical publishing has created a paradox: researchers have access to more evidence than ever before, yet synthesizing all of it is increasingly difficult. As the number of systematic reviews has grown year on year, a further level of synthesis has become necessary.

Thousands of systematic reviews are published annually across medicine, psychology, and public health
Multiple competing SRMAs often exist on the same narrow question, reaching different conclusions
Clinicians, guideline developers, and policymakers need a single, authoritative synthesis rather than a reading list of competing reviews
Health technology assessments that evaluate all management options for a condition benefit enormously from a single umbrella document
A field may have been split into focused populations or interventions across many reviews; an umbrella review restores coherence by bringing them together

THE LUMPING VS. SPLITTING PROBLEM

Umbrella reviews solve a structural challenge: research on a broad topic tends to be systematically reviewed in narrow slices (by subgroup, intervention variant, or outcome). The umbrella review restores the comprehensive picture by synthesizing those slices without having to re-examine thousands of primary studies.

When to Use an Umbrella Review

Not every research question warrants an umbrella review. The method is most appropriate under specific conditions.

Ideal Conditions for an Umbrella Review

Two or more high-quality systematic reviews already exist on the topic
The field has been well-covered by SRMAs but lacks an overarching synthesis
Multiple competing interventions or exposures exist for a single condition
Guideline developers need a broad, defensible evidence base quickly
The question is broad enough that a single SR would be unwieldy or uninformative
Decision-makers need a rapid orientation to the state of evidence across a complex topic
There is confusion or contradiction between existing SRMAs that a higher-level synthesis could clarify

When NOT to Use an Umbrella Review

Existing systematic reviews are absent, sparse, or of very low quality
The research question is narrow enough that a single SR would suffice
The topic area is too new for secondary evidence to exist
Subgroup analyses of individual patient data are needed (an umbrella review cannot do this)
The goal is to re-analyse primary data; umbrella reviews work only at the review level

Umbrella Reviews vs. Other Review Types

Review Type	Unit of Analysis	Scope	Question Type	Typical Output
Umbrella Review	Systematic reviews & meta-analyses	Very broad	Multiple interventions/exposures/outcomes	Panoramic evidence map; evidence grading
Systematic Review	Primary studies (RCTs, cohorts, etc.)	Narrow–moderate	Specific PICO question	Pooled or narrative synthesis
Meta-Analysis	Quantitative data from primary studies	Narrow	One or few outcomes	Pooled effect estimate (OR, RR, MD)
Scoping Review	Primary studies (any design)	Broad	Landscape mapping; concept clarification	Map of evidence; research gaps
Narrative Review	Primary studies (selective)	Variable	Background, educational	Expert-guided summary; not reproducible
Rapid Review	Primary studies or SRs	Narrow–moderate	Policy-urgent questions	Abbreviated synthesis with speed tradeoffs
Realist Review	Primary studies	Variable	What works for whom in what context?	Mechanistic explanations; theory-building

Umbrella Review vs. Systematic Review: Key Differences

Dimension	Systematic Review	Umbrella Review
Building block	Primary study	Systematic review or meta-analysis
Eligibility criteria	Focused and specific	Broader, but still pre-specified
Search terms	Disease/intervention keywords	Disease/topic + systematic review OR meta-analysis
Quality appraisal tool	Cochrane RoB, ROBINS-I, NOS, etc.	AMSTAR-2, ROBIS, JBI checklist
Statistical re-analysis	Meta-analysis of primary data	Re-run existing meta-analyses using standardized methods
Time to complete	12–24 months (typical)	6–10 weeks (professional teams)
Overlap concern	Duplicate data in pooled analyses	Same primary studies across multiple SRs (use CCA)
Evidence hierarchy position	High	Highest currently available

The Evidence Hierarchy

Umbrella reviews sit at the apex of the evidence pyramid. Understanding where they stand relative to other study designs clarifies their unique value.

Evidence pyramid, from highest to lowest level:
Umbrella Reviews
Systematic Reviews & Meta-Analyses
Randomized Controlled Trials (RCTs)
Cohort & Prospective Studies
Case-Control & Cross-Sectional Studies
Case Reports & Case Series
Expert Opinion & Editorials

Umbrella reviews represent one of the highest levels of evidence synthesis currently available. They do not directly include primary studies; instead, they synthesize the work of dozens or hundreds of individual systematic reviews. This means that their conclusions rest on a very broad base of underlying research.

Registering an Umbrella Review Prospectively

Step 1: Develop a Protocol Before Starting Review Work

A protocol should be finalized before study selection/screening begins (and at minimum, before data extraction starts). It typically includes:

Research question(s) in PICO/PECO format
Eligibility criteria for systematic reviews to be included
Search strategy and databases
Quality appraisal tool(s) (e.g., AMSTAR-2, ROBIS, JBI)
Data extraction and synthesis plan
Handling of overlapping primary studies across reviews

Step 2: Choose a Registry

Registry	Cost	Notes
PROSPERO (CRD, University of York)	Free	Eligible review types include systematic, rapid, umbrella, diagnostic accuracy, prognostic, and methodological reviews; scoping reviews are not eligible
OSF (Open Science Framework)	Free	Common alternative, especially for non-PROSPERO-eligible designs
INPLASY	Paid	Faster turnaround, broader eligibility
Research Registry	Paid	Alternative option
JBI Systematic Review Register	Free (JBI members)	Title registry for JBI-affiliated reviews only

PROSPERO is the most widely recognized and commonly used registry for umbrella reviews in health research.

Step 3: Timing of Registration

Registration should take place once the review protocol has been finalised, but ideally before screening studies for inclusion begins.
Reviews are accepted for registration as long as they have not progressed beyond completion of data extraction.
Completed reviews are not accepted. Registration must occur before the review is finished.
Registering before data extraction begins is the safer target; earlier registration is more credible for accountability purposes.

Step 4: Register on PROSPERO

Go to the PROSPERO website (CRD, University of York)
Create an account
Complete the online registration form, including:
- Review title and team details
- Anticipated start/completion dates
- Condition/domain being studied
- Comparator/exposure and outcomes
- Search strategy summary
- Quality appraisal tool(s) to be used
Submit for review by PROSPERO administrators
Receive a unique CRD registration number (format: CRD4YYYYxxxxxx)

Step 5: Cite the Registration

A unique CRD identifier issued at approval is used by journals, peer reviewers, and funders to verify the published review matches the registered plan.
Include the registration number in:
- The protocol publication (if published separately)
- The final manuscript’s abstract and methods section

Step 6: Handling Amendments

Amendments are dated and version-tracked.
Deviations from the registered protocol are described in the manuscript’s methods section.

Example Registration Numbers (Real Umbrella Reviews)

Umbrella Review Topic	PROSPERO ID
Psychological interventions post-stroke/TIA	CRD42022375947
Ethnic diversity in RCTs	CRD42022325241
SNPs and lung cancer risk	CRD42020204685
PPI with children/families	CRD42024608935
Postpartum depression risk factors	CRD420251249033

Quick Checklist

Protocol drafted (PICO, eligibility, search, appraisal, synthesis plan)
Registration completed before screening/data extraction
Quality appraisal tool specified in advance (AMSTAR-2, ROBIS, JBI)
PROSPERO/OSF number obtained
Registration number cited in final publication
Any protocol deviations documented and explained

This is a general informational overview based on current registry guidance; if precise current PROSPERO eligibility criteria are critical to your submission, it’s worth double-checking directly on the PROSPERO website, as policies can be updated periodically.

How to Conduct an Umbrella Review: Step-by-Step Methodology

Conducting an umbrella review is a rigorous process. Below are the essential steps, drawn from the leading methodological guidance documents including the JBI Manual, the BMJ Medicine guidelines, and published frameworks from Fusar-Poli & Radua.

Confirm the Review Is Needed & Register a Protocol

Before beginning, verify that no equivalent umbrella review has already been published or is underway (check PROSPERO and published literature). Pre-specify your protocol and register it on PROSPERO. This prevents selective reporting and establishes transparency. Protocol registration is increasingly required by journals.

Define the Research Question (PICO/PECO Framework)

Clearly articulate what you are asking. For intervention reviews, use the PICO framework (Population, Intervention, Comparison, Outcome). For epidemiological umbrella reviews, define the population(s), risk factor(s) or exposure(s), and outcome(s). The scope should be broader than a typical SR but still precisely delimited.

Develop Explicit Eligibility Criteria

Specify which types of SRMAs will be included and excluded. Typical criteria address publication type, language and date restrictions, minimum quality thresholds (or the decision to include all regardless of quality), whether to include SRMAs of observational studies only or RCTs only or both, and the definition of relevant population, intervention/exposure, and outcomes.

Construct a Two-Part Search Algorithm

The search string has two components combined using boolean AND: (1) a study design filter identifying SRMAs (e.g., ‘systematic review*’ OR ‘meta-analys*’), and (2) a topic filter covering all relevant keywords, MeSH terms, and synonyms. Search multiple databases (MEDLINE, Embase, Cochrane Library, PsycINFO) and grey literature sources.
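
The two-part structure can be sketched in a few lines of Python. This is only an illustration: the function name `build_search` and the topic terms are hypothetical, and real strategies also need database-specific syntax (field tags, MeSH headings, and each database's own rules for truncation inside quoted phrases).

```python
def build_search(design_terms, topic_terms):
    """Combine a study-design filter and a topic filter with boolean AND."""
    def or_block(terms):
        # Quote multi-word phrases so the database treats each as one unit.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"

    return or_block(design_terms) + " AND " + or_block(topic_terms)


# Design filter taken from the text; the topic terms are placeholders.
design = ["systematic review*", "meta-analys*"]
topic = ["depression", "depressive disorder", "major depression"]

query = build_search(design, topic)
print(query)
```

For the placeholder terms above, this prints `("systematic review*" OR meta-analys*) AND (depression OR "depressive disorder" OR "major depression")`. Keeping the design filter separate from the topic filter makes it easy to reuse the same design block across databases.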

Screen Literature Independently (Double Screening)

Two independent reviewers screen titles and abstracts, then full texts, using the pre-specified eligibility criteria. Disagreements are resolved through discussion or a third reviewer. Document exclusion reasons at the full-text stage and present results in a PRISMA flow diagram.

Manage Overlap Between SRMAs

When multiple SRMAs cover the same primary studies and outcomes, decide which to include. Common strategies: choose the most recent SRMA; choose the SRMA with the largest number of studies; choose the SRMA with the highest AMSTAR-2 quality; for epidemiological reviews, choose the SRMA with the most prospective studies. Calculate the Corrected Covered Area (CCA) to quantify overlap.
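
One way to make the selection rule explicit and reproducible is to encode it as an ordered list of tie-breakers. The sketch below assumes a hypothetical `Srma` record and a hypothetical numeric coding of AMSTAR-2 ratings; the order of priorities is a protocol decision, not something the code decides for you.

```python
from dataclasses import dataclass


@dataclass
class Srma:
    label: str
    year: int          # publication year
    n_studies: int     # number of included primary studies
    amstar_rank: int   # hypothetical coding: 3 = High ... 0 = Critically Low


def pick_one(candidates, priority=("year", "n_studies", "amstar_rank")):
    """Pick one SRMA among overlapping candidates.

    Fields in `priority` are compared in order; later fields only break ties.
    """
    return max(candidates, key=lambda s: tuple(getattr(s, f) for f in priority))


# Invented candidates covering the same question and outcome.
candidates = [
    Srma("A", year=2019, n_studies=40, amstar_rank=2),
    Srma("B", year=2022, n_studies=25, amstar_rank=1),
    Srma("C", year=2022, n_studies=31, amstar_rank=3),
]

print(pick_one(candidates).label)                           # most recent, then largest
print(pick_one(candidates, priority=("n_studies",)).label)  # largest only
```

Whatever order is chosen, it should be fixed in the protocol before screening so the choice cannot be tuned to the results.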

Extract Data Using Standardized Forms

Two reviewers independently extract data. For each included SRMA, extract: number of included studies and total sample size; study-specific effect estimates and 95% confidence intervals; heterogeneity statistics (I², Cochran’s Q, tau²); any risk-of-bias or quality assessments reported within the SRMA; and descriptive conclusions for narrative reviews without meta-analysis.

Re-Run Meta-Analyses with Standardized Methods

Rather than simply reporting the pooled estimates as published, re-run each meta-analysis using standardized statistical models. This ensures comparability across all included SRMAs. Conduct consistency checks and assess heterogeneity and potential biases (small study effects, excess significance bias) uniformly.
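
As an illustration of what a standardized re-analysis can look like, the sketch below pools study-level estimates with a DerSimonian-Laird random-effects model. The choice of estimator is an assumption (the guidance above does not prescribe one), and the input numbers are invented. Estimates are expected on an additive scale, for example log odds ratios with their standard errors.

```python
import math


def random_effects(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling (inputs on an additive scale)."""
    w = [1 / se**2 for se in std_errors]                  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    df = len(estimates) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                         # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0   # I² as a percentage
    w_star = [1 / (se**2 + tau2) for se in std_errors]    # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)  # normal-based 95% CI
    return {"pooled": pooled, "ci": ci, "q": q, "i2": i2, "tau2": tau2}


# Invented log odds ratios and standard errors from one published meta-analysis.
result = random_effects([0.10, 0.50, 0.90], [0.10, 0.10, 0.10])
print(result)
```

Running the same function over every included meta-analysis gives heterogeneity statistics computed in one consistent way, which is the point of the re-analysis step. Prediction intervals and bias tests would need additional code or a dedicated meta-analysis package.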

Assess Methodological Quality of Included SRMAs

Apply validated appraisal tools, most commonly AMSTAR-2, to each included SRMA. Appraisal should be done independently by two reviewers, with consensus discussion for discrepancies. Summarize quality findings in a table.

Grade the Strength of Evidence

For intervention reviews: apply GRADE. For epidemiological umbrella reviews: use criteria assessing amount of evidence, statistical significance, heterogeneity, small study effects, and excess significance bias. Consider performing sensitivity analyses restricted to prospective studies to examine temporality of associations.

Report Results Transparently

Report in both tabular and graphical formats. Key elements include summary tables of all meta-analyses with key statistics, evidence grading tables, PRISMA flow diagrams, and optional visual plots (forest plots, Manhattan plots). Address contradictory conclusions across SRMAs explicitly.

Interpret Findings Carefully

Interpret with attention to confounding (for observational reviews), clinical relevance, external validity/generalizability, and the limitations of the included SRMAs. Causal claims require extreme caution. Discuss biological plausibility and cite supporting evidence from other methodologies (e.g., Mendelian randomisation studies).

What is AMSTAR-2?

AMSTAR-2 (A MeaSurement Tool to Assess systematic Reviews, version 2) is a 16-item checklist for evaluating the methodological quality of systematic reviews, with or without meta-analysis. In an umbrella review, AMSTAR-2 is applied to each included systematic review to judge how trustworthy its conclusions are before its findings are synthesized or compared.

Why AMSTAR-2 Matters in Umbrella Reviews

Umbrella reviews combine evidence from many systematic reviews, often covering overlapping primary studies.
The overall conclusions are only as reliable as the reviews they draw on.
AMSTAR-2 helps flag reviews with serious methodological flaws so their findings can be interpreted with appropriate caution or downweighted.

The 16 Items

#	Domain	Critical?
1	PICO components in research questions	No
2	Pre-established protocol	Yes
3	Explanation for study design selection	No
4	Comprehensive literature search	Yes
5	Study selection in duplicate	No
6	Data extraction in duplicate	No
7	List of excluded studies with justification	Yes
8	Adequate description of included studies	No
9	Satisfactory risk of bias (RoB) assessment	Yes
10	Reporting of funding sources for included studies	No
11	Appropriate meta-analysis methods	Yes
12	Assessment of RoB impact on results	No
13	Accounting for RoB when interpreting results	Yes
14	Explanation of heterogeneity	No
15	Investigation of publication bias	Yes
16	Disclosure of conflicts of interest	No

Rating Each Review

Each item is rated as:

Yes
Partial Yes
No

Overall Confidence Ratings

Based on the pattern of weaknesses across the 7 critical domains, each systematic review receives an overall rating:

Rating	Criteria
High	No or one non-critical weakness
Moderate	More than one non-critical weakness
Low	One critical flaw, with or without non-critical weaknesses
Critically Low	More than one critical flaw
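
The rating rules in this table are mechanical enough to express as a small function. The sketch below makes two assumptions that should be checked against the AMSTAR-2 guidance document before use: only a ‘No’ counts as a weakness (a ‘Partial Yes’ does not), and the default critical set is items 2, 4, 7, 9, 11, 13, and 15 as published by the AMSTAR-2 developers. The critical set is a parameter because review teams may justify a different one in their protocol.

```python
CRITICAL_ITEMS = frozenset({2, 4, 7, 9, 11, 13, 15})


def overall_confidence(ratings, critical=CRITICAL_ITEMS):
    """Map item ratings to an overall AMSTAR-2 confidence rating.

    ratings: dict of item number (1-16) -> 'Yes', 'Partial Yes', or 'No'.
    """
    # Assumption: only 'No' is treated as a weakness.
    weak = {item for item, rating in ratings.items() if rating == "No"}
    critical_flaws = len(weak & critical)
    non_critical = len(weak - critical)

    if critical_flaws > 1:
        return "Critically Low"
    if critical_flaws == 1:
        return "Low"
    if non_critical > 1:
        return "Moderate"
    return "High"


# Invented example: two non-critical weaknesses, no critical flaws.
ratings = {item: "Yes" for item in range(1, 17)}
ratings[7] = "Partial Yes"
ratings[10] = "No"
ratings[12] = "No"
print(overall_confidence(ratings))
```

The function only automates the final tally; the item-level judgments still need two independent raters.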

How Umbrella Review Authors Use AMSTAR-2

Two independent raters typically apply AMSTAR-2 to each included review, resolving disagreements by consensus or a third reviewer.
Results are usually presented in a summary table showing each review’s rating per item and its overall confidence rating.
Critically low quality reviews may be:
- Excluded from the primary synthesis
- Reported separately as supporting/contextual evidence
- Used in sensitivity analyses
Confidence ratings often feed into the overall certainty of evidence (alongside tools like GRADE) when drawing umbrella-level conclusions.

Common Reporting Table in Umbrella Reviews

Included Review	Item 2	Item 4	Item 7	Item 9	Item 11	Item 12	Item 13	Item 15	Overall Rating
Review A	Yes	Yes	Partial Yes	Yes	Yes	No	Yes	Yes	Moderate
Review B	No	No	No	Partial Yes	Yes	No	No	No	Critically Low

Key Limitation

AMSTAR-2 assesses how a review was conducted, not whether its findings are correct. A methodologically strong review can still reach a misleading conclusion if the primary studies it pools are biased, and a flawed one can still report a true effect. It’s a quality lens, not a validity verdict on the underlying evidence itself.

What is GRADE?

GRADE (Grading of Recommendations Assessment, Development and Evaluation) is a framework for rating the certainty of evidence for a given outcome and, where applicable, the strength of recommendations. In an umbrella review, GRADE is applied to the body of evidence underlying each outcome reported across the included systematic reviews/meta-analyses. It helps readers judge how much confidence to place in each summary effect.

Why GRADE Matters in Umbrella Reviews

Umbrella reviews often report many outcome-exposure or outcome-intervention associations.
Not all associations are equally trustworthy, even if statistically significant.
GRADE provides a standardized way to communicate how confident readers should be that the reported effect reflects the true effect.

Starting Point Based on Study Design

Body of Evidence	Starting Certainty
Randomized controlled trials (RCTs)	High
Observational studies (cohort, case-control)	Low

The 5 Domains That Can Downgrade Certainty

Domain	What It Assesses
Risk of bias	Methodological limitations in the primary studies underlying the reviews
Inconsistency	Unexplained heterogeneity in effect estimates across studies/reviews
Indirectness	Differences in population, intervention, comparator, or outcome from the question of interest
Imprecision	Wide confidence intervals or small sample sizes/event numbers
Publication bias	Evidence that studies with null/negative results are missing

3 Factors That Can Upgrade Certainty (Mainly for Observational Evidence)

Factor	Description
Large effect size	Strong or very strong magnitude of association
Dose-response gradient	Effect increases consistently with exposure level
Plausible confounding	Confounders would likely reduce, not create, the observed effect

Final Certainty Ratings

Rating	Symbol	Interpretation
High	⊕⊕⊕⊕	True effect is close to the estimated effect
Moderate	⊕⊕⊕◯	True effect is probably close to estimate, but could differ
Low	⊕⊕◯◯	True effect may be substantially different
Very Low	⊕◯◯◯	Very little confidence in the estimate

How Umbrella Review Authors Apply GRADE

GRADE is often combined with other classification systems specific to umbrella reviews (e.g., evidence classes based on significance, sample size, and 95% prediction intervals), but it remains the most widely recognized certainty framework.
Each outcome/association is assessed individually, since certainty can vary even within the same umbrella review.
Findings are typically presented in a GRADE summary table alongside effect estimates.

Example Summary Table

Outcome	Effect Estimate (95% CI)	Risk of Bias	Inconsistency	Indirectness	Imprecision	Publication Bias	Overall Certainty
Outcome 1	RR 1.45 (1.20–1.75)	Not serious	Serious	Not serious	Not serious	Not serious	Moderate
Outcome 2	OR 0.88 (0.60–1.30)	Serious	Serious	Not serious	Serious	Not detected	Very Low
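
The arithmetic behind a GRADE rating can be sketched as a starting level plus downgrades and upgrades. The function below is a simplification with hypothetical names: it treats each ‘Serious’ concern as one level down, and it assumes the two example outcomes above come from RCT evidence, which the table does not state. Real GRADE judgments are qualitative and are not reducible to a formula.

```python
LEVELS = ["Very Low", "Low", "Moderate", "High"]
START = {"rct": "High", "observational": "Low"}


def grade_certainty(design, downgrades, upgrades=0):
    """Return the certainty rating after applying downgrades and upgrades.

    downgrades: dict of domain -> number of levels to subtract (0 or 1 here).
    upgrades: number of levels to add (large effect, dose-response, etc.).
    """
    level = LEVELS.index(START[design])
    level = level - sum(downgrades.values()) + upgrades
    level = max(0, min(level, len(LEVELS) - 1))  # clamp to the four-level scale
    return LEVELS[level]


# Outcome 1 from the example table: one serious concern (inconsistency).
outcome_1 = grade_certainty("rct", {"inconsistency": 1})

# Outcome 2: serious risk of bias, inconsistency, and imprecision.
outcome_2 = grade_certainty(
    "rct", {"risk_of_bias": 1, "inconsistency": 1, "imprecision": 1}
)

print(outcome_1, "|", outcome_2)
```

Under those assumptions the sketch reproduces the Moderate and Very Low ratings shown in the example table.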

GRADE vs AMSTAR-2

AMSTAR-2 assesses the methodological quality of each included systematic review (the “wrapper”).
GRADE assesses the certainty of the evidence for each outcome (the underlying findings).
Used together, they give umbrella review readers a fuller picture: was the review conducted well, AND can we trust its reported effect?

Key Limitation

GRADE was originally developed for single systematic reviews informing clinical guidelines, so applying it within umbrella reviews requires adaptation. This is particularly relevant when you are summarizing across multiple overlapping reviews with shared primary studies, which can complicate judgments about imprecision and inconsistency.

Managing Overlap of Primary Studies in Umbrella Reviews

One of the most technically complex challenges in umbrella reviews is overlap: many systematic reviews on the same topic will have included some or all of the same primary studies. If left unaddressed, overlap can inflate the apparent evidence base and artificially narrow confidence intervals. This would make results look more precise than they actually are.

Why Overlap Matters

A single large, high-quality RCT included in five separate meta-analyses may have disproportionate influence on the overall umbrella review finding
Overlap that inflates sample sizes can produce misleadingly low p-values, increasing Type I error (false positive) risk
Contradictory conclusions between SRMAs on the same question may stem from differing eligibility criteria, search dates, or statistical methods

Strategies for Handling Overlap

Strategy	Description	Best Used When
Corrected Covered Area (CCA)	Quantifies the proportion of primary studies shared across SRMAs. CCA <0.05 = slight; 0.05–0.10 = moderate; 0.11–0.15 = high; >0.15 = very high overlap.	Always — as a reporting measure alongside any overlap decisions
Restrict by recency	Among overlapping SRMAs, include only the most recently published version	When the topic has evolved rapidly and newer SRMAs incorporate more updated evidence
Restrict by size	Prefer the SRMA with the most included primary studies	When comprehensiveness of coverage is the priority
Restrict by quality	Use AMSTAR-2 scores to select the methodologically highest-quality SRMA	When methodological rigour is the priority
Include all + CCA reporting	Include all SRMAs and quantify overlap using CCA; interpret results in light of the overlap magnitude	When comprehensiveness and transparency are prioritized; common in observational umbrella reviews
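
A minimal sketch of the CCA calculation follows, using the commonly cited formula from Pieper et al. (2014): CCA = (N − r) / (r × c − r), where N is the total number of study inclusions counted across all reviews, r is the number of unique primary studies, and c is the number of reviews. The function names and study IDs are hypothetical, and the labels use the thresholds from the table above.

```python
def corrected_covered_area(reviews):
    """Compute CCA from a dict of review name -> set of primary study IDs."""
    c = len(reviews)                            # number of reviews (columns)
    unique = set().union(*reviews.values())
    r = len(unique)                             # unique primary studies (rows)
    n = sum(len(studies) for studies in reviews.values())  # inclusions, with repeats
    if c < 2 or r == 0:
        raise ValueError("CCA needs at least two reviews and one primary study")
    return (n - r) / (r * c - r)


def overlap_label(cca):
    """Label overlap using the thresholds given in the table above."""
    if cca < 0.05:
        return "slight"
    if cca <= 0.10:
        return "moderate"
    if cca <= 0.15:   # values between 0.10 and 0.11 are treated as high here
        return "high"
    return "very high"


# Invented citation matrix: three reviews sharing some primary studies.
reviews = {
    "Review A": {"s1", "s2", "s3", "s4"},
    "Review B": {"s3", "s4", "s5"},
    "Review C": {"s4", "s6"},
}

cca = corrected_covered_area(reviews)
print(f"{cca:.2f}", overlap_label(cca))  # prints: 0.25 very high
```

Here N = 9, r = 6, and c = 3, so CCA = 3 / 12 = 0.25. In practice the sets would come from a citation matrix built during data extraction.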

Important clarification

An umbrella review does not combine or re-pool results from different meta-analyses in a grand statistical synthesis. It describes and grades the evidence from each SRMA separately. Therefore, overlap does not carry the same statistical risk as in a traditional meta-analysis, but it must still be acknowledged and managed for interpretive clarity.

Quality Assessment Tools for Umbrella Reviews

Evaluating the methodological quality of included SRMAs is essential to interpreting umbrella review findings. The quality of an umbrella review is ultimately contingent on the quality of its constituent SRMAs.

AMSTAR-2 (Primary Tool)

AMSTAR-2 is the most widely used tool for appraising systematic reviews in the context of umbrella reviews. It covers 16 items and distinguishes between critical and non-critical domains.

ROBIS for Umbrella Reviews

ROBIS (Risk of Bias in Systematic Reviews) is a tool developed to assess the risk of bias in systematic reviews, rather than just their reporting quality. In an umbrella review, it is applied to each included systematic review.

Structure: 3 Phases

Phase	Purpose
Phase 1	Assess relevance of the review to the umbrella review’s question (optional)
Phase 2	Identify concerns in 4 domains via signaling questions
Phase 3	Judge overall risk of bias based on Phase 2

The 4 Domains (Phase 2)

Domain	Focus
1. Study eligibility criteria	Were inclusion criteria appropriate and clearly defined before conducting the review?
2. Identification & selection of studies	Was the search comprehensive and selection bias minimized?
3. Data collection & study appraisal	Were data extraction and risk-of-bias assessments of primary studies done appropriately?
4. Synthesis & findings	Were methods for synthesis, and interpretation of results, appropriate?

Signaling Question Responses

Yes
Probably Yes
Probably No
No
No Information

Overall Risk of Bias Judgment

Rating	Meaning
Low	Few or no concerns across domains
High	Concerns in one or more domains significantly affecting confidence
Unclear	Insufficient information to judge

Use in Umbrella Reviews

Each domain receives a low/high/unclear rating, then an overall risk of bias judgment is made for the review as a whole.
Often used as an alternative or complement to AMSTAR-2. Some umbrella reviews use both for triangulation.
Reviews rated high risk of bias may be flagged, excluded from primary synthesis, or interpreted cautiously.

ROBIS vs. AMSTAR-2

Feature	ROBIS	AMSTAR-2
Primary focus	Risk of bias	Methodological quality
Domains	4	7 critical + 9 non-critical (16 total)
Overall rating	Low/High/Unclear	High/Moderate/Low/Critically Low
Common in	Public health, epidemiology	Health interventions broadly

JBI Critical Appraisal Checklist for Umbrella Reviews

The JBI (Joanna Briggs Institute) Critical Appraisal Checklist for Systematic Reviews and Research Syntheses is part of a suite of design-specific JBI tools, widely used in nursing, allied health, and JBI-affiliated reviews.

Checklist Items (11 Questions)

#	Question Focus
1	Were the review questions clearly stated?
2	Were inclusion criteria appropriate?
3	Was the search strategy appropriate?
4	Were sources/resources for studies adequate?
5	Were criteria for appraising studies appropriate?
6	Was critical appraisal conducted by ≥2 reviewers independently?
7	Were methods to minimize errors in data extraction used?
8	Were appropriate methods used to combine studies?
9	Was likelihood of publication bias assessed?
10	Were recommendations for policy/practice supported by data?
11	Were specific directives for new research appropriate?

Response Options

Yes
No
Unclear
Not Applicable

Use in Umbrella Reviews

Particularly favored when the umbrella review itself follows JBI methodology (which has its own formal guidance for conducting umbrella reviews).
Results presented in a simple summary table (reviews × items, with Y/N/U/NA responses).
No formal overall “score” or cutoff; appraisal informs narrative judgment about including/weighting a review.
Often paired with JBI’s own umbrella review conduct guidelines, creating methodological consistency for JBI-affiliated authors.

JBI Checklist vs. AMSTAR-2

Feature	JBI Checklist	AMSTAR-2
Item count	11	16
Overall rating system	None (narrative)	Yes (4-tier)
Discipline association	Nursing/allied health	General health sciences
Companion conduct guidance	JBI Umbrella Review Methodology	None specific

PRISMA Checklist for Umbrella Reviews

PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) is not a quality or risk-of-bias appraisal tool. It is a reporting guideline ensuring transparency and completeness of how a review (including umbrella reviews) is described.

Key Distinction

Tool	What It Assesses
AMSTAR-2 / ROBIS / JBI	Methodological quality / risk of bias
PRISMA	Completeness and transparency of reporting

PRISMA Components

Component	Description
27-item checklist	Covers title, abstract, methods, results, discussion, funding
PRISMA flow diagram	Visualizes study identification, screening, exclusion, and inclusion
PRISMA extensions	Specialized versions (e.g., PRISMA-P for protocols); for overviews of reviews, the separate PRIOR statement applies, and PRISMA 2020 is also commonly adapted

Use in Umbrella Reviews

Authors use PRISMA to structure and report the umbrella review itself and not to appraise the included systematic reviews.
A completed PRISMA flow diagram documents how many systematic reviews were identified, screened, excluded (with reasons), and included.
Journals frequently require a PRISMA checklist submission alongside the manuscript.
Improves reproducibility and transparency, allowing readers to trace the selection process for included reviews.

Relationship to Quality Appraisal Tools

PRISMA compliance does not indicate methodological quality. A review can be well-reported but methodologically weak (low AMSTAR-2/high ROBIS risk), or vice versa.
PRISMA is therefore typically used alongside, not instead of, AMSTAR-2, ROBIS, or JBI tools in umbrella reviews.

Practical note

A common debate in umbrella reviews is whether to exclude low-quality SRMAs from the synthesis. Most methodologists recommend including all SRMAs regardless of quality (to avoid overestimating or underestimating effect sizes from incomplete data), while clearly reporting quality ratings and conducting sensitivity analyses that restrict findings to higher-quality reviews.

How to Grade the Evidence in an Umbrella Review

Quality assessment (is this SRMA well-conducted?) and evidence grading (how strong is the overall body of evidence?) are distinct steps that are often confused.

GRADE for Intervention Reviews

When the umbrella review addresses interventions, GRADE is the validated approach. GRADE evaluates certainty of evidence across five factors:

Risk of bias: limitations in the design or conduct of the underlying studies
Inconsistency: unexplained variability across studies
Indirectness: applicability of evidence to the specific question
Imprecision: width of confidence intervals
Publication bias: likelihood of selective reporting

Evidence is rated as: High → Moderate → Low → Very Low certainty.
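As a rough illustration of how the five factors combine into a rating, the sketch below starts from a given certainty level and moves down one level for each serious concern and two for each very serious concern. The function name, the domain keys, and the 0/1/2 scoring are assumptions made for this example; real GRADE judgments are qualitative, are made per outcome, and can also rate evidence up, which this sketch ignores.

```python
# Illustrative only: a simplified downgrade counter, not the GRADE method itself.
LEVELS = ["High", "Moderate", "Low", "Very Low"]
DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
)


def grade_certainty(concerns, start="High"):
    """Return a certainty label after applying downgrades.

    concerns: dict mapping a domain name to 0 (no concern),
              1 (serious) or 2 (very serious). Missing domains count as 0.
    start:    starting level; bodies of randomized evidence start at "High",
              observational evidence starts at "Low".
    """
    unknown = set(concerns) - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")
    if any(score not in (0, 1, 2) for score in concerns.values()):
        raise ValueError("each concern must be scored 0, 1 or 2")

    steps_down = sum(concerns.get(domain, 0) for domain in DOMAINS)
    # Certainty cannot fall below the lowest level.
    index = min(LEVELS.index(start) + steps_down, len(LEVELS) - 1)
    return LEVELS[index]


if __name__ == "__main__":
    print(grade_certainty({"imprecision": 1}))
    print(grade_certainty({"risk_of_bias": 2, "inconsistency": 1}))
```

Keeping the per-domain scores in a dictionary makes it easy to print them next to the final rating, which is how Summary of Findings tables justify each downgrade.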

Ioannidis Criteria for Epidemiological Reviews

For umbrella reviews of observational or epidemiological associations (risk factors, predictors), a widely used complementary framework evaluates each meta-analysis on:

Criterion	Description	Threshold (example)
Amount of evidence	Total number of cases or participants	≥1,000 cases
Statistical significance	P-value for the summary effect estimate	P < 0.001
Heterogeneity	I² statistic across studies	I² < 50%
Prediction interval	Does the 95% PI exclude the null?	PI excludes 1.0 (or 0)
Small study effects	Egger’s test or funnel plot asymmetry	P > 0.10 for Egger’s test
Excess significance bias	More significant studies than expected	P > 0.10 for excess significance test

Based on these criteria, each association may be classified as: Convincing, Highly Suggestive, Suggestive, Weak, or Not Significant.
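The classification step can be sketched in code. The thresholds below are the example values from the table above; the mapping from criteria to classes is an assumption made for illustration, since published umbrella reviews define their own rules (often with stricter P-value thresholds for the top classes). The `Association` fields and the `classify` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Association:
    cases: int               # total number of cases
    p_value: float           # P-value of the summary effect
    i_squared: float         # heterogeneity, in percent
    pi_excludes_null: bool   # does the 95% prediction interval exclude the null?
    egger_p: float           # P-value of Egger's test
    excess_sig_p: float      # P-value of the excess significance test


def classify(a: Association) -> str:
    """Assign an evidence class using an assumed, illustrative rule set."""
    if a.p_value >= 0.05:
        return "Not Significant"

    large = a.cases >= 1000
    strong_p = a.p_value < 0.001
    no_bias_signals = (
        a.i_squared < 50
        and a.pi_excludes_null
        and a.egger_p > 0.10
        and a.excess_sig_p > 0.10
    )

    if large and strong_p and no_bias_signals:
        return "Convincing"
    if large and strong_p:
        return "Highly Suggestive"
    if large:
        return "Suggestive"
    return "Weak"


if __name__ == "__main__":
    example = Association(
        cases=4200, p_value=0.0002, i_squared=35.0,
        pi_excludes_null=True, egger_p=0.40, excess_sig_p=0.55,
    )
    print(classify(example))
```

Whatever rule set is chosen, it should be fixed in the protocol before the analysis, so that the classes are not tuned to the results.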

How to Report Umbrella Reviews

Transparent reporting is critical to umbrella review quality. Several reporting guidelines apply:

Reporting Standards for Umbrella Reviews

Guideline	Full Name	Applicable To
PRISMA-OvR	Preferred Reporting Items for Overviews of Reviews	All umbrella/overview reviews; the primary reporting standard
MOOSE	Meta-analysis Of Observational Studies in Epidemiology	Epidemiological umbrella reviews of observational data
PRISMA 2020	Preferred Reporting Items for Systematic Reviews and Meta-Analyses	Baseline guidance applicable to the overall structure
GRADE SoF tables	Summary of Findings tables (GRADE format)	Presenting evidence certainty for intervention reviews
PROSPERO Protocol	International Prospective Register of Systematic Reviews	Pre-registration of protocol before data collection

Key Reporting Elements in Umbrella Review Results

For each included SRMA, report:

Total number of events or cases (binary outcomes) and total sample size
Number of included primary studies
Effect size metric used (OR, RR, HR, MD, SMD)
Meta-analysis model (fixed-effect vs. random-effects)
Summary effect estimate and 95% confidence interval
95% prediction interval
Heterogeneity statistics (I², Cochran’s Q p-value, tau²)
Effect estimate from the largest single included study
Results of small study effects and excess significance tests
Overall evidence grade
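Several of these statistics come from the same calculation. The following is a minimal sketch of a random-effects meta-analysis using the DerSimonian-Laird estimator, chosen here only because it is compact; it is not necessarily the estimator a given umbrella review uses. The inputs are study-level effect sizes and their variances (for ratio measures such as OR, RR, or HR, these would be on the log scale). The function name and return keys are assumptions, and the prediction interval and bias tests are deliberately left out.

```python
import math
from statistics import NormalDist


def random_effects_dl(effects, variances, level=0.95):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    effects:   per-study effect sizes (log scale for ratio measures)
    variances: per-study sampling variances (same order as effects)
    Returns the pooled estimate, its confidence interval, Q, I² (%) and tau².
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and estimate, needed for Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance (tau²), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I²: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights, pooled estimate and standard error.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return {
        "estimate": pooled,
        "ci": (pooled - z * se, pooled + z * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances from three studies.
    result = random_effects_dl([0.10, 0.35, 0.60], [0.02, 0.03, 0.05])
    print({key: result[key] for key in ("estimate", "i2", "tau2")})
```

In practice this step is usually run with established software such as the R metafor package or the Stata meta suite listed below; the sketch only shows where the reported numbers come from.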

Software & Tools for Umbrella Reviews

Tool	Purpose	Stage of Review
Rayyan	AI-assisted title/abstract screening; collaboration between reviewers	Screening
Covidence	Full-text screening, data extraction, conflict resolution	Screening & extraction
EPPI-Reviewer	Systematic review management; useful for large umbrella reviews	All stages
R (metafor package)	Statistical meta-analysis; heterogeneity and bias tests	Statistical analysis
Stata (meta suite)	Meta-analytic modelling; funnel plots; Egger’s test	Statistical analysis
RevMan	Cochrane’s review management tool; forest plots	Analysis & reporting
GROOVE tool	Graphical representation of overlap across systematic reviews	Overlap assessment
GRADEpro GDT	GRADE evidence profiling; Summary of Findings tables	Evidence grading
PROSPERO	Protocol registration registry	Pre-review planning
ATLAS.ti / NVivo	Qualitative data management (for narrative synthesis components)	Qualitative synthesis

Strengths & Limitations of Umbrella Reviews

Strengths

Provides the highest-level, most comprehensive overview of evidence on a broad topic in a single document
Efficient for decision-makers: saves time compared to reading dozens of individual SRMAs
Particularly valuable for health technology assessments evaluating all management options for a condition
Resolves the ‘lumping vs. splitting’ tension in research synthesis
Enables comparison of the strength of evidence across multiple interventions, exposures, or outcomes simultaneously
Can reveal contradictions between existing SRMAs and explain their sources
Standardized re-analysis of each meta-analysis can correct errors in published SRMAs that used inappropriate statistical models
Feasible in 6–10 weeks for professional teams, since primary studies do not have to be searched, screened, and extracted from scratch

Limitations

‘Garbage in, garbage out’: An umbrella review is wholly dependent on the accuracy and rigour of the included SRMAs. If the underlying reviews are biased, the umbrella review inherits those biases.
Cannot fill evidence gaps: If a research area lacks systematic reviews, an umbrella review cannot be conducted.
No individual patient data (IPD) analysis: Because primary data are not re-examined, subgroup analyses at the participant level are not possible.
Overlap inflation: Even with CCA management, overlapping primary studies across SRMAs can create an illusion of more independent evidence than actually exists.
Difficulty in causal inference: For epidemiological umbrella reviews, confounding, reverse causality, and selection bias remain serious threats.
Clinical heterogeneity: Combining SRMAs that varied in their populations, interventions, comparators, and outcomes can make the overall synthesis clinically difficult to interpret.

Annotated Real-World Examples

The following examples illustrate how umbrella reviews function in practice — what questions they asked, what methods they used, and what their findings demonstrated.

EXAMPLE 1 · EPIDEMIOLOGY

Risk Factors for the Onset of Type 2 Diabetes Mellitus

What it asked: Which non-genetic factors are associated with developing type 2 diabetes, and how strong is the evidence for each?
Scope: Synthesized 142 epidemiological associations from multiple SRMAs. Population: individuals without T2DM at study baseline. Exposure: any non-genetic factor. Outcome: incident T2DM.
Overlap handling: When multiple SRMAs covered the same exposure-outcome pair, researchers selected the SRMA with the largest number of prospective studies, to preserve temporality of association (exposure before outcome).
Evidence grading: Applied Ioannidis criteria to each of the 142 associations, classifying each as convincing, highly suggestive, suggestive, weak, or not significant.
Visualization: Results were presented in a comprehensive table of all 142 associations with full statistics, plus a Manhattan plot (a visual borrowed from genomics) that made the panoramic pattern of evidence immediately readable.
Why it was useful: No single systematic review could have assessed 142 associations simultaneously. The umbrella review revealed which risk factors had convincing, consistent evidence and which were spurious — directly informing prevention guidelines.

EXAMPLE 2 · PHARMACOLOGY / WOMEN’S HEALTH

Menopausal Hormone Therapy and Women’s Health

What it asked: What are the effects of menopausal hormone therapy (MHT) across a wide range of health outcomes: cardiovascular disease, fractures, cancer, cognition, and others?
Why an umbrella review was needed: Dozens of systematic reviews existed on individual outcomes of MHT. Clinicians and guideline developers needed a single document that assessed and compared the evidence across all relevant outcomes at once.
Methods: Included SRMAs on randomized and observational designs. Quality was assessed using AMSTAR-2. Evidence certainty was graded using GRADE, yielding High/Moderate/Low/Very Low ratings for each outcome.
Key value: The umbrella review showed, side-by-side, that MHT has moderate-certainty evidence of benefit for fracture prevention and hot flashes, but very low-certainty evidence for cognitive outcomes. This is a nuanced, actionable picture that no individual SR provided.

EXAMPLE 3 · PSYCHIATRY / MENTAL HEALTH

Umbrella Reviews in Early Psychosis

Context: Paolo Fusar-Poli and Joaquim Radua, the authors of the landmark ‘Ten Simple Rules for Conducting Umbrella Reviews’, applied umbrella review methodology extensively in psychiatry, particularly around risk factors for and interventions in early psychosis.
Evidence stratification: Each risk factor for psychosis transition was classified into evidence tiers, allowing clinicians to quickly identify factors with convincing vs. weak evidence — critical for clinical risk calculators and early intervention programs.
Methods demonstrated: These umbrella reviews illustrate several good practices: pre-specifying the protocol, defining variables of interest (such as transition to psychosis as a clear binary outcome), estimating a common effect size (OR) across all SRMAs, and reporting the heterogeneity and 95% prediction intervals for each association.

EXAMPLE 4 · NUTRITION SCIENCE

Diet-Associated Inflammation and 38 Chronic Disease Outcomes

What it asked: What is the strength of evidence linking dietary inflammatory potential to 38 different chronic disease outcomes?
Efficiency demonstrated: This umbrella review was completed in approximately one year and assessed 38 chronic disease outcomes. This is a scope that would be entirely infeasible if starting from primary studies.
Limitations acknowledged: The authors explicitly noted that the review could not capture associations not yet covered by published meta-analyses, and that individual patient data analyses were not possible within the umbrella review framework.

EXAMPLE 5 · INFECTIOUS DISEASE / PUBLIC HEALTH

Long COVID Prevalence and Risk Factors (Rapid Systematic Umbrella Review)

Context: As the COVID-19 pandemic generated a rapid explosion of primary studies and systematic reviews on Long COVID, an umbrella review approach allowed researchers to synthesize the evidence at the review level.
Unique use case: The umbrella review explicitly used the systematic review level to examine common biases and limitations across the field — demonstrating how umbrella reviews can serve a methodological surveillance function, not just a substantive evidence synthesis one.
What it found: 14 reviews covering 5–196 primary studies were included. Pooling was not performed; instead, a descriptive meta-synthesis of prevalence estimates, risk factors, and bias patterns was conducted. This example illustrates that umbrella reviews can be qualitative/narrative as well as statistical.

Key Takeaways

An umbrella review is a systematic review of systematic reviews and/or meta-analyses; it synthesizes evidence at the highest available level, one step above individual SRMAs in the evidence hierarchy.
It is also called an ‘overview of reviews,’ ‘meta-review,’ or ‘review of reviews’. These terms are largely interchangeable in the current literature, though Cochrane prefers ‘overview of reviews.’
Umbrella reviews are appropriate only when a research topic is already well-covered by existing SRMAs. They cannot substitute for a primary systematic review where SRMAs do not yet exist.
The unit of analysis is the SRMA, not the primary study. But researchers typically re-run each meta-analysis using standardized statistical methods rather than simply reporting the published pooled estimates.
The two-part search algorithm (study design filter + topic filter, combined with AND) is the defining methodological feature that differentiates an umbrella review search from a standard systematic review search.
Overlap (the same primary studies appearing in multiple included SRMAs) is a key methodological challenge. The Corrected Covered Area (CCA) is the standard metric to quantify and report overlap.
AMSTAR-2 is the dominant tool for appraising the methodological quality of included SRMAs; ROBIS and JBI checklists are also used. Quality assessment is distinct from evidence grading.
GRADE is used to grade certainty of evidence for intervention umbrella reviews; the Ioannidis criteria are widely used for epidemiological umbrella reviews.
Pre-registration on PROSPERO and reporting adherent to PRISMA-OvR are the expected standards for publication-ready umbrella reviews.
Umbrella reviews inherit the biases and limitations of the SRMAs they include. A well-conducted umbrella review of low-quality SRMAs will still yield unreliable conclusions.
Umbrella reviews do not collect or re-examine primary data: any re-analysis relies on the study-level estimates reported in the included SRMAs, so individual patient data (IPD) subgroup analyses are not possible.
The efficiency advantage of umbrella reviews is substantial: professional teams can often complete one in 6–10 weeks, compared to 12–24 months for a full de novo systematic review.

Frequently Asked Questions

Can I do an umbrella review if some of the systematic reviews I find are low quality? Do I have to exclude them?

This is one of the most commonly debated questions in umbrella review practice. The general methodological consensus is: include all eligible SRMAs regardless of quality, but clearly report quality ratings and conduct sensitivity analyses that restrict findings to higher-quality reviews.

Excluding low-quality SRMAs a priori risks distorting the evidence base: either by overestimating or underestimating effect sizes from a selectively curated subset. Instead, assess quality using AMSTAR-2 (or ROBIS/JBI), present the ratings transparently in a table, and discuss how the inclusion of lower-quality reviews may have influenced your overall conclusions. Sensitivity analyses restricted to ‘high’ and ‘moderate’ AMSTAR-2 ratings are the appropriate way to test whether your conclusions hold when low-quality reviews are removed.

If multiple systematic reviews all include the same studies, isn’t the ‘umbrella review’ just inflating the evidence base and showing me one study multiple times?

This concern is valid and points to the overlap problem, which is the most technically complex challenge in umbrella reviews. However, there are two important clarifications.

First, umbrella reviews do not statistically pool the results of different meta-analyses into a grand combined estimate. Each SRMA is analysed and reported separately. So the inflation risk is interpretive, not arithmetical. Second, the standard approach is to calculate the Corrected Covered Area (CCA). CCA values are classified as slight (<5%), moderate (5–10%), high (11–15%), or very high (>15%). This metric helps readers understand how much of the apparent evidence is truly independent: a critical caveat that should be prominently reported.
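For reference, the CCA is computed from a citation matrix as CCA = (N − r) / (r × c − r), where N is the total number of primary-study inclusions counted across all reviews, r is the number of unique primary studies, and c is the number of reviews. The sketch below implements that formula and applies the categories quoted above; the function names and the example data are hypothetical, and the handling of values that fall exactly on a boundary is an assumption.

```python
def corrected_covered_area(review_to_studies):
    """Compute the CCA as a proportion between 0 and 1.

    review_to_studies: dict mapping a review id to the collection of
    primary study ids that the review includes.
    """
    study_sets = [set(studies) for studies in review_to_studies.values()]
    c = len(study_sets)                        # number of reviews
    r = len(set().union(*study_sets))          # unique primary studies
    n = sum(len(s) for s in study_sets)        # inclusions, counted per review
    if c < 2 or r == 0:
        raise ValueError("need at least two reviews and one primary study")
    return (n - r) / (r * c - r)


def overlap_category(cca):
    """Map a CCA proportion to the categories used in the text.

    Boundary handling (e.g. exactly 5%) is an assumption of this sketch.
    """
    percent = cca * 100
    if percent < 5:
        return "slight"
    if percent <= 10:
        return "moderate"
    if percent <= 15:
        return "high"
    return "very high"


if __name__ == "__main__":
    # Hypothetical example: three reviews sharing some primary studies.
    reviews = {
        "SRMA-A": {"s1", "s2", "s3"},
        "SRMA-B": {"s2", "s3", "s4"},
        "SRMA-C": {"s3", "s5"},
    }
    cca = corrected_covered_area(reviews)
    print(f"CCA = {cca:.0%} ({overlap_category(cca)})")
```

In the example, N = 8, r = 5, and c = 3, so CCA = (8 − 5) / (15 − 5) = 30%, which falls in the very high category.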

Do I need a full team to conduct an umbrella review, or can I do it alone? Is it really faster than a systematic review?

Some guidance documents note that, unlike a standard systematic review (which mandates double-screening and double-extraction by at least two independent reviewers), an umbrella review does not strictly require a team. That said, best practice still calls for independent dual-screening to minimise selection bias, and independent data extraction for key statistics.

As for speed: yes, umbrella reviews are substantially faster than de novo systematic reviews. Professional teams can typically complete an umbrella review in 6–10 weeks. The time savings come from not having to search for, screen, and extract data from thousands of primary studies, only from a smaller pool of existing SRMAs. However, the statistical re-analysis step can add significant time for large umbrella reviews covering many associations.

Two of the systematic reviews I want to include reach completely opposite conclusions on the same question. What do I do?

Contradictory conclusions across SRMAs on the same question are not a failure. In fact, exposing and explaining these contradictions is one of the most valuable contributions an umbrella review can make.

Common explanations include: different eligibility criteria (populations, comparators, or outcome definitions); different search dates; different statistical models (fixed-effect vs. random-effects); different quality thresholds; and language or publication bias differences in search strategies.

Report the conflicting SRMAs side by side in your results tables, explain the likely sources of divergence in your discussion, and note what additional primary research would be needed to resolve the contradiction.

Can an umbrella review include non-intervention reviews, like reviews of prevalence, diagnostic accuracy, or qualitative studies?

Yes. Although most of the established methodology has been developed and applied in the intervention and epidemiological association contexts, umbrella reviews are not inherently restricted to these domains.

Umbrella reviews of observational/epidemiological SRMAs are now very common. Umbrella reviews of diagnostic accuracy SRMAs are feasible but require specialised statistical methods (bivariate/SROC models). Umbrella reviews including qualitative systematic reviews are possible but remain methodologically less standardised. The JBI Manual for Evidence Synthesis provides specific guidance for umbrella reviews incorporating qualitative evidence alongside quantitative SRMAs.

When an umbrella review and a single systematic review both exist on the same question, which should I cite in my paper or guideline?

When both exist, the umbrella review is generally preferred as the citation for evidence-based decision-making because it

consolidates findings from multiple SRMAs,
addresses the overlap problem,
applies standardised quality assessment, and
provides a more complete and reliable picture of the evidence than any single SRMA can.

The important caveat: an umbrella review is only as good as the quality and completeness of the SRMAs it synthesises. If the best available SRMA covers more recent trials than an older umbrella review, citing the updated SRMA alongside the umbrella review may be appropriate until an updated umbrella review is published.

What is the difference between an umbrella review and a realist review?

An umbrella review synthesizes findings from multiple existing systematic reviews on the same topic, summarizing pooled effect estimates across reviews. It sits atop the evidence hierarchy and deals in aggregated verdicts. A realist review, by contrast, synthesizes primary studies (and other sources) through a theoretical lens, asking why and for whom effects occur by building context-mechanism-outcome (CMO) configurations. Umbrella reviews largely treat context as controlled or irrelevant; realist reviews treat context as central. Umbrella reviews produce a summary of “what works” verdicts; realist reviews produce explanatory theory about the conditions and mechanisms behind those effects.

References

Ten simple rules for conducting umbrella reviews. https://pmc.ncbi.nlm.nih.gov/articles/PMC10270421/
Types of Reviews: Umbrella Reviews. https://laneguides.stanford.edu/types-of-reviews/umbrella
Umbrella reviews: a methodological guide. https://academic.oup.com/eurjcn/article/24/6/996/7974731
How to Conduct Umbrella Review in Education? A Step-by-Step Methodological Guide Through a Case Study in Digital Diaries. https://journals.sagepub.com/doi/10.1177/20965311261421966

