How to Do a Literature Search: A Practical Guide for Researchers

Debraj Manna
May 14, 2026

Contents

What is a literature search?
Why the Purpose of Your Research Changes Everything
Step 1: Define Your Research Question
Step 2: Identify Search Terms
Step 3: Choose Your Databases
Step 4: Build and Run Your Search Strategy
Step 5: Apply Filters and Limits
Step 6: Manage Your Results
Step 7: Check for Missing Literature
Step 8: Document and Report Your Search
Common Mistakes to Avoid in Your Literature Search
Summary: Matching Search Rigour to Research Purpose
Frequently Asked Questions

What is a literature search?

A literature search is the systematic process of identifying, locating, and retrieving published works relevant to a research question. It is not the same as writing a literature review. The search comes first, and its quality determines everything that follows. A poorly executed search produces a biased, incomplete literature review, whereas a well-executed one lays the ground for an effective and comprehensive review. This guide walks you through the process step by step, with attention to how the purpose of your research shapes the depth and rigour of the search itself.

Why the Purpose of Your Research Changes Everything

Before running a single search, you must ask: Why am I doing this literature search? The answer fundamentally changes how comprehensive, documented, and reproducible your process needs to be.

Research Type	Primary Goal	Search Depth	Documentation Required	Typical Databases
Doctoral Thesis/Dissertation	Establish originality; map field comprehensively	Exhaustive	High, must demonstrate thorough coverage	Multiple: PubMed, Embase, Scopus, Web of Science + grey literature
Original Research Article	Justify gap and rationale	Focused	Moderate, enough to contextualize	2–3 core databases
Narrative Review	Synthesise themes and concepts	Broad, selective	Low-moderate	4–5 databases, expert suggestion
Systematic Review/Meta-analysis	Answer a specific clinical/policy question	Exhaustive, reproducible	Detailed and mandatory, PRISMA flow required	All major databases + trial registries

Doctoral Thesis or Dissertation

A doctoral literature search must demonstrate that you know your field well enough to identify a genuine gap. It is typically the broadest type of search, covering:

Foundational literature going back to seminal works, sometimes decades
Methodological literature: not just what has been studied, but how
Grey literature: conference abstracts, preprints, institutional reports, theses from other institutions
Adjacent disciplines: for example, a thesis on diabetes-related cognitive decline must cover both endocrinology and neuropsychology literature

The search is rarely conducted once. Doctoral candidates revisit and update the search as their research question sharpens, often running a final update search within 3–6 months of thesis submission.

Tools like R Discovery are particularly useful at this stage. R Discovery’s AI-powered feed learns from papers you mark as relevant and surfaces related work you may not have found through keyword searches alone, which is a significant advantage when you are trying to ensure breadth without becoming overwhelmed by volume. Its ability to recommend papers from across disciplines helps doctoral researchers working at the intersection of fields.

Original Research Article

When writing the introduction and discussion sections of an empirical paper, the literature search is focused rather than exhaustive. The goal is to:

Establish that the specific question has not been adequately answered
Cite the most current, high-quality evidence on key concepts
Place your findings in context with comparable studies

A targeted search of 2–3 major databases (e.g., PubMed plus Scopus) with a limited date range (commonly the past 5–10 years, with key landmark studies regardless of date) is usually sufficient. R Discovery supports this workflow through its daily paper recommendations and citation-chasing features, which allow researchers to quickly identify the 20–40 most cited and most recent papers on a topic without running lengthy manual searches.

Narrative Review

A narrative review synthesises literature thematically and does not require a fully reproducible search protocol. However, it still requires a structured search to avoid the criticism of cherry-picking. Best practices include:

Searching at least 4-5 databases
Documenting the search terms used, even if not to PRISMA standards
Being transparent about inclusion scope (e.g., English-language only, last 10 years)
Using expert consultation or reference list scanning to supplement

R Discovery is well-suited to narrative review workflows. Its recommendation engine, trained on your reading history, helps surface thematically related papers, and its collections feature lets you organise papers by theme before synthesis begins.

Systematic Review or Meta-analysis

This is the most rigorous form of literature search. It must be:

Pre-registered (on PROSPERO, for health-related reviews)
Reproducible: another researcher running the same search should get the same results
Comprehensive: missing key studies threatens the validity of pooled estimates
Documented in full PRISMA format with a flow diagram

The search must cover multiple databases (typically a minimum of PubMed, Embase, and the Cochrane Library for biomedical topics), plus trial registries such as ClinicalTrials.gov and the WHO International Clinical Trials Registry Platform. Hand-searching key journals and contacting authors for unpublished data may also be required.

R Discovery plays a supporting rather than primary role here: it is ideal for keeping track of ongoing literature during a long review, flagging new publications that may need to be included in an updated search, and reading and annotating full texts during screening. Its ChatPDF feature allows you to quickly determine whether the study in question included the variables/outcomes you are interested in, before you dive into reading the full paper.

No literature search can be outsourced to AI completely. This applies most strongly to systematic reviews and meta-analyses: PRISMA reporting requires the researcher(s) to account for every step of the search, so they must remain in complete charge of the process throughout.

Step 1: Define Your Research Question

Every literature search begins with a clearly formulated question. Structured frameworks help translate a broad topic into searchable concepts.

PICO (most common in clinical biomedicine):

Element	Meaning	Example
P	Population/Problem	Adults with type 2 diabetes
I	Intervention	GLP-1 receptor agonists
C	Comparator	Metformin
O	Outcome	Glycaemic control, weight loss

Other frameworks include:

SPIDER (qualitative research): Sample, Phenomenon of Interest, Design, Evaluation, Research type
PEO (qualitative/social): Population, Exposure, Outcome
ECLIPSE (health policy): Expectation, Client group, Location, Impact, Professionals, SErvice

Step 2: Identify Search Terms

From your PICO (or equivalent framework), extract keywords for each concept. Then expand each keyword with:

Synonyms: “myocardial infarction,” “heart attack,” “MI,” “AMI”
British/American spelling variants: “haematology” vs “hematology”
Acronyms and abbreviations: “T2DM,” “DM2,” “non-insulin-dependent diabetes”
Broader and narrower terms: from controlled vocabulary (see below)

Controlled Vocabulary vs Free Text

Type	Examples	When to Use
Controlled vocabulary (MeSH, Emtree)	“Neoplasms,” “Antineoplastic Agents”	Precise retrieval in indexed databases (PubMed, Embase)
Free text / keywords	“cancer,” “tumour,” “anti-cancer drug”	Catches new terms, preprints, grey literature
Combined	Both, joined with OR	Best practice: maximises sensitivity

In PubMed, MeSH (Medical Subject Headings) terms are assigned by indexers to each article. Searching “Heart Failure”[MeSH] retrieves all papers tagged with that heading, regardless of what word the authors used. Combine this with free-text keywords in the title and abstract fields using the [tiab] tag for maximum coverage.
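The combination described above can be sketched in a few lines of Python. This is a minimal illustration, not a PubMed tool: the function name concept_block is hypothetical, and "cardiac failure" is an illustrative synonym added for the example. It simply joins MeSH headings and title/abstract keywords for one concept with OR.

```python
def concept_block(mesh_terms, free_text_terms):
    """OR together MeSH headings and title/abstract keywords for ONE concept."""
    parts = [f'"{term}"[MeSH]' for term in mesh_terms]
    parts += [f'"{term}"[tiab]' for term in free_text_terms]
    return "(" + " OR ".join(parts) + ")"


# Illustrative terms only; choose your own from the MeSH browser and your reading.
heart_failure = concept_block(
    mesh_terms=["Heart Failure"],
    free_text_terms=["heart failure", "cardiac failure"],
)
print(heart_failure)
```

The printed string can be pasted into the PubMed search box. Keeping the terms in lists rather than in one long string makes it easier to add a synonym later and to record exactly what was searched.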

R Discovery simplifies this step for less experienced researchers by allowing natural-language topic input and automatically mapping it to relevant papers, effectively handling some of the synonym expansion behind the scenes.

Step 3: Choose Your Databases

Database	Coverage	Best For
PubMed/MEDLINE	Biomedical, clinical, life sciences	First stop for all biomedical searches
Embase	Pharmacological, drug literature, European journals	Drug trials, adverse effects, European research
Cochrane Library	Systematic reviews, RCTs	Evidence-based clinical questions
Scopus	Multidisciplinary	Citation analysis, broad coverage
Web of Science	Multidisciplinary	Citation tracking, impact metrics
CINAHL	Nursing and allied health	Nursing, physiotherapy, nutrition
PsycINFO	Psychology, psychiatry	Mental health, behavioural science
ClinicalTrials.gov	Registered trials	Ongoing and unpublished trials (systematic reviews)
Google Scholar	Very broad, including grey literature	Supplementary; not for primary systematic searches

R Discovery aggregates content from across many of these sources and adds AI-powered curation on top. For researchers at institutions with limited database subscriptions, R Discovery’s ability to surface open-access papers and indicate full-text availability is particularly valuable.

Step 4: Build and Run Your Search Strategy

A search strategy combines your terms using Boolean operators:

OR: broadens the search; use within a concept (diabetes OR hyperglycaemia OR “high blood sugar”)
AND: narrows the search; use between concepts (diabetes AND metformin AND “cardiovascular outcomes”)
NOT: excludes terms (use sparingly; can inadvertently exclude relevant papers)
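As a rough sketch of how the three operators fit together, the snippet below ORs the synonyms within each concept, ANDs the concepts together, and appends any NOT exclusions. The helper names quote and build_query are hypothetical, and the output is a generic Boolean string; individual databases may need their own field tags and syntax.

```python
def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(concepts, exclude=()):
    """OR synonyms within each concept, AND the concepts, then NOT the exclusions."""
    blocks = ["(" + " OR ".join(quote(t) for t in terms) + ")" for terms in concepts]
    query = " AND ".join(blocks)
    for term in exclude:  # use sparingly: NOT can drop relevant papers
        query += f" NOT {quote(term)}"
    return query


query = build_query([
    ["diabetes", "hyperglycaemia", "high blood sugar"],  # concept 1
    ["metformin"],                                       # concept 2
    ["cardiovascular outcomes"],                         # concept 3
])
print(query)
```

Each inner list is one concept, which keeps the "OR within, AND between" rule visible in the structure of the data rather than buried in a long string.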

Other Search Techniques

Truncation: diabet* retrieves diabetes, diabetic, diabetics, diabetologist
Wildcards: wom?n retrieves woman and women (wildcard symbols vary by database, so check each database’s help pages)
Phrase searching: “insulin resistance” in quotation marks retrieves the exact phrase
Field tags: limit to title/abstract, MeSH term, author, journal, publication year
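To see what truncation and wildcard patterns will and will not match before you run them, you can mimic them with regular expressions. The sketch below is an approximation under two assumptions: * stands for zero or more word characters and ? for exactly one. Real databases differ in the symbols they accept and in how they interpret them, and the function names here are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate * (zero or more characters) and ? (exactly one) into a regex."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(r"\w*")
        elif char == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """Return True if the whole word is matched by the search pattern."""
    return pattern_to_regex(pattern).fullmatch(word) is not None


for word in ["diabetes", "diabetic", "diabetologist", "diabolic"]:
    print(word, matches("diabet*", word))
for word in ["woman", "women", "womn"]:
    print(word, matches("wom?n", word))
```

Running a pattern against a short list of words you expect to find (and a few you do not) is a quick way to catch truncation that is too aggressive.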

Example Search Block (PubMed)

A search for GLP-1 receptor agonists and cardiovascular outcomes in type 2 diabetes might look like:

("Glucagon-Like Peptide-1 Receptor"[MeSH] OR "GLP-1 receptor agonist*"[tiab] OR semaglutide[tiab] OR liraglutide[tiab] OR dulaglutide[tiab])
AND
("Diabetes Mellitus, Type 2"[MeSH] OR "type 2 diabetes"[tiab] OR "T2DM"[tiab])
AND
("Cardiovascular Diseases"[MeSH] OR "cardiovascular outcome*"[tiab] OR "MACE"[tiab] OR "major adverse cardiac event*"[tiab])

This kind of multi-concept Boolean block is the backbone of a rigorous search. For systematic reviews, it is typically peer-reviewed by a medical librarian using the PRESS checklist.
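If you want to re-run a PubMed search in exactly the same form later, one option is to send the search string to NCBI's E-utilities ESearch endpoint rather than retyping it in the browser. The sketch below only builds the request URL and does not contact the server. The function name esearch_url and the example query are illustrative, and you should check NCBI's current usage guidelines (rate limits, API keys) before automating requests.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of the NCBI E-utilities.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(query, retmax=100):
    """Build an ESearch URL that asks PubMed for up to `retmax` matching record IDs."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Illustrative query; in practice, pass your full multi-concept search string.
example_url = esearch_url('"type 2 diabetes"[tiab] AND semaglutide[tiab]', retmax=20)
print(example_url)
```

Saving the generated URL alongside the date of the search gives you a compact record of what was run.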

What is the PRESS checklist?

PRESS stands for Peer Review of Electronic Search Strategies. It is a structured tool used to peer review electronic literature search strategies, particularly for systematic reviews and other evidence syntheses.

The PRESS 2015 checklist covers six elements:

translation of the research question;
Boolean and proximity operators;
subject headings;
text word searching;
spelling, syntax, and line numbers; and
limits and filters.

For most original articles and narrative reviews, PRESS is not required. It is considered best practice (and is increasingly required by journals) for systematic reviews and health technology assessments. Peer review of the search strategy is recommended at the protocol phase, before searches are conducted and study selection begins.

Step 5: Apply Filters and Limits

Filters should be applied after running the initial search, not before: applying them too early can introduce bias.

Commonly used filters:

Date range: appropriate for rapidly changing fields (e.g., COVID-19 therapeutics) but not for conditions with longstanding evidence bases
Language: English-only is common but introduces language bias in systematic reviews
Study design: clinical trials, RCTs, meta-analyses (use with caution; not all study types are consistently indexed)
Species: human vs. animal studies (important in basic science searches)
Age group: paediatric, adult, elderly
Publication type: exclude editorials, letters, conference abstracts (for some purposes)

For systematic reviews, language and date restrictions should be explicitly justified in the methods section.
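One way to keep filtering transparent is to apply each filter to the exported results of the unfiltered search and log how many records it removes. The sketch below assumes the records have been exported as a list of dictionaries; the field names, the example records, and the helper apply_filter are all hypothetical.

```python
def apply_filter(records, keep, label, log):
    """Keep records that pass `keep`, and log how many the filter removed."""
    kept = [r for r in records if keep(r)]
    log.append({"filter": label, "before": len(records), "after": len(kept)})
    return kept


# Made-up records standing in for an exported, unfiltered result set.
records = [
    {"title": "Trial A", "year": 2021, "language": "eng"},
    {"title": "Trial B", "year": 2009, "language": "eng"},
    {"title": "Essai C", "year": 2022, "language": "fre"},
]

log = []
recent = apply_filter(records, lambda r: r["year"] >= 2015, "year >= 2015", log)
english = apply_filter(recent, lambda r: r["language"] == "eng", "English only", log)

for entry in log:
    print(entry)
```

The log shows the cost of each restriction in records lost, which is the information you need when justifying a language or date limit in a methods section.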

Step 6: Manage Your Results

Once you have your results, you need to:

Remove duplicates: the same paper often appears across multiple databases
Screen titles and abstracts: based on pre-defined inclusion and exclusion criteria
Retrieve and screen full texts: for papers that pass abstract screening
Track decisions: especially for systematic reviews (required for PRISMA flow)
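Reference managers handle duplicate removal for you, but the underlying logic is simple enough to sketch. The example below assumes each record is a dictionary with title, doi, year, and source fields (hypothetical names). It treats two records as duplicates if their DOIs match or, when no DOI is present, if their normalised titles and years match. The records and DOIs are placeholders, not real papers.

```python
import re


def normalise_title(title):
    """Lower-case a title and collapse punctuation and spacing differences."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def record_key(record):
    """Prefer the DOI as the identity of a record; fall back to title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalise_title(record["title"]), record.get("year"))


def deduplicate(records):
    """Return the records with later duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


records = [
    {"title": "Example Trial One", "doi": "10.1000/example.1", "year": 2016, "source": "PubMed"},
    {"title": "Example trial one.", "doi": "10.1000/EXAMPLE.1", "year": 2016, "source": "Embase"},
    {"title": "Example cohort study: a placeholder title", "doi": None, "year": 2019, "source": "Scopus"},
    {"title": "Example Cohort Study - A Placeholder Title.", "doi": "", "year": 2019, "source": "CINAHL"},
]
unique = deduplicate(records)
print(len(records), "records,", len(unique), "after deduplication")
```

A known limitation of this simple approach: a paper exported with a DOI from one database and without one from another will not be matched, so a manual check of near-duplicates is still worthwhile.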

Tools for Reference Management and Screening

Tool	Best Use
Zotero	Free reference manager; browser plugin; good for small-medium projects
EndNote	Institutional standard; powerful deduplication; good for large systematic reviews
Mendeley	Reference manager with PDF annotation
Rayyan	Free, purpose-built for systematic review screening
Covidence	Gold standard for systematic review management; subscription required
R Discovery	AI-powered reading feed; collections; full-text PDF access; annotation; ideal for ongoing monitoring

R Discovery deserves particular mention in the management phase. Its collections feature allows you to create named folders (e.g., “Included: full text reviewed,” “Excluded: wrong population”) and move papers between them. The built-in PDF reader with highlighting and note-taking means you can annotate directly within the platform without switching tools. For researchers who need to stay current with a field over months or years, R Discovery’s daily personalised feed ensures that newly published papers matching your interests are flagged automatically.

Step 7: Check for Missing Literature

Even after a thorough database search, important papers can be missed. Supplement your search with:

Reference list scanning (pearl growing): check the reference lists of key included papers
Citation chasing (forward searching): find papers that have cited your key papers (use Scopus, Web of Science, or Google Scholar)
Journal hand-searching: manually browse issues of the most relevant journals
Grey literature searching: conference proceedings, preprint servers (bioRxiv, medRxiv), WHO, FDA, NICE documents
Expert consultation: contact field experts to ask if key papers have been missed
Contacting authors: for unpublished data in systematic reviews

R Discovery supports citation chasing through its “related papers” and “citing papers” features, which surface both older foundational papers and newer papers that have built on a work of interest, without requiring a separate search in Scopus or Web of Science. It also covers a considerable amount of grey literature, with more than 5 million preprints and 7.5 million patents in its database.

Step 8: Document and Report Your Search

For All Research Types

At minimum, record:

The databases searched
The date of the search
The full search strategy (all terms and Boolean logic) for at least the primary database
The number of results from each database
The number of records after deduplication
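A spreadsheet works perfectly well for this, but if you prefer a structured file, the minimum record above could look something like the following sketch. The class SearchRecord, the function search_log, the dates, and the counts are all placeholders chosen for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SearchRecord:
    database: str
    search_date: str  # ISO format, e.g. "2026-05-14"
    strategy: str     # full search string exactly as run
    results: int      # number of records returned


def search_log(records, after_deduplication):
    """Serialise the minimum search documentation as JSON."""
    return json.dumps(
        {
            "searches": [asdict(r) for r in records],
            "total_identified": sum(r.results for r in records),
            "after_deduplication": after_deduplication,
        },
        indent=2,
    )


# Placeholder values for illustration only.
log_json = search_log(
    [
        SearchRecord("PubMed", "2026-05-14", '"Heart Failure"[MeSH] OR "heart failure"[tiab]', 420),
        SearchRecord("Embase", "2026-05-14", "'heart failure'/exp OR 'heart failure':ti,ab", 510),
    ],
    after_deduplication=700,
)
print(log_json)
```

Whatever format you choose, the point is that the file is written at the time of the search, not reconstructed from memory months later.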

For Systematic Reviews: PRISMA Flow Diagram

The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram is mandatory. It documents:

Records identified from database searching and other sources
Records after duplicate removal
Records screened and excluded at title/abstract stage
Full texts assessed and excluded (with reasons)
Studies included in final synthesis
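The numbers in a flow diagram must add up from one stage to the next, and a short script can check the arithmetic before you draw the figure. The sketch below is a simplified, hypothetical helper (prisma_counts) with made-up counts; it is not an official PRISMA tool and it does not cover every box in the PRISMA template.

```python
def prisma_counts(identified, duplicates, excluded_title_abstract, excluded_full_text):
    """Derive stage-by-stage counts for a simplified flow diagram.

    identified: dict mapping source name -> number of records found
    excluded_full_text: dict mapping exclusion reason -> number of full texts
    """
    total = sum(identified.values())
    screened = total - duplicates
    full_text = screened - excluded_title_abstract
    included = full_text - sum(excluded_full_text.values())
    if min(screened, full_text, included) < 0:
        raise ValueError("Counts are inconsistent: more exclusions than records")
    return {
        "identified": total,
        "after_duplicates_removed": screened,
        "full_texts_assessed": full_text,
        "included": included,
    }


# Made-up counts for illustration only.
counts = prisma_counts(
    identified={"PubMed": 420, "Embase": 510, "Cochrane Library": 95, "Other sources": 12},
    duplicates=237,
    excluded_title_abstract=720,
    excluded_full_text={"Wrong population": 30, "Wrong outcome": 25, "No comparator": 10},
)
print(counts)
```

If the function raises an error or the included count does not match the number of studies in your synthesis, a screening decision has gone unrecorded somewhere.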

Common Mistakes to Avoid in Your Literature Search

Searching only one database: even PubMed alone misses a significant proportion of relevant literature
Using only free-text keywords without MeSH terms: reduces sensitivity in indexed databases
Applying date filters too early: can exclude landmark papers
Not documenting the search: makes it impossible to update or reproduce
Confusing sensitivity and specificity: a very specific search finds fewer, more precise results; a very sensitive search casts a wider net but retrieves more noise. Systematic reviews prioritise sensitivity; targeted searches for original articles can be more specific
Searching once and stopping: literature searches for long projects (thesis, systematic review) must be updated before submission
Over-relying on a single tool: even powerful platforms like R Discovery are best used alongside formal database searches, not as a replacement for them in rigorous review contexts
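The sensitivity trade-off in the list above can be made concrete with two ratios. In search methodology, how "specific" a search is tends to be reported as precision: the share of retrieved records that are relevant. The sketch below assumes you have a known set of relevant papers to test against (for example, key papers you already hold); the function name and the record IDs are made up.

```python
def search_performance(retrieved, relevant):
    """Return (sensitivity, precision) of a search against a known relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    sensitivity = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return sensitivity, precision


relevant = set(range(1, 11))             # 10 known relevant papers (IDs 1-10)
broad_search = set(range(1, 201))        # 200 records, includes all 10
narrow_search = {1, 2, 3, 4, 5, 6, 300, 301}  # 8 records, includes 6 of the 10

print("broad: ", search_performance(broad_search, relevant))
print("narrow:", search_performance(narrow_search, relevant))
```

In this toy example the broad search finds every relevant paper but 95% of what it returns is noise, while the narrow search returns mostly relevant records but misses four of the ten. A systematic review would accept the first profile; a targeted search for an original article might prefer the second.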

Summary: Matching Search Rigour to Research Purpose

Feature	Doctoral Thesis	Original Article	Narrative Review	Systematic Review
Minimum databases	4–6	2–3	4–5	5+ including trial registries
MeSH/controlled vocabulary	Recommended	Recommended	Optional	Mandatory
Grey literature	Yes	Optional	Optional	Yes
Date restrictions	Justified	Common	Common	Justified only
PRISMA flow	Optional	Not required	Not required	Mandatory
Pre-registration	Not required	Not required	Optional	Strongly recommended
Search updates	Yes (ongoing)	At submission	At submission	Yes, before final analysis
R Discovery role	Daily feed, collections, reading	Recommendations, reading	Recommendations, themes	Monitoring updates, reading

A well-conducted literature search is an ongoing process of systematic discovery, careful documentation, and continuous updating. Whether you are a doctoral student mapping a field for the first time or a seasoned researcher conducting a meta-analysis, the principles remain the same: be systematic, be explicit, and let the evidence tell you what is there, not what you hoped to find.

Frequently Asked Questions

Can I use ChatGPT to find papers for my literature search?

Not as your primary search tool, and certainly not without verifying every single reference it generates. General-purpose large language models like ChatGPT do not search live databases; they generate text based on patterns in their training data. This means they can produce citations that look entirely plausible, with correct journal format, realistic author names, and believable titles, even though the papers do not actually exist. A Deakin University study found that when ChatGPT was used to write mental health literature reviews, roughly one in five citations were completely fabricated, and more than half of all citations were either fake or contained errors. A psychiatry-focused test found that of 35 references generated by ChatGPT, only two were real, 12 were similar to actual manuscripts with incorrect details, and the remaining 21 were plausible-sounding composites of multiple real papers.

AI tools purpose-built for academic literature—such as R Discovery—are a different matter. These connect to real paper databases and surface verified records. R Discovery in particular is designed for this workflow, with AI recommendations grounded in actual indexed literature rather than generated text. Use it; do not use general chatbots as a substitute for database searching.

How do I know when I have searched enough and when can I stop?

If you keep seeing the same references appear repeatedly across different searches and databases, you have likely reached critical mass and can stop the retrospective search because you have found the existing relevant articles on your topic.

More formally, you are looking for a state sometimes called search saturation. This is the point at which new searches are no longer returning papers you haven’t already seen. Practical indicators include:

Running a new database search or variant search string yields fewer than 5% new unique results
Reference list scanning of your included papers keeps pointing back to papers already in your pile
Citation chasing forward from key papers surfaces no new relevant works
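The first indicator is easy to compute if you keep the record identifiers from each search. The sketch below makes one assumption you should state in your own notes: "new unique results" is measured as the share of the latest search's records that you had not already seen. The function names and identifiers are hypothetical.

```python
def new_result_rate(seen_ids, new_search_ids):
    """Share of the new search's records that were not seen in earlier searches."""
    new_search_ids = set(new_search_ids)
    if not new_search_ids:
        return 0.0
    unseen = new_search_ids - set(seen_ids)
    return len(unseen) / len(new_search_ids)


def is_saturated(seen_ids, new_search_ids, threshold=0.05):
    """True if the new search adds fewer than `threshold` (default 5%) new records."""
    return new_result_rate(seen_ids, new_search_ids) < threshold


# Made-up identifiers: 500 records already collected from earlier searches.
seen = {f"id{i}" for i in range(500)}
variant_search = [f"id{i}" for i in range(403, 503)]     # 100 records, 3 unseen
new_database = [f"id{i}" for i in range(480, 580)]       # 100 records, 80 unseen

print(new_result_rate(seen, variant_search), is_saturated(seen, variant_search))
print(new_result_rate(seen, new_database), is_saturated(seen, new_database))
```

A low rate from a variant search string suggests your terms are well covered; a high rate from a new database suggests that database indexes literature the others miss and should stay in your search.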

For systematic reviews, the stopping point is defined by protocol and date: you run the search, record it, and then update it once before final submission. For doctoral work, the search is never fully “done”; a final update run within 3–6 months of submission is standard practice. R Discovery’s daily feed is particularly useful here because it passively monitors the literature and flags new publications so you don’t have to keep re-running manual searches from scratch.

What do I do when I can’t access the full text of a paper I need?

Paywalls are a genuine barrier, but there are several legitimate routes before giving up:

Check for an open-access version first. Many authors self-archive their papers in institutional repositories or on ResearchGate. The browser extension Unpaywall automatically detects legal open-access copies of paywalled articles as you browse. Open Access Button is another tool that searches for freely available versions, or sends a request directly to the author if none is found.
Check preprint servers. bioRxiv and medRxiv host preprint versions of many biomedical papers. A preprint is usually the version posted before peer review, so it may differ from the final published article.
Email the corresponding author. Authors are almost always willing to share a PDF of their own work on request. This is completely legal and usually fast.
Use interlibrary loan (ILL). If you have institutional affiliation, your library can obtain papers from other institutions, usually within a few days.
R Discovery surfaces open-access versions of papers and indicates full-text availability directly within its interface, reducing time spent hunting across multiple platforms.

How do I know if a paper I want to cite has been retracted or flagged for problems?

Practical steps to keep away from spurious or unethical research are:

Check the Retraction Watch Database: a searchable database of retracted papers, now integrated into reference managers including EndNote and Zotero, which can flag retracted papers automatically in your library.
Check PubMed directly: retracted papers in PubMed are labelled “Retracted Publication” and link to the retraction notice, which is indexed separately as “Retraction of Publication.”
Use PubPeer: a post-publication peer review platform where concerns about papers are flagged and discussed, often before formal retraction. RedacTek is another tool that flags articles with high self-citation rates and other markers of potentially problematic papers, available as a Chrome extension for use during active searching.
Be alert to red flags in the paper itself: unusual author combinations, figures that appear elsewhere online (which you can check with a reverse image search such as Google Lens), and papers that appear highly polished without conveying actual data or insights. (See also: 6 ways to spot an AI-generated medical paper.)
Look up unfamiliar journals: signs of a predatory or hijacked journal that does not properly peer review papers include an extremely broad scope (e.g., Journal of Medicine, Biology, Engineering, and Technology), rapid publishing speeds, articles that look AI-generated, and articles that don’t meet basic scientific criteria.

My search is returning thousands of results. How do I manage this without reading everything?

A very large result set is usually a sign that your search strategy is too broad, not that you need to read everything. Narrow first, then read:

Tighten the search by adding an additional AND concept, using more specific MeSH terms, or applying proximity operators (e.g., requiring that two terms appear within five words of each other rather than anywhere in the abstract)
Screen titles first: a title pass through several thousand records takes far less time than it sounds; most can be excluded in 2–3 seconds
Screen abstracts second: only for papers that passed the title screen (if you are doubtful after reading the abstract, use the ChatPDF feature in R Discovery to check if the paper really covers what you’re interested in)
Read full texts last: only for papers that passed the abstract screen

For systematic reviews, this two-stage screening process is mandatory and documented in the PRISMA flow. For other research types, even an informal version of this funnel prevents you from drowning in irrelevant material.

This article was originally published on October 5, 2023, and updated on May 14, 2026.

Author

Debraj Manna

Crafting engaging content about scientific research besides performing experiments on the bench
