Research Integrity | 3 min read

Understanding Citation Ethics: Why You Should Never Rely Solely on AI for Literature Discovery

By Richard Murphy | Updated on: May 8, 2026


Recent evaluations of generative AI show a worrying pattern: many AI systems produce plausible-looking but incorrect or entirely fabricated bibliographic references. In one multi-model study of academic bibliographic retrieval, only 26.5% of generated references were entirely correct, while nearly 40% were erroneous or fabricated.

For researchers, students, and institutional authors, this matters because literature discovery and accurate citation underpin reproducibility, peer review, and scholarly trust. This article explains what goes wrong when you rely solely on AI for literature discovery, why those failures occur, and, most importantly, the practical workflows and checks you can use to preserve research integrity.

Benefits of using AI in literature discovery

AI tools are fast, tireless brainstorming partners: they can suggest keywords, synonyms, and broader search terms, help you frame database queries, and surface angles you might not have considered. These strengths make AI a useful assistant but not a substitute for rigorous literature discovery.

Risks of relying solely on AI

How AI hallucinations happen

AI language models are pattern predictors: they generate plausible text given a prompt, but they do not “retrieve” verified bibliographic records in the way a database does. When asked for citations, models may invent titles, DOIs, or journal names that fit learned patterns. Retrieval-augmented approaches (RAG) can reduce this risk but do not eliminate it.
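As one concrete illustration of why verification matters: a fabricated DOI sometimes fails even a basic syntax check. The sketch below is a hypothetical helper, not a substitute for actually resolving the DOI at doi.org or looking it up in CrossRef; it only flags strings that do not match the standard `10.<registrant>/<suffix>` DOI shape.

```python
import re

# DOIs begin with the directory indicator "10.", followed by a
# registrant code (commonly 4-9 digits), a slash, and a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(candidate: str) -> bool:
    """Syntactic check only: a well-formed string may still not resolve,
    so always confirm existence via doi.org or a CrossRef lookup."""
    return bool(DOI_PATTERN.fullmatch(candidate.strip()))
```

A string such as `10.1038/nature12373` passes this check, while obviously malformed output like `DOI: forthcoming` does not; passing the check proves nothing about existence, so treat it as a cheap first filter only.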

Practical, step-by-step workflow

  1. Use AI for brainstorming—not for sourcing
    • Ask AI to suggest keywords, synonyms, and broader search terms to inform database queries. Verify every specific reference yourself.
  2. Search primary bibliographic databases first
    • Perform structured searches in discipline-appropriate databases (PubMed/Medline, Scopus, Web of Science, IEEE Xplore, Google Scholar) and record your search strings and date ranges. Avoid treating AI output as a primary search result.
  3. Treat AI-recommended references as leads, not authorities
    • If AI provides a citation (title, DOI, authors), independently verify the DOI, publisher, and full text via the relevant database or the publisher site before citing.
  4. Use a verification checklist for every new reference:
    • Confirm DOI resolves to the correct article.
    • Verify author names, journal, volume, pages, and year in CrossRef/Google Scholar.
    • Access the abstract or full text to ensure the article supports your claim.
    • Flag any mismatch and remove fabricated or unverifiable items.
  5. Combine AI with structured, reproducible review methods
    • For systematic reviews, document your protocol and follow PRISMA guidelines for search, selection, and reporting. This preserves transparency and mitigates propagation of AI errors.
  6. Use retrieval-augmented tools cautiously.
    • Tools built to combine LLMs with database retrieval can reduce hallucinations but are not foolproof; continue human validation.
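To make step 4's checklist concrete, here is a minimal sketch (the function and field names are hypothetical, not from any particular tool) that compares an AI-suggested citation against a record you retrieved yourself from CrossRef or the publisher site, and reports any fields that disagree:

```python
from typing import Optional

def find_mismatches(suggested: dict, verified: Optional[dict]) -> list:
    """Compare an AI-suggested citation against an independently
    retrieved record; return the names of fields that disagree.

    `verified` is None when the DOI did not resolve at all, which is
    the strongest signal of a fabricated reference.
    """
    if verified is None:
        return ["doi-does-not-resolve"]
    problems = []
    for field in ("title", "authors", "journal", "year"):
        if suggested.get(field) != verified.get(field):
            problems.append(field)
    return problems

# An AI-suggested reference whose journal and year turn out to be wrong:
suggested = {"title": "X", "authors": ["A. Author"],
             "journal": "Journal of Y", "year": 2021}
verified = {"title": "X", "authors": ["A. Author"],
            "journal": "Journal of Z", "year": 2019}
print(find_mismatches(suggested, verified))  # ['journal', 'year']
```

Any non-empty result means the reference should be flagged and re-verified or removed, per the checklist above.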

Common mistakes to avoid

  • Treating AI output as a primary search result rather than a lead to verify.
  • Citing an AI-suggested reference without confirming that its DOI, authors, journal, and year check out in a primary database.
  • Citing a paper from its title alone, without reading the abstract or full text to confirm it supports your claim.
  • Failing to record search strings and date ranges, which undermines reproducibility.
  • Assuming retrieval-augmented (RAG) tools eliminate hallucinations; they only reduce them.

Next steps

As you conduct your next literature search, be sure to implement a verification checklist. If you’re preparing a systematic review, remember to register your protocol (e.g., PROSPERO, where applicable), follow PRISMA guidelines, and collaborate with a librarian or information specialist. If you need editorial or bibliographic support, check out our Literature Search and Citation Service and our AI assistant on literature discovery.
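For the documentation habit recommended above, even a tiny structured log helps. This sketch (the column names are illustrative, not a standard) records one dated row per database query so the search strategy can be reported and re-run later:

```python
import csv
import datetime
import io

def log_search(writer, database: str, query: str, hits: int) -> None:
    """Append one dated row per database search, so the strategy can be
    reported (e.g., alongside a PRISMA flow diagram) and reproduced."""
    writer.writerow([datetime.date.today().isoformat(), database, query, hits])

# In practice you would write to a file kept under version control;
# an in-memory buffer is used here so the example is self-contained.
buf = io.StringIO()
w = csv.writer(buf)
w.writerow(["date", "database", "query", "hits"])
log_search(w, "PubMed", '"citation ethics" AND "artificial intelligence"', 42)
```

Keeping the log append-only, with one row per search, makes it trivial to state exactly what was searched, where, and when.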


    Enago’s manuscript services help researchers ensure clarity, proper citation formatting, and adherence to reporting guidelines, including those for systematic reviews. Our expert editors can review your bibliography for consistency, check citation formats, and provide guidance on best practices for reporting, ensuring your submission meets journal standards.

    Frequently Asked Questions

    What are AI hallucinations, and why do they matter in academic research?

    AI hallucinations occur when generative AI systems produce plausible-sounding but fabricated or incorrect information, including fake citations, non-existent DOIs, and invented journal articles. In academic research, these hallucinations undermine reproducibility and scholarly trust. Multi-model studies show that nearly 40% of AI-generated references contain errors or complete fabrications, with only 26.5% being entirely correct, making verification essential for maintaining research integrity.

    How accurate are AI-generated citations?

    AI citation accuracy varies significantly by topic and recency. A comprehensive multi-model study found only 26.5% of generated references were entirely correct, while approximately 40% were erroneous or fabricated. Domain-specific evaluations reveal further concerns: a nephrology-focused study discovered only 62% of ChatGPT's suggested references actually existed, with 31% being fabricated or incomplete. Hallucination rates increase substantially for newer or niche topics where training data is limited.

    How should researchers verify AI-suggested references?

    Researchers should implement a systematic verification checklist for every AI-suggested reference: confirm the DOI resolves to the correct article through CrossRef or publisher websites, verify all metadata including author names, journal title, volume, pages, and publication year in primary databases like PubMed or Web of Science, access and review the abstract or full text to ensure content supports your claim, and remove any unverifiable items immediately from your bibliography.

    Why do AI models fabricate citations?

    AI language models are pattern predictors, not bibliographic databases. They generate text that appears plausible based on learned patterns from training data, but they don't retrieve verified records. When prompted for citations, models may invent titles, DOIs, authors, or journal names that fit statistically likely patterns without confirming actual existence. Retrieval-augmented generation (RAG) approaches can reduce this risk by connecting models to real databases, but they don't eliminate hallucination entirely.

    What is the safest way to use AI in literature discovery?

    The safest approach uses AI for brainstorming keywords and search terms only, not for sourcing citations. Conduct structured searches in discipline-specific databases like PubMed, Scopus, Web of Science, or IEEE Xplore first, documenting search strings and date ranges. Treat any AI-recommended references as unverified leads requiring independent confirmation through primary databases. For systematic reviews, register your protocol with PROSPERO, follow PRISMA reporting guidelines, and collaborate with information specialists to ensure transparency and reproducibility.

    Can AI replace systematic review methodology?

    AI can serve as a supplementary brainstorming tool for systematic reviews but should never replace structured, reproducible methodology. Researchers must follow established protocols like PRISMA guidelines, register review protocols in appropriate registries such as PROSPERO, conduct searches in primary bibliographic databases, and maintain detailed documentation of search strategies. AI-assisted screening may reduce workload, but human validation of every citation, inclusion decision, and data extraction step remains essential for maintaining systematic review quality and research integrity.

