the data that was buried,
the papers that told only half the truth?
But when Cochrane researchers tried to verify the drug's benefits, they found that 60% of trial data had never been published.
After a 5-year battle, they obtained the hidden data, and the conclusion changed.
$9 billion spent on evidence that was never fully disclosed.
The Purpose of Synthesis
When Cochrane reviewers requested the full trial data to verify the drug's claimed benefits, Roche refused for 5 years, citing "confidentiality." The company had conducted 10 treatment trials, but only 2 were fully published.
After relentless pressure, Clinical Study Reports were finally released in 2014. The picture changed dramatically: Tamiflu shortened symptoms by less than a day and showed no evidence of preventing hospitalizations or serious complications.
The company knew, and the regulators knew,
but the published papers did not tell—
and billions were spent on a half-truth."
This is why we write meta-analyses—to find the whole truth.
To find it, you must enter into a covenant with your readers.
That covenant has a name:
PRISMA.
• Title: identify the report as a systematic review, meta-analysis, or both
• Abstract: structured summary of the entire review
• Introduction: rationale and objectives, framed with PICO
• Methods: protocol, search, selection, data, bias, synthesis
• Results: flow diagram, study characteristics, risk of bias, synthesis results
• Discussion: summary, limitations, interpretation, implications
• Other: registration, funding, conflicts of interest
PRISMA's 27-item checklist changed everything. It required authors to document every step: the full search strategy, selection criteria, extraction methods, and synthesis decisions.
Today, hundreds of journals endorse PRISMA. What was once exceptional transparency became the expected standard.
I will show you everything—
how I searched, what I found, what I excluded, why.
So you may judge my work, and trust—or question—my conclusions."
who changed the outcome after seeing the data,
who moved the goalposts until the results looked right?
Without a protocol, reviewers could:
• Change inclusion criteria after seeing results
• Switch primary outcomes to show significance
• Add or remove studies to change the conclusion
The protocol is your pre-commitment device: it prevents you from fooling yourself.
Where to Register Your Protocol
Essential Protocol Elements
Lock it in a public registry.
Then follow it—or explain why you deviated.
This is how you prove you did not cheat."
The title must tell the reader:
What you studied, how you studied it, and what kind of study it is.
PRISMA Title Requirements
A weak title has problems: no population specified, no intervention, no outcome, and no indication that it is a systematic review.
A strong title makes population, intervention, outcome, and study type all clear.
Make it complete. Make it honest.
Tell them exactly what they will find within."
If the abstract lies, or omits, or misleads—
most readers will never know.
Boutron and colleagues found that 40% of abstracts contained "spin": reporting that focused on secondary outcomes, subgroups, or within-group changes to make results appear more favorable than they were.
The abstract told a different story than the data.
PRISMA Abstract Checklist
What happens when the abstract tells a different story than the paper itself?
REAL DATA
Pitkin et al. (1999, JAMA) examined structured abstracts in six major journals and found that 18-68% of abstracts contained data inconsistent with the full article. Deficiencies ranged from numerical errors to conclusions not supported by the reported results.
If the primary outcome was null, say so.
The abstract must be a faithful mirror—
not a flattering portrait."
Before you search, before you write—
you must know exactly what you seek.
Structuring the Research Question
PICO:
• P: Adults diagnosed with major depressive disorder
• I: Supervised aerobic exercise (≥3x/week for ≥8 weeks)
• C: Usual care or waitlist control
• O: Change in depression score (HAM-D or BDI)
Now you know exactly what to search for.
Who are the patients? What is the treatment?
What is the comparator? What will you measure?
PICO is the map before the journey."
that searched only one database,
missed half the evidence,
and drew the wrong conclusion?
Where to Search, and What to Document
• Date of search for each database
• Any limits (language, date, publication type)
• Hand-searching (journals, conference proceedings)
• Contact with authors for unpublished data
What if reviewers had stopped at MEDLINE? The answer was alarming: they would have missed 30% of included studies, including some that changed the meta-analysis conclusions entirely.
One striking example: an anti-depressant meta-analysis showed benefit when based on MEDLINE alone, but no benefit when all sources were included. The missing studies were smaller, negative trials indexed in specialty databases like EMBASE and PsycINFO.
Document every database, every date, every term.
The evidence you miss may be the evidence that matters most."
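Documentation is easier when the search itself is scripted. A minimal sketch using the rentrez package to query PubMed, with a query built from the PICO earlier in this section; the terms and the retmax value are illustrative assumptions, not a validated search strategy.

library(rentrez)
# Boolean query combining population, intervention, and study-type terms
query <- paste(
  '("depressive disorder"[MeSH Terms] OR depression[Title/Abstract])',
  'AND (exercise[MeSH Terms] OR "aerobic exercise"[Title/Abstract])',
  'AND randomized controlled trial[Publication Type]'
)
hits <- entrez_search(db = "pubmed", term = query, retmax = 200)
hits$count   # record the hit count and the date of the search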
But choose by what rule?
And who will check your choices?
PRISMA 2020 Flow Diagram
Who Selects? How?
Can a post-hoc subgroup analysis from a single trial reshape an entire field for a decade?
REAL DATA
The Women's Health Initiative (WHI, 2002) found that HRT increased cardiovascular risk overall. But post-hoc subgroup analysis suggested women aged <60 or within 10 years of menopause might benefit, while older women were harmed. This "timing hypothesis" fueled years of debate and further studies.
Every reason must be documented.
Two pairs of eyes are better than one—
for what one misses, the other may catch."
that pooled good studies with bad,
and called the average truth?
Turner and colleagues (2008) compared antidepressant trials as published with the same trials in the FDA database.
In the published literature: 94% of trials were positive.
In the FDA database: only 51% were positive.
The published meta-analyses had pooled selectively reported data. The effect size was inflated by 32%.
Which Tool to Use?
Cochrane Risk of Bias 2.0 (RoB 2) for randomized trials; ROBINS-I for non-randomized studies of interventions.
Pool biased studies, and you get a biased conclusion,
with a narrower confidence interval.
You have made the lie more precise."
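Risk-of-bias judgments are commonly displayed as traffic-light and summary plots. A sketch using the robvis package, assuming a hypothetical data frame rob laid out the way robvis expects for RoB 2 (columns Study, D1 through D5, Overall):

library(robvis)
# Traffic-light plot: one row per study, one column per RoB 2 domain
rob_traffic_light(data = rob, tool = "ROB2")
# Domain-level summary bar chart across all studies
rob_summary(data = rob, tool = "ROB2", weighted = FALSE)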
Extract wrong, and your whole analysis
is built on sand.
Essential Data Items
Handling Missing Data
But later scrutiny revealed complications. Some effect estimates had been extracted from secondary publications rather than primary trial reports. Small differences in how events were counted—extracted from different sources—meaningfully changed the results.
The meta-analysis was influential and largely correct, but the controversy highlighted how small extraction decisions can have billion-dollar consequences. Merck's competing drug gained market share; GSK faced massive litigation.
One digit wrong can change the conclusion.
The extraction form is your ledger—
keep it meticulous, keep it true."
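One way to keep that ledger is a structured table with one row per study and a source trail for every number. A minimal sketch as an R data frame; the field names are illustrative assumptions.

# One row per study; every number should be traceable to its source
extraction <- data.frame(
  study_id    = character(),
  year        = integer(),
  n_tx        = integer(),   n_ctrl      = integer(),
  events_tx   = integer(),   events_ctrl = integer(),
  outcome     = character(), # outcome definition as reported
  source      = character(), # report, page, and table each number came from
  extractor   = character()  # who extracted; independent double extraction
)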
Choose the wrong measure,
and your pooled estimate will be meaningless.
Choosing the Right Effect Size
• Risk ratio (RR) and odds ratio (OR): multiplicative measures for binary outcomes; the OR is the measure available in case-control designs
• Mean difference (MD): continuous outcomes measured in the same units
• Standardized mean difference (SMD): continuous outcomes measured on different scales
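In the metafor package this choice is made explicit through the measure argument. A sketch with hypothetical column names in a data frame trials:

library(metafor)
# Binary outcome: log risk ratios from 2x2 event counts
dat_rr  <- escalc(measure = "RR", ai = ev_tx, bi = noev_tx,
                  ci = ev_ct, di = noev_ct, data = trials)
# Continuous outcomes on different scales: standardized mean difference
dat_smd <- escalc(measure = "SMD", m1i = mean_tx, sd1i = sd_tx, n1i = n_tx,
                  m2i = mean_ct, sd2i = sd_ct, n2i = n_ct, data = trials)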
Can a trial that transforms global practice still have serious limitations?
REAL DATA
The RECOVERY trial (2020) demonstrated that dexamethasone reduced 28-day mortality in hospitalized COVID-19 patients requiring oxygen: RR 0.83, 95% CI 0.75-0.93. Yet the trial was open-label (no blinding), conducted predominantly in UK hospitals, and the control group received usual care (which varied).
Risk ratios for common outcomes, odds ratios for rare.
Standardize when scales differ.
The wrong measure pools apples with oranges."
where studies pointed in opposite directions,
yet the diamond declared a single truth?
Which Model to Use?
Do Not Meta-Analyze If...
What happens when a methodological critique of a Cochrane review escalates into an organizational crisis?
REAL DATA
In 2018, Peter Gøtzsche and colleagues published a critique of the Cochrane HPV vaccine review, arguing it had excluded key trials and used inappropriate inclusion criteria. The Cochrane review had included 26 studies with over 73,000 women. The dispute became a governance crisis, culminating in Gøtzsche's expulsion from Cochrane's board.
A meta-analysis of incompatible studies
is not synthesis—it is confusion.
Know when to say: these cannot be combined."
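When pooling is justified, the model choice is a single argument in metafor. A minimal sketch, assuming dat holds the yi and vi columns produced by escalc():

library(metafor)
# Equal-effects model: assumes one true effect shared by all studies
fe <- rma(yi, vi, data = dat, method = "EE")
# Random-effects model: allows the true effect to vary across studies
re <- rma(yi, vi, data = dat, method = "REML")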
When studies disagree, the disagreement itself is data.
Do not hide it. Explain it.
• Cochran's Q: significance test for the presence of heterogeneity
• I²: percentage of total variation due to heterogeneity rather than chance
• τ²: the between-study variance
• Prediction interval: the range in which the effects of future studies are expected to fall
When I² > 50%, heterogeneity is substantial: investigate before pooling.
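All of these quantities are reported by metafor. A sketch, assuming res is the fitted random-effects model:

summary(res)   # Q test, I^2, and tau^2 alongside the pooled estimate
confint(res)   # confidence intervals for tau^2, I^2, and H^2
predict(res)   # includes the prediction interval for a future study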
What if a meta-analysis of small positive trials is overturned by a single mega-trial?
REAL DATA
By the early 1990s, several small trials suggested intravenous magnesium reduced mortality after acute myocardial infarction. A meta-analysis (Teo et al., 1991) pooled these and found a significant benefit: OR 0.44, 95% CI 0.27-0.71. Then ISIS-4 (1995), a mega-trial with 58,050 patients, found no benefit at all. The small-study effects and heterogeneity had been ignored.
Heterogeneity is not a nuisance. It is a question: why do these studies disagree?
Investigate. Explain. Or acknowledge ignorance."
where negative studies go to die,
leaving only the positive survivors
to tell a distorted story?
Internal company documents showed Merck knew of cardiovascular risks but suppressed unfavorable data and published only favorable analyses.
A meta-analysis using all available data revealed a 2-fold increased risk of heart attack.
Vioxx was withdrawn. It had caused an estimated 88,000-140,000 excess heart attacks.
Assessment Methods
• Search trial registries (ClinicalTrials.gov, WHO ICTRP)
• Contact companies for unpublished data
• Cite registration numbers in your review
• Report which registered trials are missing from your analysis
The file drawer holds the studies that companies hid,
the results that journals rejected.
Your job is to open that drawer—or say you could not."
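metafor provides the standard small-study diagnostics. A sketch, again assuming res is the fitted model, and remembering that these tests have low power when the number of studies is small:

funnel(res)    # funnel plot: effect size against standard error
regtest(res)   # Egger-type regression test for funnel asymmetry
trimfill(res)  # trim-and-fill: re-estimate after imputing "missing" studies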
The forest plot shows the reader everything:
each study, each weight, each confidence interval,
and the final pooled estimate.
Elements of the Forest Plot
What to Include
Then came the APPROVe trial. When its data was added to the forest plot, the picture changed dramatically. APPROVe's large square pulled the pooled diamond definitively toward harm. The visual was unmistakable.
That forest plot ended Vioxx. Merck withdrew the drug voluntarily. The subsequent litigation cost the company $4.85 billion in settlements. Thousands of patients had suffered heart attacks while the earlier, smaller trials showed ambiguous results.
Every study visible. Every weight transparent.
Let the reader see what you saw—
and judge for themselves."
You must also tell the reader:
How confident should they be in this result?
Rating the Evidence
• High: very confident the true effect is close to the estimate
• Moderate: likely close to the estimate
• Low: the true effect may differ substantially
• Very low: uncertain; any estimate is speculative
What happens when a GRADE assessment of "low certainty" collides with a public health emergency?
REAL DATA
The 2023 Cochrane review of physical interventions to reduce respiratory virus spread (Jefferson et al.) found that the evidence for masks in community settings was low certainty per GRADE, with wide confidence intervals. The review was widely reported as proving "masks don't work," though the authors stated the evidence was insufficient to draw firm conclusions in either direction.
The effect estimate is the how much; GRADE certainty is the how sure.
Report both, or the reader cannot judge
how much to trust your conclusion."
Not to spin. Not to overstate.
But to explain what your findings mean—
and what they do not mean.
• Summary of Findings: restate the main results with their certainty rating
• Comparison with Existing Literature: how do your findings relate to prior reviews?
• Strengths and Limitations: of both the review and the included studies
• Implications for Practice: what should clinicians and policymakers do?
• Implications for Research: what studies are still needed?
• What NOT to Do: spin, overstatement, claims the evidence cannot support
What if the most-cited methodology paper ever published warns that most research findings are false?
REAL DATA
John Ioannidis's 2005 paper in PLoS Medicine, "Why Most Published Research Findings Are False," has been cited over 10,000 times. Using mathematical modeling, he argued that the probability a research finding is true depends on study power, bias, and the number of tested relationships. For many research designs, the post-study probability of a true finding can be below 50%.
The discussion is not for advocacy. It is for honest interpretation.
Say what the evidence shows.
Admit what it does not show."
where conflicts of interest were hidden,
where data was fabricated,
and millions of children went unvaccinated?
He did not disclose that he was paid £435,643 by lawyers seeking to sue vaccine manufacturers.
He did not disclose that he had filed a patent for a competing single-dose measles vaccine.
The study was eventually retracted. Wakefield was struck off. But the damage was done: vaccination rates plummeted, and measles outbreaks returned.
What to Declare
The AllTrials campaign gathered over 90,000 individual signatories and 700+ organizations. It demanded that all past and future trials be registered, with full methods and results reported.
The impact was transformative. The EU now requires trial registration and results reporting. The FDA strengthened its own requirements. Journals began demanding prospective registration. What started as advocacy became global policy.
Declare your funding. Declare your conflicts.
The reader has a right to know
who paid for this work—and why."
Have You...
• List of excluded studies with reasons
• Data extraction forms (blank and completed)
• Risk of bias details for each study
• Additional forest plots (subgroups, sensitivity)
• Funnel plot and statistical tests
• GRADE evidence profile tables
How long can a fraudulent paper survive peer review, editorial scrutiny, and public challenge?
REAL DATA
Andrew Wakefield's 1998 Lancet paper linking MMR vaccine to autism took 12 years to be fully retracted (2010). During that time, journalist Brian Deer uncovered financial conflicts, ethical violations, and data manipulation. Multiple large studies (including a Danish cohort of over 650,000 children) found no association, yet the original paper's influence persisted.
You have gathered the evidence. You have weighed it fairly.
You have written it transparently.
Now submit your work—
and let truth be found, and found again."
Key Sources
- Page MJ et al. BMJ. 2021;372:n71. [PRISMA 2020]
- Jefferson T et al. Cochrane 2014;4:CD008965. [Tamiflu]
- Turner EH et al. N Engl J Med. 2008;358:252-260. [Antidepressants]
- Boutron I et al. JAMA. 2010;303:2058-2064. [Spin]
- Topol EJ. N Engl J Med. 2004;351:1707-1709. [Vioxx]
- Deer B. BMJ. 2011;342:c5347. [Wakefield]
- Sterne JAC et al. BMJ. 2019;366:l4898. [RoB 2]
- Higgins JPT et al. Cochrane Handbook. 2023.
- Schünemann HJ et al. GRADE Handbook. 2013.
- Ioannidis JPA. PLoS Med. 2005;2:e124. [Why most research is false]
Register before you search.
Search comprehensively. Select transparently.
Extract carefully. Assess bias honestly.
Pool wisely—or not at all.
Write so that truth may be found,
and found again, by those who follow."
Which software will carry your analysis
from protocol to forest plot?
Choosing Your Tools
Compare tools on: free download, reproducible code, Summary of Findings (SoF) tables, AI-assisted screening.
library(metafor)  # provides escalc(), rma(), and forest()

# Calculate effect sizes: log risk ratios from 2x2 event counts
dat <- escalc(measure="RR", ai=events_tx, bi=noevents_tx,
              ci=events_ctrl, di=noevents_ctrl, data=mydata)

# Random-effects model (REML estimator for the between-study variance)
res <- rma(yi, vi, data=dat, method="REML")

# Forest plot, labelling each study by author and year
forest(res, slab=paste(dat$author, dat$year))
How do you coordinate writing when a systematic review has dozens of authors?
REAL DATA
The SPRINT trial (2015) listed over 100 authors from dozens of institutions. The writing group included a steering committee, site investigators, and statisticians. Coordinating contributions, managing version control, and determining authorship credit required formal structures. ICMJE criteria define authorship as requiring substantial contribution, drafting or revision, final approval, and accountability.
The software does not do the analysis. The analyst does.
But choose your tool wisely—
and share your code so truth can be verified."
Few will ever write a meta-analysis. But every clinician, every policymaker, every patient
must know how to read them.
For years, observational studies suggested hormone replacement therapy protected the heart. Meta-analyses of these studies showed a 35-50% reduction in cardiovascular risk.
Then the WHI randomized trial revealed the truth: HRT increased heart attack risk by 29%.
The observational meta-analyses had pooled confounded data— healthier women chose HRT, not the reverse.
Consumer's Guide
Warning Signs in Published Meta-Analyses
HIGH: Very confident. The estimate is close to the truth.
MODERATE: Probably close to truth. Future research may change estimate.
LOW: Uncertain. Future research likely to change substantially.
VERY LOW: Very uncertain. Any estimate is speculative.
Look for the protocol. Check the bias.
Ask: Who funded this? What did they hide?
The informed reader is the guardian of truth."
that were never tested head-to-head?
This is the realm of network meta-analysis.
NMA Decision Tree
Network graph:
• Nodes = treatments (size reflects sample size)
• Edges = direct comparisons (width reflects number of studies)
• Dashed edges = indirect evidence only
League table:
• Each cell: effect estimate + 95% CI
• Row vs. column: treatment A vs. treatment B
• Green = favors row treatment; red = favors column treatment
• Rankings (SUCRA/P-score) help identify the best options
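A frequentist network model can be fitted with the netmeta package. A sketch with a hypothetical pairwise data set; the column names are assumptions:

library(netmeta)
# One row per pairwise comparison: effect (TE), its SE, the two arms, study label
nma <- netmeta(TE = te, seTE = se_te, treat1 = arm1, treat2 = arm2,
               studlab = study, data = pairs, sm = "RR")
netgraph(nma)  # network plot: nodes = treatments, edges = direct comparisons
netrank(nma)   # P-scores ranking the treatments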
What happens to meta-analyses when a prolific author's entire body of work is retracted?
REAL DATA
Joachim Boldt, a German anesthesiologist, had over 220 papers retracted for data fabrication (discovered 2010-2011). His studies on colloid solutions had been included in multiple systematic reviews and meta-analyses. When the retractions came, every meta-analysis containing his work had to be re-evaluated. Some conclusions changed substantially when his fabricated data was removed.
Where treatments were never compared directly,
the network builds a bridge of evidence.
But the bridge rests on transitivity—
verify that the populations are comparable."
with 99% accuracy in the training set—
and failed catastrophically
when deployed in the real world?
Internal validation showed excellent performance.
But an independent study at Michigan Medicine found the model missed 67% of sepsis cases and generated excessive false alarms.
The algorithm had been validated on the same population it was trained on— a recipe for overfitting and failure.
Levels of Evidence for AI/ML
• Risk of bias assessment for prediction models
• AI-specific reporting guideline extensions
Calibration: Are predicted probabilities accurate?
A model can have good AUC but poor calibration—and harm patients.
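Calibration can be checked with a simple grouped plot. A base-R sketch, assuming pred holds predicted probabilities and obs holds 0/1 outcomes from an external validation set:

# Split predictions into deciles; compare mean predicted risk
# with the observed event rate in each group
decile <- cut(pred, breaks = quantile(pred, probs = seq(0, 1, 0.1)),
              include.lowest = TRUE)
predicted <- tapply(pred, decile, mean)
observed  <- tapply(obs,  decile, mean)
plot(predicted, observed, xlab = "Mean predicted risk",
     ylab = "Observed event rate")
abline(0, 1, lty = 2)   # perfect calibration lies on the diagonal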
The model learned from what it was shown, and the data was biased.
It validated on itself,
and called its reflection truth.
External validation is not optional—it is survival."
The evidence speaks in numbers and intervals.
But the patient hears in fears and hopes.
How will you bridge the gap?
Communication Decision Tree
"A review of many studies found that this treatment reduces the risk of [outcome] by about 30%.
In practical terms: if we treat 100 people like you, about 5 fewer will have [outcome] compared to no treatment.
We're moderately confident in this—future research might change it slightly.
What questions do you have about this?"
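The arithmetic behind that script, sketched in R with an assumed baseline risk of 17% (the figure needed for a 30% relative reduction to equal about 5 per 100):

baseline <- 0.17           # assumed control-group risk
rr       <- 0.70           # risk ratio: a 30% relative reduction
arr <- baseline * (1 - rr) # absolute risk reduction: ~0.05
nnt <- 1 / arr             # number needed to treat: ~20
round(c(fewer_per_100 = arr * 100, NNT = nnt), 1)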
Empowering Patients
Can a spreadsheet error in an academic paper directly shape the economic policy of entire nations?
REAL DATA
Reinhart and Rogoff's 2010 paper claimed that countries with public debt exceeding 90% of GDP experienced dramatically lower growth. This finding was widely cited to justify austerity policies across Europe. In 2013, Herndon, Ash, and Pollin discovered a spreadsheet error: several countries had been accidentally excluded from the calculations. After correction, the sharp 90% threshold disappeared.
The patient hears in fears and hopes.
Your job is to be the translator—
faithful to the evidence, compassionate to the person."
But science does not stop.
How do we keep the evidence alive?
During the COVID-19 pandemic, living systematic reviews were continuously updated as new trials reported, sometimes within days of publication.
The COVID-NMA consortium produced living reviews on treatments, vaccines, and diagnostics, updating recommendations in real-time as the evidence evolved.
Hydroxychloroquine went from "promising" to "ineffective" within months.
Living Review Decision Tree
Continuous screening → ongoing data pooling → updated evidence synthesis + meta-analysis
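Mechanically, a living meta-analysis is a re-run. A sketch in metafor, assuming dat is the existing effect-size data and new_trial is a row with the same columns:

library(metafor)
# Append the newly reported trial and refit the random-effects model
dat <- rbind(dat, new_trial)
res <- rma(yi, vi, data = dat, method = "REML")
forest(res)   # the updated forest plot replaces the old one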
It grows with each new study, each new question.
Keep your reviews alive.
Keep your methods transparent.
Keep truth at the center of all you do."