can be reduced to a single number,
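and in that number, lives are erased?
But which patients benefit? Young or old? Mild disease or severe? Men or women?
The aggregate cannot answer.
Within the average, some patients are saved while others are harmed.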
● Responders ● Non-responders
But who are the 30%? Without individual data, we cannot identify responders and non-responders. We cannot practice precision medicine.
Not summaries. Not averages. Each patient, each tumor, each outcome.
What he discovered changed breast cancer treatment forever.
"Tamoxifen works."
But the individual data revealed:
Giving tamoxifen to ER− patients was useless.
In the individual, the truth emerged.
This is why we seek the hidden patients."
This is individual participant data meta-analysis.
What is found when we look closer?
Aggregate Data (AD)
- Study-level summaries
- Effect sizes from publications
- Mean age, % male, etc.
- Quick and accessible
- Cannot see within-study variation
Individual Participant Data (IPD)
- Patient-level raw data
- Every participant's characteristics
- Actual ages, actual outcomes
- Time-intensive to obtain
- Can see who responds and who doesn't
Trial A: Mean age 55 years, Effect size 0.70 (benefit)
Trial B: Mean age 75 years, Effect size 0.90 (less benefit)
Tempting conclusion: "The drug works better in younger patients."
But this could be completely wrong.
Perhaps in Trial A, older patients within that trial responded better. Perhaps in Trial B, younger patients within that trial responded better.
Without the individual data, you cannot know.
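A minimal sketch of how this can happen. The counts below are invented for illustration and do not come from any trial; they are chosen so that the overall risk ratios equal the 0.70 and 0.90 quoted for Trials A and B, assuming the "effect size" is a risk ratio. "Younger" and "older" here mean below and above each trial's own median age, which is also an assumption.

```python
# Hypothetical counts per subgroup, invented for illustration only:
# (events_treated, n_treated, events_control, n_control)
TRIALS = {
    "A (mean age 55)": {"younger": (45, 500, 50, 500), "older": (25, 500, 50, 500)},
    "B (mean age 75)": {"younger": (50, 500, 50, 500), "older": (40, 500, 50, 500)},
}

def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk in the treated arm divided by risk in the control arm."""
    return (events_t / n_t) / (events_c / n_c)

def overall_counts(subgroups):
    """Add the subgroup counts together: this is all aggregate data can see."""
    return tuple(sum(values) for values in zip(*subgroups.values()))

def summarize(trials):
    rows = {}
    for name, subgroups in trials.items():
        rows[name] = {
            "overall": round(risk_ratio(*overall_counts(subgroups)), 2),
            "younger": round(risk_ratio(*subgroups["younger"]), 2),
            "older": round(risk_ratio(*subgroups["older"]), 2),
        }
    return rows

summary = summarize(TRIALS)
for name, row in summary.items():
    print(name, row)
```

In this constructed example the trial with the older population shows less benefit overall, yet inside both trials the older patients have the lower risk ratio. The between-trial comparison points in the opposite direction to the within-trial one.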
True Effect Modification
Test whether treatment effect varies by patient characteristics (age, biomarkers, disease severity)
Time-to-Event Analysis
Use actual survival curves, not just hazard ratios. Handle censoring properly.
Consistent Definitions
Standardize outcome definitions, exposure timing, covariate categories across studies
Subgroup Credibility
Test interactions within studies and avoid the ecological fallacy
Individual data shows each tree.
If you need to know which trees are sick,
you must walk among them."
gathered data on 170,000 patients
and answered questions no single trial could ask?
27 trials. 174,149 patients. Every baseline characteristic. Every cardiovascular event. Every death.
The published trials asked: "Do statins work?"
By collecting individual data, the CTT asked: "For whom do statins work?"
But the questions you can answer are worth the investment.
Benefit Proportional to LDL Reduction
Every 1 mmol/L LDL reduction = 22% lower CV events. True across all subgroups.
No Age Threshold
Benefit continues even in patients >75 years (contradicting earlier AD analyses)
Primary Prevention Works
Patients without prior CVD benefit proportionally to their baseline risk
No Cancer Signal
Concerns that statins cause cancer were definitively refuted by IPD follow-up
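The proportional-benefit findings above can be turned into a small calculation. This is a sketch under two assumptions that are not stated in the text: that the 22% reduction compounds on the log scale for reductions other than 1 mmol/L, and that the same relative risk applies at every baseline risk. The baseline risks in the usage lines are hypothetical.

```python
RR_PER_MMOL = 0.78  # 22% lower CV events per 1 mmol/L LDL reduction (from the text)

def relative_risk(ldl_reduction_mmol):
    """Relative risk for a given LDL reduction.

    Assumption: the effect scales log-linearly, so a 2 mmol/L
    reduction gives 0.78 ** 2 rather than 1 - 2 * 0.22.
    """
    return RR_PER_MMOL ** ldl_reduction_mmol

def absolute_risk_reduction(baseline_risk, ldl_reduction_mmol):
    """Absolute benefit = baseline risk x proportional reduction."""
    return baseline_risk * (1 - relative_risk(ldl_reduction_mmol))

# Hypothetical baseline risks: same drug, same LDL drop, different absolute gain.
high_risk_gain = absolute_risk_reduction(0.20, 1.0)
low_risk_gain = absolute_risk_reduction(0.05, 1.0)
print(round(high_risk_gain, 3), round(low_risk_gain, 3))
```

The relative effect is identical for both hypothetical patients, so the absolute benefit is four times larger for the one whose baseline risk is four times higher.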
Trial A: "High risk" = 10-year CVD risk >20%
Trial B: "High risk" = Prior MI
Trial C: "High risk" = Diabetes
You cannot compare or combine what is defined differently.
IPD allowed the CTT to redefine everyone consistently.
IPD provides the translation.
What seemed contradictory becomes clear:
the same truth, measured differently."
Aggregate data takes weeks.
When is it worth the investment?
Should You Pursue IPD?
Time-to-Event Outcomes
When survival curves matter, not just the final hazard ratio, and when you need to handle censoring.
Continuous Effect Modifiers
Testing whether treatment effect varies by age, BMI, biomarker level (not just "high" vs "low")
Outcome Definition Problems
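When trials define outcomes differently and you need to standardize
Longer Follow-Up Available
When trialists have unpublished follow-up data that you want to include
Your key question is: "Does the benefit vary by disease severity and treatment timing?"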
Overall Treatment Effect
When your only question is "Does it work?" and not "For whom?"
Homogeneous Population
When trials enrolled similar patients and effect modification is unlikely
Binary Outcomes, Short Follow-up
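When censoring is not an issue and the outcome is a simple yes/no
IPD Unobtainable
When trialists won't share, data are lost, or resources are unavailable
But every question about which individuals
requires that they exist in your data."
Now: do you analyze it as one combined dataset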
or trial by trial, then combine?
Two-Stage Approach
- Stage 1: Analyze each trial separately
- Stage 2: Meta-analyze the results
- Preserves trial structure
- Familiar (like standard MA)
- Cannot handle sparse data well
One-Stage Approach
- Analyze all data simultaneously
- Mixed-effects regression model
- Random effects for clustering
- Better suited to sparse data
- More flexible modeling
Each trial's estimate is transparent.
Familiar forest plots and I² statistics.
2. More powerful for detecting interactions
3. Can model complex covariate relationships
4. Exact likelihood (no normal approximations)
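The two-stage approach can be sketched in a few lines. This is an illustration only: the 2x2 counts are hypothetical, stage 1 is reduced to a log odds ratio per trial (with real IPD it would usually be a regression fitted within each trial), and stage 2 uses fixed-effect inverse-variance pooling, which is one common choice rather than the only one.

```python
import math

# Hypothetical counts per trial: (events_treated, n_treated, events_control, n_control)
TRIALS = {
    "Trial 1": (15, 100, 25, 100),
    "Trial 2": (30, 250, 40, 250),
    "Trial 3": (8, 60, 12, 60),
}

def stage_one(events_t, n_t, events_c, n_c):
    """Stage 1: log odds ratio and its variance for a single trial."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

def stage_two(estimates):
    """Stage 2: inverse-variance pooling, Cochran's Q and I-squared (%)."""
    weights = [1 / var for _, var in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    q = sum(w * (est - pooled) ** 2 for w, (est, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se, i_squared

per_trial = [stage_one(*counts) for counts in TRIALS.values()]
pooled_log_or, pooled_se, i_squared = stage_two(per_trial)
print(f"Pooled OR {math.exp(pooled_log_or):.2f}, I-squared {i_squared:.0f}%")
```

Note that `stage_one` fails when any cell is zero, because the logarithm and the variance are undefined. That is one concrete form of the sparse-data weakness listed for the two-stage approach.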
One-Stage or Two-Stage?
The one-stage hears all voices at once.
When data are sparse and events are rare,
the one-stage catches what the two-stage misses."
the story,
but only if given at the right time?
But a puzzle remained: When should they be given?
24 hours before birth? 48 hours? A week?
Published trials couldn't answer—they didn't report timing consistently.
Maximum benefit: 24 hours to 7 days before birth
Reduced benefit: >7 days (lung maturity effect fades)
No benefit: <24 hours (not enough time to work)
After IPD: Guidelines now recommend repeat dosing if delivery has not occurred within 7 days of the first course.
This precision, impossible without individual data, saved thousands of premature babies.
Trial B reported: "Steroid given antenatally"
Trial C reported: "Steroid-to-delivery interval: median 3 days"
Different categories. Different definitions. Incompatible summaries.
Only by examining each baby's actual steroid-to-delivery time could the optimal window be identified.
But when to give it was unknown.
IPD turned "some time" into "the right time",
and in that precision, children"
Somewhere in files and databases,
every patient's story is recorded.
The question is: will they share it?
This is expected. Plan for it.
Trialist Collaboration
Direct contact with trial investigators. Build relationships. Offer co-authorship.
Data Sharing Platforms
YODA Project, ClinicalStudyDataRequest.com, Vivli, ICPSR
Regulatory Agencies
EMA Policy 0070, FDA (limited), Health Canada
Journal Requirements
Many journals now require data sharing; check supplementary materials
2. Offer co-authorship: make sharing worthwhile
3. Describe data security: how you'll protect their patients
4. Provide a data dictionary: specify exactly what you need
5. Set clear timelines: respect their time
Industry trials: less likely to share
Negative trials: less likely to share
Older trials: data may be lost
Your IPD sample may be biased.
Will they share what they have guarded?
Build the bridge with care,
for on that bridge, patients' futures cross."
from twelve trials, five countries, three decades.
But Trial A calls it "cardiovascular death"
and Trial B calls it "cardiac mortality".
Are they the same?
Trial from USA: Age in decimal years (e.g., 65.7)
Trial from UK: Age in ranges (e.g., "65-74")
Diabetes: HbA1c ≥ 6.5% vs. fasting glucose ≥ 126 vs. "physician diagnosis"
Outcome: "Major adverse cardiac event" (one trial includes stroke, another doesn't)
Before analysis: harmonize everything.
Create a Master Data Dictionary
Define every variable you need: name, type, permitted values, derivation rules
Map Each Trial's Variables
Document how each trial's coding maps to your standardized definitions
Check with Trialists
Verify your interpretations. They know their own data better than you do.
Validate Transformations
Reproduce the published results from the IPD. If they don't match, investigate.
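The dictionary-and-mapping steps above might look like the following in practice. Everything named here is hypothetical: the standardized variable names (`age_years`, `cv_death`), the source column names, and the permitted ranges are placeholders for whatever your protocol defines.

```python
# Master data dictionary: hypothetical variable names and rules.
MASTER_DICTIONARY = {
    "age_years": {"type": float, "min": 18.0, "max": 110.0},
    "cv_death": {"type": int, "permitted": {0, 1}},
}

# One mapping per trial: standardized name -> (source column, converter).
# Whether "cardiac_mortality" means the same thing as "cardiovascular_death"
# is exactly the kind of question to confirm with the trialists first.
TRIAL_MAPS = {
    "trial_a": {
        "age_years": ("AGE", float),
        "cv_death": ("cardiovascular_death", lambda v: 1 if v == "Y" else 0),
    },
    "trial_b": {
        "age_years": ("age_at_entry", float),
        "cv_death": ("cardiac_mortality", int),
    },
}

def harmonize(trial_id, record):
    """Translate one raw record into the standardized variables."""
    row = {"trial": trial_id}
    for name, (source_column, convert) in TRIAL_MAPS[trial_id].items():
        row[name] = convert(record[source_column])
    return row

def validate(row):
    """Return a list of violations of the master dictionary (empty if clean)."""
    problems = []
    for name, rule in MASTER_DICTIONARY.items():
        value = row[name]
        if not isinstance(value, rule["type"]):
            problems.append(f"{name}: expected {rule['type'].__name__}")
        elif "permitted" in rule and value not in rule["permitted"]:
            problems.append(f"{name}: {value!r} not permitted")
        elif "min" in rule and not rule["min"] <= value <= rule["max"]:
            problems.append(f"{name}: {value!r} out of range")
    return problems

row_a = harmonize("trial_a", {"AGE": "65.7", "cardiovascular_death": "Y"})
row_b = harmonize("trial_b", {"age_at_entry": "71", "cardiac_mortality": "0"})
print(row_a, validate(row_a))
print(row_b, validate(row_b))
```

Keeping the mapping as data rather than scattered code means the mapping table itself can be sent to each trialist for checking.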
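Six trials used standard definitions (ICD codes). Two trials from the 1990s used "investigator-assessed cardiac death" without standardized criteria. Those two trials showed larger treatment effects.
Reproduce each trial's published results from the IPD.
If your analysis gives RR = 0.78 but the publication says RR = 0.85,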
something is wrong.
Find the discrepancy. Fix it. Then proceed.
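A reproduction check of this kind can be automated. The sketch below recomputes a risk ratio from patient-level records and compares it with the published value; the records are fabricated to give RR = 0.78, and the tolerance of 0.01 is an arbitrary assumption that you would set to suit rounding in the publication.

```python
def risk_ratio_from_ipd(records):
    """Recompute the risk ratio from patient-level records.

    Each record is a dict with 'arm' ('treated' or 'control')
    and 'event' (1 if the outcome occurred, else 0).
    """
    risk = {}
    for arm in ("treated", "control"):
        arm_records = [r for r in records if r["arm"] == arm]
        risk[arm] = sum(r["event"] for r in arm_records) / len(arm_records)
    return risk["treated"] / risk["control"]

def matches_publication(recomputed, published, tolerance=0.01):
    """True when the recomputed estimate agrees with the published one."""
    return abs(recomputed - published) <= tolerance

# Fabricated IPD for illustration: 39/500 events treated, 50/500 control.
records = (
    [{"arm": "treated", "event": 1}] * 39
    + [{"arm": "treated", "event": 0}] * 461
    + [{"arm": "control", "event": 1}] * 50
    + [{"arm": "control", "event": 0}] * 450
)

recomputed_rr = risk_ratio_from_ipd(records)
print(round(recomputed_rr, 2), matches_publication(recomputed_rr, 0.85))
```

A failed check does not say which side is wrong. Exclusions, follow-up cut-offs, and outcome definitions are the usual places to look.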
different ways to name the same disease.
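Before you combine, you must translate.
Before you translate, you must understand."
But does it work equally well for the young and the old?
For mild and severe disease?
For those with the biomarker and those without?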
They asked: "Does the effect differ by estrogen receptor status?"
The interaction was dramatic:
ER-positive: 47% reduction in recurrence
ER-negative: No benefit at all
The treatment effect differs between subgroups defined by the covariate.
If not: the treatment effect is similar across subgroups (or you lack power to detect a difference).
Between-Study Interaction
- Compares trial-level averages
- Ecological fallacy risk
- Confounded by trial design
- Low statistical power
- Can be done with aggregate data
Within-Study Interaction
- Compares patients within each trial
- No ecological fallacy
- Randomization preserved
- Much higher power
- Requires IPD
This is the gold standard for effect modification.
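One way to estimate a within-study interaction is to compute it inside each trial and then pool only those within-trial estimates. The sketch below does this for a binary covariate; the counts are hypothetical, the effect measure (log odds ratio) and the fixed-effect pooling are assumptions, and a one-stage model with trial-centered covariates would be an alternative route to the same quantity.

```python
import math

def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Log odds ratio and its variance from a 2x2 table."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def within_trial_interaction(positive, negative):
    """Difference in treatment effect between subgroups of the SAME trial."""
    est_pos, var_pos = log_odds_ratio(*positive)
    est_neg, var_neg = log_odds_ratio(*negative)
    return est_pos - est_neg, var_pos + var_neg

def pool(estimates):
    """Fixed-effect inverse-variance pooling of (estimate, variance) pairs."""
    weights = [1 / var for _, var in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))

# Hypothetical counts: (events_treated, n_treated, events_control, n_control)
TRIALS = {
    "Trial X": {"positive": (20, 200, 40, 200), "negative": (38, 200, 40, 200)},
    "Trial Y": {"positive": (15, 150, 28, 150), "negative": (30, 150, 29, 150)},
}

interactions = [
    within_trial_interaction(t["positive"], t["negative"]) for t in TRIALS.values()
]
pooled_interaction, interaction_se = pool(interactions)
ratio_of_odds_ratios = math.exp(pooled_interaction)
print(f"Ratio of odds ratios: {ratio_of_odds_ratios:.2f}")
```

Because every comparison is made between randomized patients in the same trial, differences in design between trials cannot masquerade as effect modification.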
You tested 8 potential effect modifiers in your analysis.
The interaction reveals who benefits.
Test within studies, not between,
for that is where the truth is found."
When you seek to predict who will die,
who will recover, who will relapse,
the individual is everything.
"Will they ever wake up?"
Individual trials were too small to develop accurate prediction models.
IMPACT gathered IPD from 11 studies and 9,205 patients and built a model that predicts 6-month outcome from initial clinical characteristics.
External Validation Across Populations
Develop in some studies, validate in others. True test of generalizability.
Non-linear Relationships
Explore how predictors relate to outcome: linear? threshold? U-shaped?
Multiple Predictor Interactions
Age + GCS + pupil reactivity may interact in ways aggregate data cannot reveal
Proper Handling of Missing Predictors
Multiple imputation at the patient level, not the study level
TRIPOD
- Prediction model reporting
- Development and validation
- Calibration and discrimination
- 22-item checklist
PRISMA-IPD
- IPD meta-analysis reporting
- Data acquisition details
- Harmonization process
- Integrity checking
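When a patient with traumatic brain injury arrives, the model provides the probability of survival and the probability of a favorable outcome.
This guides conversations with families. It informs treatment intensity. It helps allocate ICU resources.
Built from individual data. Serving individual patients.
You must learn from thousands of pasts.
IPD holds these stories,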
each one a teacher, if you will listen."
the little white pill given to millions for their hearts?
They were told: "Take this, and you will be protected."
But was every heart equally in need of protection?
Researchers quickly re-identified users by matching ratings with public IMDb reviews.
One lawsuit alleged a closeted lesbian was outed through her viewing patterns.
The lesson: removing names isn't anonymization.
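IPD contains enough combinations of age, diagnosis, treatment response, and dates to uniquely identify individuals. Privacy requires more than deleting the name column; it requires understanding how combinations of data become a fingerprint.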
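A simple way to see the fingerprint problem is to count how many records share each combination of quasi-identifiers. This is a rough sketch in the spirit of a k-anonymity check, not a full de-identification procedure; the column names and records are hypothetical.

```python
from collections import Counter

# Hypothetical quasi-identifier columns: none is a name, yet together
# they may single out one person.
QUASI_IDENTIFIERS = ("age", "diagnosis", "enrolment_month")

def group_sizes(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifiers."""
    return Counter(tuple(r[k] for k in quasi_identifiers) for r in records)

def unique_fingerprints(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Combinations held by exactly one record, i.e. re-identifiable rows."""
    sizes = group_sizes(records, quasi_identifiers)
    return [combo for combo, n in sizes.items() if n == 1]

RECORDS = [
    {"age": 67, "diagnosis": "MI", "enrolment_month": "2004-03"},
    {"age": 67, "diagnosis": "MI", "enrolment_month": "2004-03"},
    {"age": 82, "diagnosis": "stroke", "enrolment_month": "2004-05"},
]

smallest_group = min(group_sizes(RECORDS).values())
exposed = unique_fingerprints(RECORDS)
print(smallest_group, exposed)
```

Coarsening a variable (age bands instead of exact age, year instead of month) usually increases the smallest group size, at the cost of analytical detail.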
The published trials said: "Aspirin prevents heart attacks."
But the ATT, collecting individual data, asked the forbidden question:
"At what cost? And for whom?"
2-3 heart attacks prevented
2-3 major bleeds caused
Benefit and harm cancel each other out.
Should This Patient Take Aspirin for Primary Prevention?
(benefit > harm)
(benefit ≈ harm)
(harm > benefit)
They could not show that high-risk patients gained
while low-risk patients lost.
Only by examining each patient's baseline risk,
each patient's outcomes,
could the interaction be revealed.
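The logic of risk-dependent net benefit can be sketched as arithmetic. The two constants below are illustrative assumptions, not ATT estimates: a fixed proportional reduction in heart attacks and a bleeding harm that is held constant across risk levels (in reality bleeding risk may also vary between patients). The margin used to call benefit and harm "about equal" is likewise arbitrary.

```python
# Illustrative assumptions only, not estimates from the ATT collaboration.
RELATIVE_RISK_REDUCTION = 0.20  # proportional reduction in heart attacks
BLEEDS_PER_1000 = 2.5           # major bleeds caused per 1000 treated

def net_benefit_per_1000(baseline_risk):
    """Heart attacks prevented minus major bleeds caused, per 1000 patients."""
    prevented = 1000 * baseline_risk * RELATIVE_RISK_REDUCTION
    return prevented - BLEEDS_PER_1000

def classify(baseline_risk, margin=0.5):
    """Label the balance; 'margin' is an arbitrary band around zero."""
    net = net_benefit_per_1000(baseline_risk)
    if net > margin:
        return "benefit > harm"
    if net < -margin:
        return "harm > benefit"
    return "benefit ≈ harm"

for risk in (0.10, 0.0125, 0.005):  # hypothetical baseline risks
    print(risk, classify(risk))
```

With the same relative effect, the absolute number of heart attacks prevented rises with baseline risk while the assumed harm stays flat, so the sign of the net effect depends on who the patient is.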
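He read online that "aspirin prevents heart attacks" and wants your advice.
may harm the safe.
Know the patient's risk before you prescribe.
This is what individual data tells us."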
95,000 patients. One truth: risk determines benefit.
"Should the old be treated the same as the young?"
Some said: "Lower is always better."
Others said: "The old are fragile; be careful."
Who was right?
Aggressive camp: "Every mmHg of BP reduction saves lives.
Treat everyone to target 120/80."
Conservative camp: "In the elderly, low BP causes falls, strokes, death.
There's a J-curve—too low is dangerous."
Individual trials were too small to settle it. Until the BPLTTC gathered individual data.
Is there a J-curve at very low pressures?
Do the very old (>80 years) still benefit?
Age 55-64: 10% lower major CV events
Age 65-74: 10% lower major CV events
Age 75-84: 10% lower major CV events
Age ≥85: Still 10% lower
p for interaction = 0.85. No evidence of effect modification by age.
who achieved very low blood pressures.
Result: No J-curve in randomized comparisons.
The apparent J-curve in observational data is explained by reverse causation: sick patients have low BP because they're sick, not sick because their BP is low.
Only IPD from randomized controlled trials could resolve this.
Should This Elderly Patient Get BP Treatment?
Proportional benefit maintained across all ages
Consider frailty, life expectancy, patient preference—but not age.
You remember the BPLTTC IPD meta-analysis.
But individual data showed otherwise:
At every age, the benefit endures.
Do not let age alone deny protection."
when a clot blocks the brain?
Every minute, two million neurons die.
But when does the window close?
When is it too late to intervene?
But given too late, it causes bleeding into dying brain tissue.
What is the time window? 3 hours? 4.5 hours? 6 hours?
Individual trials disagreed. Guidelines were uncertain.
They knew exactly when each patient's stroke began. They knew exactly when thrombolysis was given. They knew exactly who lived, who died, who recovered.
They could map benefit against time, minute by minute.
Treated at 90 min: 1 in 4 achieve excellent outcome
Treated at 180 min: 1 in 7 achieve excellent outcome
Treated at 270 min: 1 in 14 achieve excellent outcome
Acute Ischemic Stroke: Give Thrombolysis?
URGENTLY
if eligible
imaging
avoid
Trial B: "Patients treated within 6 hours" (average: 4.2 hours)
These overlapping, inconsistent windows couldn't be compared.
Only by knowing each patient's exact time
could the continuous decay of benefit be mapped.
Door-to-needle time will add 30 minutes, making total time ~4.5 hours.
two million neurons perish.
IPD showed us the closing window.
Act quickly, or the window closes forever."
You open the file, full of hope.
Then you see it: empty cells.
Age: 67. Sex: Male. Smoking status: missing.
Outcome at 1 year: missing.
What now?
Missing Completely at Random (MCAR)
Lab machine broke randomly. No relation to patient characteristics. Safe to ignore (but wasteful).
Missing at Random (MAR)
Older patients more likely to miss follow-up. Missingness related to observed variables. Imputation can help.
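Missing Not at Random (MNAR)
Patients with poor outcomes drop out. Missingness related to the missing value itself. Dangerous. Requires sensitivity analysis.
How to Handle Missing Values?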
may suffice
imputation
sensitivity
Multiple imputation (M=20-50 datasets) reflects uncertainty about what the missing values might have been.
This preserves valid standard errors and p-values.
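After fitting the analysis model in each imputed dataset, the M results are usually combined with Rubin's rules. The sketch below shows only that pooling step, with made-up estimates and M = 5 for brevity (fewer than the 20-50 suggested above); creating the imputed datasets themselves, ideally with a model that respects clustering by trial, is not shown.

```python
import math
from statistics import mean, variance

def rubins_rules(estimates, variances):
    """Pool M imputed-data estimates into one estimate and standard error.

    estimates: point estimates (e.g. log hazard ratios), one per dataset
    variances: their squared standard errors, one per dataset
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)

# Made-up results from M = 5 imputed datasets.
ESTIMATES = [0.50, 0.55, 0.45, 0.52, 0.48]
VARIANCES = [0.01, 0.01, 0.01, 0.01, 0.01]

pooled_estimate, pooled_se = rubins_rules(ESTIMATES, VARIANCES)
print(round(pooled_estimate, 3), round(pooled_se, 4))
```

The pooled standard error is larger than the standard error from any single imputed dataset. That extra width is the uncertainty about the missing values.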
Systematically Missing Variables
Trial A measured biomarker X. Trial B didn't. Can't impute what was never collected.
Multilevel Structure
Patients nested within trials. Imputation model must account for clustering.
Different Follow-up Durations
Trial A lasted 2 years. Trial B lasted 5 years. Survival analysis requires care.
Trial Didn't Collect Your Key Covariate
analysis to trials
with the variable
varying assumptions
about unmeasured
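These 3 trials tend to be longer-running and smaller.
It is a question: why is this unknown?
Answer that question before you fill the gap,
for the reason for the absence determines the remedy."
gather data from the willing trialists
and declare victory?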
But the unwilling held secrets.
And those secrets changed everything.
Cochrane reviewers requested trial data to verify efficacy. Roche refused, citing confidentiality.
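For five years, the British Medical Journal (BMJ) campaigned for transparency. When the full clinical study reports were finally released in 2014, the picture changed: Tamiflu shortened symptom duration by less than a day and did not prevent complications.
Billions had been spent on this drug while its full evidence remained locked away.
The Tamiflu saga changed expectations: today, clinical trial transparency is becoming the norm rather than the exception.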
Industry-sponsored trials: Less likely to share
Trials with negative results: Less likely to share
Older trials: Data often lost
If these trials systematically differ in effect size,
your IPD-MA is biased.
Assessing Availability Bias
for bias
Sensitivity analysis needed
Report IPD retrieval rate
"We obtained IPD from 12/15 trials (80%)"
Compare IPD vs. non-IPD trial characteristics
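Sample size, funding, publication date, aggregate effect size
Sensitivity analysis including non-IPD trials
Two-stage analysis combining IPD with AD from non-sharing trials
Discuss reasons for non-sharing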
Data lost? Refused? Never requested? Each has different implications.
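The first two checks in this list are easy to compute once you have a table of eligible trials. The sketch below uses a hypothetical set of 15 trials (12 sharing, 3 not) with invented sizes and sponsorship flags; reporting the rate by participants as well as by trials is a suggestion added here, since a few large non-sharing trials can matter more than the trial count implies.

```python
# Hypothetical eligible trials: 12 shared IPD, 3 did not.
TRIALS = (
    [{"ipd": True, "n": 1000, "industry": False}] * 9
    + [{"ipd": True, "n": 1000, "industry": True}] * 3
    + [{"ipd": False, "n": 2000, "industry": True}] * 3
)

def retrieval_rate(trials):
    """Share of trials, and share of participants, for which IPD was obtained."""
    shared = [t for t in trials if t["ipd"]]
    by_trials = len(shared) / len(trials)
    by_participants = sum(t["n"] for t in shared) / sum(t["n"] for t in trials)
    return by_trials, by_participants

def proportion_industry(trials, ipd_available):
    """Proportion of industry-sponsored trials among sharers or non-sharers."""
    group = [t for t in trials if t["ipd"] == ipd_available]
    return sum(t["industry"] for t in group) / len(group)

rate_trials, rate_participants = retrieval_rate(TRIALS)
print(f"IPD from {rate_trials:.0%} of trials, {rate_participants:.0%} of participants")
print(proportion_industry(TRIALS, True), proportion_industry(TRIALS, False))
```

In this invented example the headline figure of 80% of trials hides the fact that only two thirds of participants are covered, and that every non-sharing trial is industry-sponsored.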
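Your IPD analysis shows OR 0.70.
Use IPD where you have it (for interaction tests).
Supplement with AD for the overall effect estimate.
Be transparent about sources.
The open door may hide the truth.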
Always ask: Who refused to share?
And what might they be hiding?"
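Key sources cited in this course
- Riley RD, et al. Individual Participant Data Meta-Analysis: A Handbook for Healthcare Research. Wiley, 2021.
- Stewart LA, et al. PRISMA-IPD: Preferred reporting items for a systematic review and meta-analysis of individual participant data. JAMA 2015;313:1657-65.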
- Early Breast Cancer Trialists' Collaborative Group. Tamoxifen for early breast cancer. Cochrane Database Syst Rev 2001.
- Cholesterol Treatment Trialists' Collaboration. Efficacy and safety of LDL-lowering therapy. Lancet 2010;376:1670-81.
- Roberts D, et al. Antenatal corticosteroids for accelerating fetal lung maturation. Cochrane Database Syst Rev 2017.
- IMPACT Study Group. Predicting outcome after traumatic brain injury. PLoS Med 2008;5:e165.
- Debray TPA, et al. Get real in individual participant data meta-analysis. Int J Epidemiol 2015;44:1287-97.
- Burke DL, et al. Meta-analysis using individual participant data. Stat Med 2017;36:320-38.
But you have learned to find them.
You have learned to ask: Who benefits? Who is harmed?
Go now, and let no patient disappear."
The hidden patients: now you see them.