====================== Module 1: The Fraud ======================
Have you not heard the story of the woman
who promised to change the world with a single drop of blood,
who raised billions on a test that never worked?
Palo Alto, 2003
STANFORD UNIVERSITY
A nineteen-year-old dropped out with a vision: hundreds of blood tests from a single drop of blood.

Investors believed. Walgreens believed. The Pentagon believed.

They valued her company at $9 billion.

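But the tests gave wrong results. Patients were told they had HIV when they did not. Patients were told their blood was normal when they were dying.
Carreyrou J. Bad Blood. 2018
The Deception Decision Tree

What Theranos Did vs. What Should Happen

New Diagnostic Test
  SHOULD DO: Validate Against Gold Standard -> Publish TP/FP/FN/TN -> FDA Approval
  THERANOS DID: Skip Validation -> Hide Failures -> Harm Patients
"The test lied,
and the lie was told with certainty,
and no one asked for the 2x2 table."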

This is why we study diagnostic test accuracy.

==================== Module 2: The Four Outcomes ====================
When a test speaks,
there are only four possible truths.

Two are blessings. Two are curses.
The Outcome Tree

Every Test Result Has a Reality Behind It

Patient Tested
What is the truth?
  Has Disease (D+)
    Test + -> TP
    Test - -> FN
  No Disease (D-)
    Test + -> FP
    Test - -> TN
The Sacred 2x2 Table

HIV Rapid Test Example (Real Data)

          HIV+    HIV-    Total
Test +      98       3      101
Test -       2     895      897
Total      100     898      998
All truths derive from this table
Sensitivity = 98/100 = 98%
Specificity = 895/898 = 99.7%
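The same arithmetic can be sketched in Python. This is a minimal illustration that takes the four cell counts from the HIV table above; the class name `TwoByTwo` and its method names are chosen here for illustration and are not part of the course material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a 2x2 diagnostic accuracy table."""
    tp: int  # diseased, test positive
    fp: int  # healthy, test positive
    fn: int  # diseased, test negative
    tn: int  # healthy, test negative

    def sensitivity(self) -> float:
        # Share of diseased people the test catches.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of healthy people the test clears.
        return self.tn / (self.tn + self.fp)


# Counts from the HIV rapid test table above.
hiv = TwoByTwo(tp=98, fp=3, fn=2, tn=895)
print(f"Sensitivity = {hiv.sensitivity():.1%}")
print(f"Specificity = {hiv.specificity():.1%}")
```

Keeping the four counts together, rather than storing only the two percentages, makes it possible to recompute any other measure later from the same table.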
"Two outcomes save. Two outcomes harm.
TP, TN: the test told the truth.
FP, FN: the test lied.
Know them by name, for they determine fate."
==================== Module 3: The HIV Window Period ====================
Have you not heard of the blood that was tested,
found clean,
and given to thousands
while death swam within it?
The Blood Supply Crisis, 1985
UNITED STATES
When HIV testing began, doctors celebrated: they could now screen the blood supply.

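But the test had a window period: the weeks after infection when the virus was present but undetectable.

The blood was tested. The blood was "negative." The blood was transfused.

8,000-12,000 Americans were infected through transfusion before better tests closed the window.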
CDC. MMWR. 1987;36(49):833-840
The Window Period Decision Tree

Why False Negatives Are Deadly

Person Recently Infected
Time Since Infection?
  < 2 weeks
    Test NEGATIVE (virus present!)
    Blood Donated (others infected)
  > 4 weeks
    Test POSITIVE (correctly detected)
    Blood Discarded (supply safe)
Sensitivity Changes Over Time

Time since infection    Sensitivity    Stage
Day 1-7                 0%             Eclipse period
Day 14                  ~50%           Seroconversion
Day 21                  ~95%           Most detected
Day 45+                 99.9%          Window closed
THE LESSON
Sensitivity is not fixed. It depends on when you test. A "99% sensitive" test may be 0% sensitive in early infection.
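To make the lesson concrete, here is a rough sketch that treats sensitivity as a function of days since infection. The four milestone values come from the slide above. The source gives no values for the days between milestones, so holding the previous value until the next milestone is an assumption made here, as are the function names.

```python
# (first day of stage, approximate sensitivity), taken from the slide above.
# Holding each value until the next milestone is an assumption; the source
# does not say what happens on the days in between.
MILESTONES = [(1, 0.0), (14, 0.50), (21, 0.95), (45, 0.999)]


def sensitivity_on_day(day: int) -> float:
    """Approximate sensitivity `day` days after infection."""
    if day < 1:
        raise ValueError("day must be 1 or greater")
    current = 0.0
    for start_day, sens in MILESTONES:
        if day >= start_day:
            current = sens
    return current


def expected_missed(infected_donors: int, day: int) -> float:
    """Expected number of infected donors who would test negative."""
    return infected_donors * (1 - sensitivity_on_day(day))


for day in (5, 14, 21, 60):
    print(f"day {day}: sensitivity {sensitivity_on_day(day):.1%}, "
          f"missed per 100 infected donors: {expected_missed(100, day):.1f}")
```

Under these assumptions, every infected donor tested in the first week is missed, which is the false-negative branch of the window period tree.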
"The test said 'clean,'
because the virus had not yet shown its face.
The blood was shared,
and the infection spread to the innocent."
==================== Module 4: Sensitivity and Specificity ====================
A test has two virtues and two vices.

Sensitivity: can it find the sick?

Specificity: can it protect the healthy?
Sensitivity: The Hunter
THE FORMULA
Sensitivity = TP / (TP + FN)
"Of all the sick, how many did we catch?"

Worked Example: COVID PCR Test

Given: 200 infected patients tested
TP = 196 (correctly positive), FN = 4 (missed)
Sensitivity = 196 / (196 + 4) = 196/200 = 98%
Interpretation: Test catches 98 of every 100 infected people
Specificity: The Guardian
THE FORMULA
Specificity = TN / (TN + FP)
"Of all the healthy, how many did we spare?"

Worked Example: Same COVID PCR Test

Given: 1000 uninfected people tested
TN = 999 (correctly negative), FP = 1 (false alarm)
Specificity = 999 / (999 + 1) = 999/1000 = 99.9%
Interpretation: Test correctly clears 999 of every 1000 healthy people
The Mnemonic

When to Use Which Test

What do you need?
  RULE OUT disease
    Use HIGH SENSITIVITY
    SnNout: Sensitive Negative = OUT
  RULE IN disease
    Use HIGH SPECIFICITY
    SpPin: Specific Positive = IN
"Sensitivity catches the sick.
Specificity spares the well.
But no test masters both perfectly;
this is the burden we bear."
==================== Module 5: The Base Rate Fallacy ====================
Have you not seen the doctor
who saw 99% accurate
and believed a positive result meant 99% certainty?

This is the deadliest error in medicine.
The Base Rate Fallacy
THE PUZZLE
A disease affects 1 in 1000 people.
The test has 99% sensitivity and 99% specificity.
A patient tests positive.

What is the probability that they have the disease?

Most doctors say ~99%. The true answer is about 9%.
The Math Revealed

Testing 100,000 People (Prevalence 1/1000)

Step 1: 100 have disease, 99,900 healthy
Step 2: Of 100 sick: 99 test positive (TP), 1 negative (FN)
Step 3: Of 99,900 healthy: 999 test positive (FP), 98,901 negative (TN)
Step 4: Total positives = 99 + 999 = 1,098
PPV = TP / All Positives = 99 / 1,098 = 9%
91% of positive results are false positives!
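The four steps above can be reproduced with a short natural-frequency calculation. This is a minimal sketch; the function name `natural_frequencies` is chosen here for illustration.

```python
def natural_frequencies(population, prevalence, sensitivity, specificity):
    """Split a screened population into expected TP, FN, FP, TN counts."""
    sick = population * prevalence
    healthy = population - sick
    tp = sick * sensitivity       # sick and correctly flagged
    fn = sick - tp                # sick but missed
    tn = healthy * specificity    # healthy and correctly cleared
    fp = healthy - tn             # healthy but falsely flagged
    return tp, fn, fp, tn


# The puzzle: 100,000 people, prevalence 1 in 1000, 99% sens, 99% spec.
tp, fn, fp, tn = natural_frequencies(100_000, 1 / 1000, 0.99, 0.99)
positive_predictive_value = tp / (tp + fp)
print(f"TP={tp:.0f} FN={fn:.0f} FP={fp:.0f} TN={tn:.0f}")
print(f"PPV = {positive_predictive_value:.1%}")
```

Counting people instead of multiplying probabilities makes it visible that the 999 false positives come from the much larger healthy group.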
The Prevalence Decision Tree

Same Test, Different Settings

Test: 99% Sens, 99% Spec
Where Is Testing Done?
  General Population (Prevalence 0.1%)
    PPV = 9% (91% false positives!)
  High-Risk Clinic (Prevalence 10%)
    PPV = 92% (8% false positives)
  Confirmatory Test (Prevalence 50%)
    PPV = 99% (1% false positives)
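The three branches of the tree follow from Bayes' theorem applied at different prevalences. A small sketch, assuming the same 99%/99% test in each setting (the function name is illustrative):

```python
def ppv_from_prevalence(prevalence: float, sensitivity: float,
                        specificity: float) -> float:
    """Positive predictive value via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Prevalences taken from the tree above.
settings = {
    "General population": 0.001,
    "High-risk clinic": 0.10,
    "Confirmatory test": 0.50,
}
for name, prevalence in settings.items():
    value = ppv_from_prevalence(prevalence, sensitivity=0.99, specificity=0.99)
    print(f"{name}: PPV = {value:.0%}")
```

The test itself never changes in this loop; only the prevalence does, and that alone moves the PPV from about 9% to 99%.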
"The doctor said '99% accurate,'
the patient heard '99% certain.'
Both were deceived,
for they forgot to ask: how rare is this disease?"
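==================== Module 6: The PSA Controversy ====================
Have you not heard of the test for men
that found cancers that would never kill,
and led to treatments that destroyed lives?
The PSA Screening Tragedy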
UNITED STATES, 1990s-2010s
PSA (Prostate-Specific Antigen) could detect prostate cancer early.

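Doctors screened millions of men. Cancers were found. Prostates were removed.

But many of these "cancers" would never have caused symptoms. Surgery caused impotence and incontinence in men who would have died of old age, not cancer.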
Moyer VA. Ann Intern Med. 2012;157:120-134
The Numbers of Harm

Per 1000 men screened:
  1        Life saved from prostate cancer
  30-40    Men made impotent or incontinent
  100+     False positives (biopsies, anxiety)
THE REVERSAL
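In 2012, the US Preventive Services Task Force recommended against routine PSA screening. The test found too much that did not need to be found.
The Screening Decision Tree

The Unintended Consequences of Screening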

1000 Men Screened
  ~120 Positive PSA
    ~30 Biopsies Show Cancer
      ~25 Would Never Have Harmed
      ~5 Truly Aggressive
  ~880 Negative PSA
    Reassured (but ~3 have aggressive cancer that was missed)
"The test found a shadow,
and the surgeon cut,
and the man lived on, impotent and incontinent,
with a cancer that would never have awakened."
==================== Module 7: Likelihood Ratios ====================
Sensitivity describes the test.
Specificity describes the test.

But the patient asks:
"I tested positive. What are MY chances?"
Likelihood Ratios
POSITIVE LIKELIHOOD RATIO
LR+ = Sensitivity / (1 - Specificity)
How much more likely is a + result in sick vs healthy?
NEGATIVE LIKELIHOOD RATIO
LR- = (1 - Sensitivity) / Specificity
How much more likely is a - result in sick vs healthy?
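Both formulas translate directly into code. The sketch below applies them to the COVID PCR worked example from Module 4 (98% sensitivity, 99.9% specificity); the function name is illustrative, and a test with 100% specificity would need separate handling because LR+ divides by zero.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test with the given accuracy."""
    lr_pos = sensitivity / (1 - specificity)   # undefined if specificity == 1
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# COVID PCR worked example from Module 4.
lr_pos, lr_neg = likelihood_ratios(0.98, 0.999)
print(f"LR+ = {lr_pos:.0f}, LR- = {lr_neg:.3f}")
```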
The Fagan Nomogram

From Pre-Test to Post-Test Probability

[Figure: Fagan nomogram with three scales, from left to right: Pre-Test Probability, Likelihood Ratio, Post-Test Probability.]
Draw a line from pre-test through LR to find post-test probability
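The line drawn on the nomogram corresponds to a conversion through odds: turn the pre-test probability into odds, multiply by the likelihood ratio, and turn the result back into a probability. A minimal sketch, with an illustrative function name:

```python
def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability."""
    pre_odds = pre_test / (1 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Module 5 puzzle: prevalence 1 in 1000, positive result, LR+ = 0.99 / 0.01 = 99.
print(f"After a positive: {post_test_probability(0.001, 99):.1%}")

# A negative result in a high-risk clinic (10% pre-test), LR- = 0.01 / 0.99.
print(f"After a negative: {post_test_probability(0.10, 0.01 / 0.99):.2%}")
```

The first call reproduces the roughly 9% answer from the base rate puzzle, which shows that the likelihood ratio route and the 2x2 counting route agree.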
Interpreting Likelihood Ratios

How powerful is this test?

What Is the LR+?
  LR+ > 10      Strong rule-in
  LR+ 5-10      Moderate
  LR+ 2-5       Weak
  LR+ 1-2       Useless
What Is the LR-?
  LR- < 0.1     Strong rule-out
  LR- 0.1-0.2   Moderate
  LR- 0.2-0.5   Weak
  LR- 0.5-1     Useless
"Sensitivity tells us about the sick.
Specificity tells us about the healthy.
But the likelihood ratio answers:
what does this result mean for this patient?"
==================== Module 8: Malaria RDTs ====================
Have you not seen the feverish child in the village,
whose rapid test said negative,
while the Plasmodium kept multiplying?
The Malaria RDT Problem
SUB-SAHARAN AFRICA
Malaria kills 600,000 people yearly, mostly children under 5.

Rapid Diagnostic Tests were meant to guide treatment in remote areas without microscopes or laboratories.

But when parasitemia is low, RDTs miss cases. And when P. falciparum has deleted the HRP2 gene, the RDT sees nothing at all.
WHO. Malaria RDT Performance. 2022
The Clinical Decision Tree

Child with Fever in Malaria-Endemic Area

Febrile Child
Perform RDT
  RDT Positive
    Treat for malaria
  RDT Negative
    Clinical Suspicion?
      High: Treat Anyway or Microscopy
      Low: Look for Other Cause
Sensitivity Varies by Parasitemia
Parasitemia              Sensitivity
High (>200/μL)           95%
Low (100-200/μL)         75%
Very low (<100/μL)       50%
The Clinical Lesson
A negative RDT does not rule out malaria in endemic areas. Clinical judgment must override the test when suspicion is high.
"The test said 'negative,'
the child was sent home,
and the parasites multiplied within. Night fell,
and by morning the child would not wake."
==================== Module 9: COVID-19 Rapid Tests ====================
In the year of the plague,
the world needed a test that was fast.

But fast is not the same as accurate.
The Cochrane Verdict

COVID-19 Rapid Antigen Tests (155 Studies Pooled)

Population                    Sensitivity    Missed Cases
Symptomatic                   73%            27% missed
Asymptomatic                  55%            45% missed
First 7 days of symptoms      80%            20% missed

Dinnes J et al. Cochrane Database Syst Rev. 2022;7:CD013705

The False Security Decision Tree

Thanksgiving 2020: What Happened

Family Member Tests Negative
Is this person truly negative?
  Not infected
    True Negative (safe to gather)
  Infected but asymptomatic (the test misses 45% of these cases)
    FALSE Negative (infectious!)
    Gathers with family (grandparents infected)
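The 45% figure is the share of infected asymptomatic people that the test misses. How often a given negative result is wrong also depends on how common infection is, for the same reason PPV depended on prevalence in Module 5. The sketch below uses the 55% sensitivity from the Cochrane table; the 99% specificity and the three prevalences are assumptions chosen purely for illustration, and the function name is hypothetical.

```python
def prob_infected_given_negative(prevalence, sensitivity, specificity):
    """P(infected | test negative), which equals 1 - NPV."""
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    return false_neg / (false_neg + true_neg)


SENS_ASYMPTOMATIC = 0.55   # from the Cochrane table above
SPECIFICITY = 0.99         # assumed for illustration, not from the source

for prevalence in (0.01, 0.05, 0.20):   # assumed prevalences
    risk = prob_infected_given_negative(prevalence, SENS_ASYMPTOMATIC, SPECIFICITY)
    print(f"prevalence {prevalence:.0%}: {risk:.1%} of negatives are infected")
```

Even when the per-person risk after a negative test is small, the test still fails to detect nearly half of the infected people who take it, which is why a negative result gave false security.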
"The test said 'negative,'
the family embraced,
and by winter's end
the grandfather was buried."
Have you not heard of the screening
that found cancers that would never kill,
and led to treatments that caused more harm than the disease?
The Overdiagnosis Problem
Per 10,000 women screened:
  3-4      Lives saved
  ~15      Overdiagnosed (treated unnecessarily)
  ~500     False alarms (anxiety, biopsies)
THE QUESTION
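To save 3-4 lives, about 15 women undergo surgery, radiation, and chemotherapy for cancers that would never have harmed them.

Is this trade-off worth it?
The Screening Decision Tree

10,000 Women Screened Over 10 Years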

10,000 Women
  ~1,000 Recalled (abnormal mammogram)
    ~500 False Alarm (anxiety only)
    ~500 Biopsy (~50 cancers found)
  ~9,000 Cleared (continue screening)
Of ~50 Cancers Found
  ~35 Would Kill (treatment saves 3-4)
  ~15 Would Never Kill (overdiagnosed)
"The test found a shadow
and called it cancer,
and the woman was cut and burned
for a shadow that would never have darkened her days."
One study may deceive.
One study may flatter.

But when you gather all the evidence,
the truth becomes harder to hide.
Why DTA Meta-Analysis Is Different
THE PROBLEM
Sensitivity and specificity are correlated. When one goes up, the other tends to go down.

You cannot pool them separately, as you would treatment effects. You need a bivariate model.
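One reason for this correlation is the positivity threshold: moving the cutoff trades sensitivity against specificity. The sketch below shows that trade-off on a small set of biomarker values that are invented here for illustration. It is not a bivariate meta-analysis model, only a demonstration of why the two measures move in opposite directions.

```python
# Hypothetical biomarker values, invented for illustration only.
DISEASED = [4.1, 5.0, 5.6, 6.2, 6.8, 7.5, 8.3, 9.0]
HEALTHY = [1.2, 2.0, 2.7, 3.3, 3.9, 4.6, 5.2, 6.0]


def accuracy_at(threshold: float) -> tuple[float, float]:
    """Sensitivity and specificity when values >= threshold count as positive."""
    tp = sum(value >= threshold for value in DISEASED)
    tn = sum(value < threshold for value in HEALTHY)
    return tp / len(DISEASED), tn / len(HEALTHY)


for threshold in (3.0, 4.0, 5.0, 6.0, 7.0):
    sens, spec = accuracy_at(threshold)
    print(f"cutoff {threshold}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

If different studies of the same test use different cutoffs, their results scatter along a curve like this one, which is why pooling sensitivity and specificity separately can mislead.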
The SROC Curve

Reading ROC Space

Top-Left Corner: Perfect Test
↓ (curve shows trade-off)
Diagonal Line: Useless Test (Chance)
What the SROC Shows
Each dot = one study's sensitivity & specificity
Curve = summary of all studies
Closer to top-left = better test
"One study may deceive.
Many studies, weighed together,
trace the path of truth:
the SROC curve that reveals what the test truly does."
But what if the studies disagree?

One says sensitivity is 95%.
Another says 60%.

Which truth do you believe?
Sources of Heterogeneity

Why Studies Disagree

Same test, different results?
  Threshold: Different cutoffs
  Population: Severity, age
  Setting: Primary vs specialist
  Quality: Bias, blinding
Measuring Disagreement: I²

I² < 25%     Low         Studies agree
I² 25-75%    Moderate    Some variation
I² > 75%     High        Major disagreement
THE WARNING
When I² > 75%, the pooled estimate may be meaningless. Explain the disagreement before averaging.
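As a rough sketch of where the number comes from, I² can be computed from Cochran's Q as (Q - df) / Q, floored at zero. The example below applies it to logit-transformed sensitivities from four studies whose counts are invented here for illustration. Treating sensitivity on its own is a simplification, since it ignores the correlation with specificity described earlier.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1 - p))


def i_squared(estimates: list[float], variances: list[float]) -> float:
    """I² as a percentage, from Cochran's Q with inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical studies as (true positives, false negatives); invented numbers.
studies = [(95, 5), (60, 40), (80, 20), (88, 12)]
logits = [logit(tp / (tp + fn)) for tp, fn in studies]
variances = [1 / tp + 1 / fn for tp, fn in studies]  # approx. variance of a logit
print(f"I² = {i_squared(logits, variances):.0f}%")
```

With sensitivities ranging from 60% to 95%, these invented studies land in the high-disagreement band, the situation in which the warning above says to explain the differences before averaging.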
"When studies disagree,
do not silence the dissent.
Ask: Why do they see differently?
The disagreement itself is telling."
==================== Module 13: The Toolkit ====================
Your DTA Toolkit
Essential measures and when to use them
The Checklist

Was there a valid reference standard?

Gold standard applied to ALL patients?

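Were the test interpreters blinded?

Test readers unaware of diagnosis?

Was the patient spectrum appropriate?

Patients similar to your population?

Was the threshold pre-specified?

Or chosen to maximize results?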

When Results Don't Match Suspicion

The Clinical Override Decision Tree

Test Negative, High Suspicion
What Is the LR-?
  LR- < 0.1: Strong rule-out (accept negative)
  LR- 0.1-0.5: Consider repeat test (or different test)
  LR- > 0.5: Trust clinical judgment (test is weak)
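The override tree can be written as a small lookup. This sketch assumes the boundary value 0.5 belongs to the middle branch, which the tree does not specify, and it uses the 55% asymptomatic sensitivity from Module 9 with a 99% specificity that is assumed for illustration. The function name is hypothetical.

```python
def interpret_negative(lr_neg: float) -> str:
    """Map an LR- to the action in the override tree (high clinical suspicion)."""
    if lr_neg < 0:
        raise ValueError("LR- cannot be negative")
    if lr_neg < 0.1:
        return "Strong rule-out: accept the negative result"
    if lr_neg <= 0.5:   # assigning exactly 0.5 to this branch is an assumption
        return "Consider a repeat test or a different test"
    return "Test is weak: trust clinical judgment"


# Sensitivity 55% (Module 9, asymptomatic); specificity 99% is assumed.
lr_neg = (1 - 0.55) / 0.99
print(f"LR- = {lr_neg:.2f}: {interpret_negative(lr_neg)}")
```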
"Armed with sensitivity, specificity, likelihood,
equipped with SROC and measures of consistency,
you can see through a test's lies
and judge its truth for yourself."
==================== Module 14: Quiz and References ====================
References

Key Sources

  1. Carreyrou J. Bad Blood. Knopf, 2018.
  2. CDC. MMWR. 1987;36(49):833-840. [HIV blood supply]
  3. Dinnes J et al. Cochrane Database Syst Rev. 2022;7:CD013705. [COVID RAT]
  4. Moyer VA. Ann Intern Med. 2012;157:120-134. [PSA screening]
  5. UK Panel. Lancet. 2012;380:1778-1786. [Mammography]
  6. WHO. Malaria RDT Performance. 2022.
  7. Reitsma JB et al. J Clin Epidemiol. 2005;58:982-990. [Bivariate model]
  8. Deeks JJ et al. J Clin Epidemiol. 2005;58:882-893. [Publication bias]
  9. Macaskill P et al. Cochrane Handbook Ch. 10. 2023.
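A test has 99% sensitivity and 99% specificity. The disease prevalence is 1/1000. A patient tests positive. What is the probability that the patient has the disease?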
99%
90%
About 9%
50%
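Why was the blood supply contaminated with HIV despite testing?
The tests had low specificity
The tests had a window period of low sensitivity in early infection
The tests were not performed correctly
The tests were too expensive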
What does "SnNout" mean?
A highly Sensitive test, when Negative, rules OUT disease
A highly Specific test, when Negative, rules OUT disease
Sensitivity should be used for screening
Specificity should be above 90%
Course Complete
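"Now you know the four outcomes,
the two virtues of a test,
the base rate fallacy,
and the art of pooling evidence.

When the next test lies to you,
you will know."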