When Tests Lie: A Course on Diagnostic Test Accuracy (Enhanced Edition)

==================== Module 1: The Fraud ====================

Have you not heard of the woman
who promised to change the world with a single drop of blood,
who raised billions on a test that never worked?

Palo Alto, 2003

STANFORD UNIVERSITY

A 19-year-old dropped out with a vision: hundreds of blood tests run from a single drop.

Investors believed. Walgreens believed. The Pentagon believed.

They valued her company at $9 billion.

But the tests gave wrong results. Patients were told they had HIV when they did not. Patients were told their blood was normal when they were dying.

Carreyrou J. Bad Blood. 2018

The Decision Tree of Deception

What Theranos Did vs. What Should Happen

New Diagnostic Test

↓

SHOULD DO

Validate Against Gold Standard

↓

Publish TP/FP/FN/TN

↓

FDA Approval

THERANOS DID

Skip Validation

↓

Hide Failures

↓

Harm Patients

"And the test lied,
and the lie wore the robes of certainty,
and no one demanded the 2x2 table."

This is why we study diagnostic test accuracy.

==================== Module 2: The Four Outcomes ====================

When a test speaks,
there are only four possible truths.

Two are blessings. Two are curses.

The Tree of Outcomes

Every Test Result Has a Reality Behind It

Patient Tested

↓

What is the truth?

Has Disease (D+)

↓

TP: Test +

FN: Test -

No Disease (D-)

↓

FP: Test +

TN: Test -

The Sacred 2x2 Table

HIV Rapid Test Example (Real Data)

	HIV+	HIV-	Total
Test +	98	3	101
Test -	2	895	897
Total	100	898	998

Every truth comes from this table

Sensitivity = 98/100 = 98%
Specificity = 895/898 = 99.7%
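
The two figures above can be recomputed directly from the four cells of the table. The following is a minimal sketch, not part of the original course; the class name TwoByTwo and its method names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2x2 table (hypothetical helper, not from the course)."""
    tp: int  # diseased, test positive
    fp: int  # healthy, test positive
    fn: int  # diseased, test negative
    tn: int  # healthy, test negative

    def sensitivity(self) -> float:
        # Share of diseased people the test catches
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of healthy people the test correctly clears
        return self.tn / (self.tn + self.fp)


# Cell counts taken from the HIV rapid test table above
hiv = TwoByTwo(tp=98, fp=3, fn=2, tn=895)
print(f"Sensitivity: {hiv.sensitivity():.1%}")
print(f"Specificity: {hiv.specificity():.1%}")
```

Keeping the four counts together makes it harder to report one measure while quietly dropping the cells that produced it.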

"Two outcomes save. Two outcomes harm.
TP, TN: the test told the truth.
FP, FN: the test lied.
Know them by name, for they determine fate."

==================== Module 3: The HIV Window Period ====================

Have you not heard of the blood that was tested,
found clean,
and given to thousands
while death swam within it?

The Blood Supply Crisis, 1985

UNITED STATES

When HIV testing began, doctors celebrated: they could now screen the blood supply.

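But the test had a window period: the weeks after infection when the virus was present but undetectable.

Blood was tested. The result was "negative." The blood was transfused.

8,000-12,000 Americans were infected through transfusion before better tests became available.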

CDC. MMWR. 1987;36(49):833-840

The Window Period Decision Tree

Why False Negatives Are Deadly

Person Recently Infected

↓

Time Since Infection?

< 2 weeks

Test NEGATIVE: virus present!

↓

Blood Donated: others infected

> 4 weeks

Test POSITIVE: correctly detected

↓

Blood Discarded: supply safe

Sensitivity Changes Over Time

0%
Day 1-7
Eclipse period

~50%
Day 14
Seroconversion

~95%
Day 21
Most detected

99.9%
Day 45+
Window closed

THE LESSON

Sensitivity is not fixed. It depends on when you test. A "99% sensitive" test may be 0% sensitive in early infection.

"And the test said 'clean,'
for the virus had not yet shown itself.
And the blood was shared,
and the infection spread to the innocent."

==================== Module 4: Sensitivity and Specificity ====================

A test has two virtues and two vices.

Sensitivity: Can it find the sick?

Specificity: Can it spare the healthy?

Sensitivity: The Hunter

THE FORMULA

Sensitivity = TP / (TP + FN)

"Of all the sick, how many did we catch?"

Worked Example: COVID PCR Test

Given: 200 infected patients tested

TP = 196 (correctly positive), FN = 4 (missed)

Sensitivity = 196 / (196 + 4) = 196/200 = 98%

Interpretation: Test catches 98 of every 100 infected people

Specificity: The Guardian

THE FORMULA

Specificity = TN / (TN + FP)

"Of all the healthy, how many did we spare?"

Worked Example: Same COVID PCR Test

Given: 1000 uninfected people tested

TN = 999 (correctly negative), FP = 1 (false alarm)

Specificity = 999 / (999 + 1) = 999/1000 = 99.9%

Interpretation: Test correctly clears 999 of every 1000 healthy people

The Memory Rules

When to Use Which Test

What do you need?

RULE OUT disease

Use HIGH SENSITIVITY

↓

SnNout: Sensitive, Negative = OUT

RULE IN disease

Use HIGH SPECIFICITY

↓

SpPin: Specific, Positive = IN

"Sensitivity catches the sick.
Specificity spares the well.
But no test masters both perfectly;
this is the burden we bear."

==================== Module 5: The Base Rate Fallacy ====================

Have you not seen the doctor
who saw 99% accurate
and believed a positive result meant 99% certainty?

This is the deadliest mistake in medicine.

The Base Rate Fallacy

THE PUZZLE

A disease affects 1 in 1000 people.
The test has 99% sensitivity and 99% specificity.
A patient tests positive.

What is the probability that they have the disease?

Most doctors say ~99%. The real answer is about 9%.

The Math Revealed

Testing 100,000 People (Prevalence 1/1000)

Step 1: 100 have disease, 99,900 healthy

Step 2: Of 100 sick: 99 test positive (TP), 1 negative (FN)

Step 3: Of 99,900 healthy: 999 test positive (FP), 98,901 negative (TN)

Step 4: Total positives = 99 + 999 = 1,098

PPV = TP / All Positives = 99 / 1,098 = 9%

91% of positive results are false positives.
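
The four steps above can be replayed as whole-number counts. This is a minimal sketch; the function name natural_frequencies and its parameter names are illustrative assumptions, and the inputs are the ones stated in the puzzle.

```python
def natural_frequencies(population, prevalence, sensitivity, specificity):
    """Split a tested population into TP/FN/FP/TN counts (hypothetical helper)."""
    sick = round(population * prevalence)
    healthy = population - sick
    tp = round(sick * sensitivity)     # sick people who test positive
    fn = sick - tp                     # sick people the test misses
    tn = round(healthy * specificity)  # healthy people correctly cleared
    fp = healthy - tn                  # healthy people falsely alarmed
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


counts = natural_frequencies(100_000, 1 / 1000, 0.99, 0.99)
ppv = counts["TP"] / (counts["TP"] + counts["FP"])
print(counts)
print(f"PPV = {ppv:.0%}")
```

Counting people rather than multiplying probabilities makes it visible that the 999 false positives simply outnumber the 99 true positives.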

The Prevalence Decision Tree

Same Test, Different Settings

Test: 99% Sens, 99% Spec

↓

Where Is Testing Done?

General Population
Prevalence 0.1%

PPV = 9%: 91% false positives!

High-Risk Clinic
Prevalence 10%

PPV = 92%: 8% false positives

Confirmatory Test
Prevalence 50%

PPV = 99%: 1% false positives
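
The three settings differ only in prevalence, so one application of Bayes' theorem covers them all. A minimal sketch follows; the function name positive_predictive_value is an illustrative assumption.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive test) by Bayes' theorem."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


settings = [
    ("General population", 0.001),
    ("High-risk clinic", 0.10),
    ("Confirmatory test", 0.50),
]
for name, prevalence in settings:
    value = positive_predictive_value(prevalence, 0.99, 0.99)
    print(f"{name}: PPV = {value:.0%}")
```

The test itself never changes in this loop; only the population it is applied to does.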

"And the doctor said, '99% accurate.'
And the patient heard, '99% certain.'
And both were deceived,
for they forgot to ask: 'How rare is this disease?'"

==================== Module 6: The PSA Controversy ====================

Have you not heard of the test for men
that found cancers that would never kill,
and led to treatments that destroyed lives?

The PSA Screening Tragedy

UNITED STATES, 1990s-2010s

PSA (Prostate-Specific Antigen) could detect prostate cancer early.

Doctors tested millions of men. They found cancers. They removed prostates.

But many of these "cancers" would never have caused symptoms. Surgery caused impotence and incontinence in men who would have died of old age, not cancer.

Moyer VA. Ann Intern Med. 2012;157:120-134

The Tally of Harm

0-1
Lives saved from prostate cancer
per 1000 screened

30-40
Men made impotent or incontinent
per 1000 screened

100+
False positives (biopsies, anxiety)
per 1000 screened

THE REVERSAL

In 2012, the US Preventive Services Task Force recommended against routine PSA screening. The test found too many things that did not need to be found.

The Screening Decision Tree

The Unintended Consequences of Screening

1000 Men Screened

↓

~120 Positive PSA

↓

~30 Biopsies Show Cancer

↓

~25 Would Never Have Harmed

~5 Truly Aggressive

~880 Negative PSA

↓

Reassured (but ~3 aggressive cancers missed)
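
The approximate counts in this tree can be turned into a few summary ratios. The sketch below uses only the rounded figures shown in the tree, so the results are rough illustrations rather than published estimates; the variable names are assumptions.

```python
# Approximate counts read off the screening tree (per 1000 men screened)
positive_psa = 120
cancers_on_biopsy = 30
indolent_found = 25     # cancers that would never have harmed
aggressive_found = 5
aggressive_missed = 3   # aggressive cancers among men with a negative PSA

# Share of positive PSA results that lead to any cancer diagnosis
ppv_any_cancer = cancers_on_biopsy / positive_psa

# Share of detected cancers that would never have caused harm
overdiagnosed_share = indolent_found / cancers_on_biopsy

# Sensitivity for the cancers that matter: aggressive ones
sens_aggressive = aggressive_found / (aggressive_found + aggressive_missed)

print(f"PPV for any cancer: {ppv_any_cancer:.0%}")
print(f"Overdiagnosed share of detected cancers: {overdiagnosed_share:.0%}")
print(f"Sensitivity for aggressive cancer: {sens_aggressive:.1%}")
```

On these figures, most positives are not cancer, most cancers found are harmless, and a sizeable share of the aggressive cancers still slip through.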

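"And the test found a shadow,
and the surgeon cut,
and the man lived on, impotent and incontinent,
spared from a cancer that would never have awakened."

==================== Module 7: Likelihood Ratios ====================

Sensitivity describes the test.
Specificity describes the test.

But the patient asks: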
"I tested positive. What are MY chances?"

Likelihood Ratios

POSITIVE LIKELIHOOD RATIO

LR+ = Sensitivity / (1 - Specificity)

How much more likely is a + result in sick vs healthy?

NEGATIVE LIKELIHOOD RATIO

LR- = (1 - Sensitivity) / Specificity

How much more likely is a - result in sick vs healthy?
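
Both ratios follow directly from sensitivity and specificity. As a minimal sketch, the snippet below applies the two formulas to the HIV rapid test table from Module 2; the function name likelihood_ratios is an illustrative assumption.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity 98/100 and specificity 895/898, from the Module 2 table
lr_pos, lr_neg = likelihood_ratios(98 / 100, 895 / 898)
print(f"LR+ = {lr_pos:.0f}")
print(f"LR- = {lr_neg:.3f}")
```

A very large LR+ and a very small LR- mean both a positive and a negative result shift the probability of disease substantially.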

Fagan's Nomogram

From Pre-Test to Post-Test Probability

[Figure: Fagan's nomogram with three scales: pre-test probability, likelihood ratio, and post-test probability]

Draw a line from pre-test through LR to find post-test probability
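
The line drawn on the nomogram is a graphical shortcut for a calculation on the odds scale: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. A minimal sketch, with an illustrative function name and hypothetical example inputs:

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical patient: 20% pre-test probability, positive test with LR+ = 10
print(f"{post_test_probability(0.20, 10):.0%}")

# The Module 5 puzzle again: prevalence 1 in 1000, LR+ = 0.99 / 0.01 = 99
print(f"{post_test_probability(0.001, 99):.0%}")
```

The second call reproduces the roughly 9% answer from the base rate puzzle, which shows the nomogram and the 2x2 table are two views of the same arithmetic.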

Interpreting Likelihood Ratios

How powerful is this test?

What Is the LR+?

LR+ > 10: Strong rule-in

LR+ 5-10: Moderate

LR+ 2-5: Weak

LR+ 1-2: Useless

What Is the LR-?

LR- < 0.1: Strong rule-out

LR- 0.1-0.2: Moderate

LR- 0.2-0.5: Weak

LR- 0.5-1: Useless

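"Sensitivity speaks of the sick.
Specificity speaks of the well.
But the likelihood ratio answers:
what does this result mean for this patient?"

==================== Module 8: Malaria RDTs ====================

Have you not seen the feverish child in the village,
the rapid test that said negative,
and the Plasmodium that kept multiplying?

The Malaria RDT Problem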

SUB-SAHARAN AFRICA

Malaria kills 600,000 people yearly, mostly children under 5.

Rapid Diagnostic Tests were meant to guide treatment in remote areas without microscopes or laboratories.

But when parasitemia is low, RDTs miss cases. And when P. falciparum has deleted its HRP2 gene, the RDT sees nothing at all.

WHO. Malaria RDT Performance. 2022

The Clinical Decision Tree

Child with Fever in Malaria-Endemic Area

Febrile Child

↓

Perform RDT

↓

RDT Positive

↓

Treat for Malaria

RDT Negative

↓

Clinical Suspicion?

High

Treat Anyway
or Microscopy

Low

Look for
Other Cause

Sensitivity Varies by Parasitemia

95%

High parasitemia
(>200/μL)

75%

Low parasitemia
(100-200/μL)

50%

Very low
(<100/μL)

The Clinical Lesson

A negative RDT does not rule out malaria in endemic areas. Clinical judgment must override the test when suspicion is high.

"And the test said 'negative,'
and the child was sent home,
and the parasites multiplied in the dark,
and by morning the child could not be woken."

==================== Module 9: COVID Rapid Tests ====================

In the year of the plague,
the world needed a test that was fast.

But fast is not the same as accurate.

The Cochrane Verdict

COVID-19 Rapid Antigen Tests (155 Studies Pooled)

Population	Sensitivity	Missed Cases
Symptomatic	73%	27% missed
Asymptomatic	55%	45% missed
First 7 days of symptoms	80%	20% missed

Dinnes J et al. Cochrane Database Syst Rev. 2022;7:CD013705

The False Security Decision Tree

Thanksgiving 2020: What Happened

Family Member Tests Negative

↓

Is this person truly uninfected?

Not infected

True Negative: safe to gather

Infected but asymptomatic (the test misses 45% of these cases)

FALSE Negative: infectious!

↓

Gathers with family: grandparents infected
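
The 45% figure is the share of infected, asymptomatic people whom the test misses. How likely any one negative result is to be wrong also depends on how common infection is among those tested. The sketch below illustrates this; the specificity of 99.5% and the three prevalence values are assumptions chosen for illustration, not figures from the Cochrane review, and the function name is hypothetical.

```python
def prob_infected_given_negative(prevalence, sensitivity, specificity):
    """P(infected | negative test), i.e. 1 - NPV, by Bayes' theorem."""
    false_negative_rate = (1 - sensitivity) * prevalence
    true_negative_rate = specificity * (1 - prevalence)
    return false_negative_rate / (false_negative_rate + true_negative_rate)


SENSITIVITY_ASYMPTOMATIC = 0.55  # from the table above
ASSUMED_SPECIFICITY = 0.995      # assumption for illustration only

for prevalence in (0.01, 0.05, 0.20):  # hypothetical prevalence values
    risk = prob_infected_given_negative(
        prevalence, SENSITIVITY_ASYMPTOMATIC, ASSUMED_SPECIFICITY
    )
    print(f"Prevalence {prevalence:.0%}: {risk:.1%} of negatives are infected")
```

The risk hidden behind a negative result grows with prevalence, which is why a negative rapid test offers the least reassurance exactly when infection is most widespread.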

"And the test said 'negative,'
and the family embraced,
and by the end of winter
grandfather [...]"

==================== Module 10: Mammography and Overdiagnosis ====================

Have you not heard of the test
that found cancers that would never kill,
and led to treatments that caused more harm than the disease?

The Overdiagnosis Problem

3-4

Lives saved
per 10,000 screened

~15

Overdiagnosed
(treated unnecessarily)

~500

False alarms
(anxiety, biopsies)

THE QUESTION

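To save 3-4 lives, roughly 15 women undergo surgery, radiation, or chemotherapy for cancers that would never have harmed them.

Is this trade-off worth it?

The Screening Decision Tree

10,000 Women Screened Over 10 Years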

10,000 Women

↓

~1,000 Recalled: abnormal mammogram

↓

~500 False Alarm: anxiety only

~500 Biopsy: ~50 cancers found

~9,000 Cleared: continue screening

Of ~50 Cancers Found

~35 Would Kill: treatment saves 3-4

~15 Would Never Kill: overdiagnosed

"And the test found a shadow,
and called it cancer,
and the woman was cut and burned
for a shadow that would never have darkened her days."

==================== Module 11: Meta-Analysis and the SROC Curve ====================

A single study may deceive.
A single study may flatter.

But gather all the evidence,
and the truth becomes harder to hide.

Why DTA Meta-Analysis Is Different

THE PROBLEM

Sensitivity and specificity are correlated. When one goes up, the other tends to go down.

They cannot be pooled separately the way treatment effects are. What is needed is a bivariate model.

The SROC Curve

Reading ROC Space

Top-Left Corner: Perfect Test

↓ (curve shows trade-off)

Diagonal Line: Useless Test (Chance)

What the SROC Shows

Each dot = one study's sensitivity & specificity
Curve = summary across all studies
Closer to top-left = better test
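
The trade-off traced by the curve can be illustrated with a single hypothetical biomarker. The sketch below is not the bivariate meta-analysis model; it only shows why studies that pick different cutoffs land on different points of the same curve. The two normal distributions and their parameters are assumptions made for illustration.

```python
from statistics import NormalDist

# Hypothetical biomarker distributions (assumed for illustration)
healthy = NormalDist(mu=0.0, sigma=1.0)
diseased = NormalDist(mu=2.0, sigma=1.0)


def operating_point(threshold):
    """Sensitivity and specificity when values above the threshold count as positive."""
    sensitivity = 1 - diseased.cdf(threshold)  # diseased above the cutoff
    specificity = healthy.cdf(threshold)       # healthy below the cutoff
    return sensitivity, specificity


for threshold in (0.5, 1.0, 1.5):
    sens, spec = operating_point(threshold)
    print(f"cutoff {threshold}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Raising the cutoff buys specificity at the cost of sensitivity, so each cutoff corresponds to one dot in ROC space.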

"One study may deceive.
Weigh many studies together,
and trace the SROC curve,
the path of truth that reveals what the test can really do."

==================== Module 12: Heterogeneity ====================

But what if the studies disagree?

One says sensitivity is 95%.
Another says 60%.

Which truth will you believe?

Sources of Heterogeneity

Why Studies Disagree

Same test, different results?

Threshold: different cutoffs

Population: severity, age

Setting: primary vs specialist care

Quality: bias, blinding

Measuring Disagreement: I²

I² < 25%

Low
Studies agree

I² 25-75%

Moderate
Some variation

I² > 75%

High
Major disagreement

THE WARNING

When I² > 75%, the pooled estimate may be meaningless. Explain the disagreement before averaging.
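
As a rough illustration of how I² is obtained, the sketch below computes Cochran's Q and I² for sensitivity on the logit scale with inverse-variance weights. The four studies are hypothetical, and this simple univariate calculation is an assumption made for illustration; it is not the bivariate approach described in Module 11.

```python
import math


def logit_sensitivity(tp, fn):
    """Logit of sensitivity and its approximate variance for one study."""
    return math.log(tp / fn), 1 / tp + 1 / fn


def i_squared(estimates, variances):
    """I² (in percent) from Cochran's Q with inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical studies as (TP, FN): sensitivities of 95%, 60%, 80%, 88%
studies = [(95, 5), (60, 40), (80, 20), (88, 12)]
estimates, variances = zip(*(logit_sensitivity(tp, fn) for tp, fn in studies))
i2 = i_squared(estimates, variances)
print(f"I² = {i2:.0f}%")
```

With studies this far apart the value lands in the "high" band, which is the signal to look for a cause (threshold, population, setting, quality) before pooling.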

"When studies disagree,
do not silence the dissent.
Ask: Why do they see differently?
The disagreement itself is the lesson."

==================== Module 13: The Toolkit ====================

The DTA Toolkit

The Key Measures and When to Use Them

The Checklist

✓

Was there a valid reference standard?

Gold standard applied to ALL patients?

✓

Were the test readers blinded?

Test readers unaware of diagnosis?

✓

Was the patient spectrum appropriate?

Patients similar to the target population?

✓

Was the threshold pre-specified?

Or chosen to maximize results?

When Results Don't Match Suspicion

The Clinical Override Decision Tree

Test Negative, High Suspicion

↓

What Is the LR-?

LR- < 0.1

Strong rule-out: accept negative

LR- 0.1-0.5

Consider repeat test, or a different test

LR- > 0.5

Trust clinical judgment: test is weak
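
The three branches of this tree can be written as a small lookup. This is a minimal sketch of the tree as drawn, not clinical guidance; the function name is an illustrative assumption, and the value 0.5 is assigned to the middle branch by choice since the tree does not say which side it belongs to.

```python
def interpret_negative(lr_neg):
    """Map a negative likelihood ratio to the tree's three branches."""
    if lr_neg < 0.1:
        return "Strong rule-out: accept negative"
    if lr_neg <= 0.5:
        return "Consider repeat test, or a different test"
    return "Trust clinical judgment: test is weak"


# Hypothetical LR- values, one per branch
for lr_neg in (0.02, 0.45, 0.8):
    print(f"LR- = {lr_neg}: {interpret_negative(lr_neg)}")
```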

"Armed with sensitivity, specificity, likelihood,
armed with SROC and measures of agreement,
you can see through a test's lies
and judge its truth for yourself."

==================== Module 14: Quiz and References ====================

References

Key Sources

Carreyrou J. Bad Blood. Knopf, 2018.
CDC. MMWR. 1987;36(49):833-840. [HIV blood supply]
Dinnes J et al. Cochrane Database Syst Rev. 2022;7:CD013705. [COVID RAT]
Moyer VA. Ann Intern Med. 2012;157:120-134. [PSA screening]
UK Panel. Lancet. 2012;380:1778-1786. [Mammography]
WHO. Malaria RDT Performance. 2022.
Reitsma JB et al. J Clin Epidemiol. 2005;58:982-990. [Bivariate model]
Deeks JJ et al. J Clin Epidemiol. 2005;58:882-893. [Publication bias]
Macaskill P et al. Cochrane Handbook Ch. 10. 2023.

A test has 99% sensitivity and 99% specificity. Disease prevalence is 1 in 1000. A patient tests positive. What is the probability that they have the disease?

99%

90%

About 9%

50%

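Why was the blood supply contaminated with HIV despite testing?

The tests had low specificity

The tests had a window period with low sensitivity in early infection

The tests were not performed correctly

The tests were too expensive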

What does "SnNout" mean?

A highly Sensitive test, when Negative, rules OUT disease

A highly Specific test, when Negative, rules OUT disease

Sensitivity should be used for screening

Specificity should be above 90%

✔

Course Complete

"Now you know the four outcomes,
the two virtues of a test,
the base rate fallacy,
and the art of gathering evidence.

When the next test is put before you,
you will know."
