Retraction Fragility Quantifier — How robust is your meta-analysis to study retractions?
0 for log scales, 0 for SMD
Max # studies to retract combinatorially
| Retracted Study | Pooled Est. | Change | % Change | Reversal? | New P |
|---|---|---|---|---|---|
Bars show the change in pooled estimate when each study is removed. Sorted by absolute impact.
Studies removed in order of most-to-least impactful. Shows how quickly significance erodes.
| Study | Leverage (h_i) | Stud. Resid. (r_i*) | Cook's D | DFBETAS | Flags |
|---|---|---|---|---|---|
X = contribution to Q (heterogeneity), Y = influence on pooled estimate. Upper-right = influential + heterogeneous.
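As a rough illustration of the two Baujat axes, the sketch below computes one common fixed-effect formulation: the X value is each study's term in Cochran's Q, and the Y value is the squared shift in the pooled estimate when the study is dropped, scaled by the variance of the leave-one-out estimate. The function name `baujat_points` is hypothetical, and the tool itself may use a different standardization.

```python
def baujat_points(y, v):
    """Return (q_contribution, influence) per study under a fixed-effect model.

    y: effect estimates, v: their sampling variances (needs at least 2 studies).
    Assumption: inverse-variance weights and no between-study variance.
    """
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    theta = sum(wi * yi for wi, yi in zip(w, y)) / sw
    points = []
    for i in range(len(y)):
        # X axis: this study's term in Cochran's Q.
        q_i = w[i] * (y[i] - theta) ** 2
        # Leave-one-out pooled estimate, obtained by subtracting study i's weight.
        sw_i = sw - w[i]
        theta_i = (sw * theta - w[i] * y[i]) / sw_i
        # Y axis: squared shift divided by Var(theta_i) = 1 / sw_i.
        influence = (theta - theta_i) ** 2 * sw_i
        points.append((q_i, influence))
    return points


pts = baujat_points([0.1, 0.1, 0.9], [0.04, 0.04, 0.04])
for i, (qx, iy) in enumerate(pts):
    print(f"study {i}: Q contribution = {qx:.3f}, influence = {iy:.3f}")
```

In this toy input the third study has the largest value on both axes, so it would land in the upper-right corner of the plot.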
Distribution of pooled estimates across random study subsets. Bimodality suggests important subgroups.
X = precision (1/SE), Y = Z-score (effect/SE). The regression line through the origin has slope = pooled estimate. Studies outside the 95% band (±1.96) are heterogeneity-driving outliers. (Galbraith 1988)
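The Galbraith coordinates can be sketched in a few lines. The example below assumes the fixed-effect case, where the least-squares slope through the origin coincides with the inverse-variance pooled estimate; the function names are illustrative and not taken from the tool.

```python
def galbraith_points(effects, ses):
    """Return (precision, z) pairs: x = 1/SE, y = effect/SE."""
    return [(1.0 / se, e / se) for e, se in zip(effects, ses)]


def slope_through_origin(points):
    """Least-squares slope with no intercept: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in points) / sum(x * x for x, _ in points)


def outside_band(points, slope, z_crit=1.96):
    """Indices of studies whose z-score is more than z_crit from the line."""
    return [i for i, (x, y) in enumerate(points) if abs(y - slope * x) > z_crit]


effects = [0.2, 0.4, 1.5]
ses = [0.1, 0.2, 0.25]
pts = galbraith_points(effects, ses)
slope = slope_through_origin(pts)

# The same number computed directly as an inverse-variance weighted mean.
weights = [1.0 / se ** 2 for se in ses]
iv_estimate = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

print(round(slope, 4), round(iv_estimate, 4))
print(outside_band(pts, slope))
```

Because sum(x*y) = Σ effect/SE² and sum(x*x) = Σ 1/SE², the slope and the inverse-variance estimate agree term by term.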
| RFI | Interpretation |
|---|---|
| RFI = 1 | Extremely fragile: removing any single study reverses the conclusion. |
| RFI = 2–3 | Fragile: a small cluster of retractions can change the conclusion. |
| RFI ≥ 4 | Moderately robust: multiple retractions are needed. |
| RFI = k | Maximally robust: even removing all but 2 studies does not change the conclusion. |
1. Compute the full pooled estimate and its P-value.
2. For each subset of 1, 2, ..., d studies (up to max depth), re-pool without those studies.
3. The RFI is the smallest number of retractions that reverses significance (crosses P = 0.05).
4. The fragility curve shows P-value degradation as studies are greedily removed (most impactful first).
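Steps 1 to 3 above can be sketched as follows. This is a minimal illustration, assuming DerSimonian-Laird random-effects pooling with a two-sided z-test and a search that always keeps at least two studies; the names `pool_dl` and `rfi` are hypothetical, and the tool may pool or search differently.

```python
import math
from itertools import combinations


def pool_dl(y, v):
    """DerSimonian-Laird pooled estimate, its SE, and a two-sided z-test P-value."""
    k = len(y)
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    ws = [1.0 / (vi + tau2) for vi in v]
    est = sum(wi * yi for wi, yi in zip(ws, y)) / sum(ws)
    se = math.sqrt(1.0 / sum(ws))
    p = math.erfc(abs(est / se) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return est, se, p


def rfi(y, v, max_depth=3, alpha=0.05):
    """Smallest number of removed studies that flips significance.

    Returns (depth, indices_removed), or None if no subset up to max_depth flips it.
    """
    k = len(y)
    full_sig = pool_dl(y, v)[2] < alpha
    for d in range(1, min(max_depth, k - 2) + 1):  # keep at least 2 studies
        for drop in combinations(range(k), d):
            keep = [i for i in range(k) if i not in drop]
            p = pool_dl([y[i] for i in keep], [v[i] for i in keep])[2]
            if (p < alpha) != full_sig:
                return d, drop
    return None


y = [0.4, 0.1, 0.1]     # illustrative effect estimates
v = [0.01, 0.04, 0.04]  # illustrative sampling variances
print(pool_dl(y, v))
print(rfi(y, v))  # -> (1, (0,))
```

The exhaustive search grows combinatorially with depth, which is why a maximum depth is needed for larger k.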
| Method | Definition |
|---|---|
| Cook's D | D_i = (β_full − β_{−i})² / Var(β). Threshold: 4/k. (Viechtbauer & Cheung 2010, Res Synth Methods) |
| DFBETAS | (β_full − β_{−i}) / SE_{−i}. Threshold: |DFBETAS| > 2/√k. Standardized influence on the pooled estimate. |
| Leverage (h_i) | h_i = w_i* / Σ_j w_j*, where w_i* = 1/(v_i + τ²). High leverage means the study dominates the pooled weight. |
| Studentized Residual | r_i* = (y_i − β) / √((v_i + τ²)(1 − h_i)). |r_i*| > 2 suggests an outlier. |
| Baujat Plot | X = study's contribution to Q, Y = influence on pooled estimate. Upper-right studies are both heterogeneous and influential. (Baujat et al. 2002, Stat Med) |
| GOSH | Graphical Overview of Study Heterogeneity. Pools all 2^k − 1 subsets (capped at 1000 random subsets for k > 15). Bimodality suggests subgroups. |
| Credibility Ceiling | Maximum bias probability u at which the conclusion still holds with inflated variance v_i + u²·θ_i². (Ioannidis 2017, J Clin Epidemiol) |
| Prediction Interval | β ± t_{k−2} · √(SE² + τ²). Range of true effects expected in a new study setting. |
| Galbraith (Radial) Plot | X = 1/SE (precision), Y = effect/SE (Z-score). Slope of regression through origin = pooled estimate. Studies outside ±1.96 band are outliers. (Galbraith 1988) |
| Doi Plot & LFK Index | Alternative to funnel plot. X = Z-score, Y = |Z|. LFK index: |LFK| < 1 none, 1–2 minor, > 2 major asymmetry. (Furuya-Kanamori 2018) |
| E-value | Minimum confounding strength to explain away the effect. E = RR + √(RR × (RR − 1)). Higher = more robust. (VanderWeele & Ding 2017) |
| Henmi-Copas CI | Publication bias-adjusted CI. Wider CI = fragile to selective reporting. (Henmi & Copas 2010, Stat Med) |
| Power Analysis | Post-hoc power and prospective sample size. Reports detectable effect at 80% power and studies needed for a target effect. |
| Robust MA (Huber) | Huber M-estimator with tuning constant 1.345 (the Huber constant, not the number of studies k). Iteratively downweights outliers. Compares with the standard DL estimate. |
| Sensitivity to Measure | Reinterprets log-OR as log-RR or SMD (and vice versa). Shows conclusion stability across effect measures. Uses √3/π ≈ 0.5513. |
| Permutation Test | Permutes signs of effect estimates to test heterogeneity. More accurate than χ² for small k. Uses all 2^k sign patterns exhaustively (k ≤ 20) or 1000 random permutations with a seeded PRNG. (Higgins & Thompson 2004) |
| REML Estimation | Restricted Maximum Likelihood via Fisher scoring. Less biased than DL for small k. Q-profile CI for τ². (Viechtbauer 2005, Stat Med) |
| Knapp-Hartung | Replaces the z-based CI with a t_{k−1}-based CI using adjusted variance: se_KH = se × √max(1, Q/(k−1)). Often substantially wider. (Knapp & Hartung 2003) |
| Profile Likelihood CI | Grid search (200 points) over β. Bounds are where −2LL rises by 3.841 above its minimum (the 95% critical value of χ² with 1 df). Typically asymmetric. (Hardy & Thompson 1996) |
| HKSJ Prediction Interval | Most conservative PI: β ± t_{k−2} × √(se_KH² + τ²_REML). Combines Knapp-Hartung variance with REML heterogeneity. |
| Fragility Direction | Classifies each LOO reversal as weight-driven (large study), effect-driven (extreme outlier), or both. Explains WHY the MA is fragile. |
| Temporal Analysis | Cumulative MA in input order. Tracks pooled estimate and RFI evolution as evidence accumulates. Shows whether fragility was always present or developed with specific studies. |
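The leverage, studentized residual, Cook's D, and DFBETAS definitions in the table above can be illustrated with a short sketch. It assumes τ² is supplied by the caller and held fixed when a study is deleted, which is a simplification; the tool may re-estimate τ² for each leave-one-out fit. The function name and the flag labels are hypothetical.

```python
import math


def influence_diagnostics(y, v, tau2=0.0):
    """Per-study leverage, studentized residual, Cook's D, DFBETAS, and flags.

    y: effect estimates, v: sampling variances, tau2: between-study variance
    (assumed known and not re-estimated on deletion). Needs at least 2 studies.
    """
    k = len(y)
    w = [1.0 / (vi + tau2) for vi in v]
    sw = sum(w)
    beta = sum(wi * yi for wi, yi in zip(w, y)) / sw
    var_beta = 1.0 / sw
    rows = []
    for i in range(k):
        h = w[i] / sw  # leverage: share of the total weight
        r = (y[i] - beta) / math.sqrt((v[i] + tau2) * (1.0 - h))
        sw_i = sw - w[i]
        beta_i = (sw * beta - w[i] * y[i]) / sw_i  # leave-one-out estimate
        se_i = math.sqrt(1.0 / sw_i)
        cooks = (beta - beta_i) ** 2 / var_beta
        dfbetas = (beta - beta_i) / se_i
        flags = []
        if abs(r) > 2.0:
            flags.append("outlier")
        if cooks > 4.0 / k:
            flags.append("cooks")
        if abs(dfbetas) > 2.0 / math.sqrt(k):
            flags.append("dfbetas")
        rows.append({"leverage": h, "r_student": r, "cooks_d": cooks,
                     "dfbetas": dfbetas, "flags": flags})
    return rows


for i, row in enumerate(influence_diagnostics([0.0, 0.2, 0.9], [0.04] * 3)):
    print(i, {key: (round(val, 3) if isinstance(val, float) else val)
              for key, val in row.items()})
```

With equal variances every study has leverage 1/k, so in this toy input the flags are driven entirely by how far each effect sits from the pooled estimate.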