Mahmood Ahmad
Tahir Heart Institute
author@example.com

CT.gov Structural Missingness

What information is missing from ClinicalTrials.gov records before one even asks whether results were posted? We analysed the March 29, 2026 full-registry snapshot and quantified structural missingness in publication links, individual participant data (IPD) sharing statements, detailed descriptions, locations, and outcome fields across sponsor groups. The source universe comprised 578,109 studies, so field-level omission rates and sponsor-specific sparsity patterns could be computed directly rather than estimated from a sample. Across the full registry, 63.4 percent of records lacked publication links, 48.3 percent lacked IPD sharing statements, 32.7 percent lacked detailed descriptions, and 10.2 percent lacked locations. Structural sparsity was not evenly distributed: industry records remained heavily affected, NIH had the highest average hiddenness score among named sponsor classes, and the UNKNOWN class mostly reflected malformed metadata. Missingness therefore extends beyond results reporting into the descriptive fields needed for interpretation, replication, accountability, and public scrutiny across therapeutic areas. These metrics capture registry-visible information loss rather than proven intent to conceal.
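The field-level omission rates described above can be illustrated with a minimal sketch. It is not the code from the linked repository. The JSON key paths in FIELDS and SPONSOR_PATH are assumptions about the ClinicalTrials.gov API v2 record layout, the helper names (dig, is_missing, missingness_rates) are hypothetical, and the two demo records are toy inputs rather than registry data. A field counts as missing here when its key is absent or its value is empty, which may differ from the rule used in the actual analysis.

```python
from collections import defaultdict

# Assumed key paths into an API v2 study record (illustrative only).
FIELDS = {
    "publication_links": ("protocolSection", "referencesModule", "references"),
    "ipd_statement": ("protocolSection", "ipdSharingStatementModule", "ipdSharing"),
    "detailed_description": ("protocolSection", "descriptionModule", "detailedDescription"),
    "locations": ("protocolSection", "contactsLocationsModule", "locations"),
}
SPONSOR_PATH = ("protocolSection", "sponsorCollaboratorsModule", "leadSponsor", "class")


def dig(record, path):
    """Follow a key path through nested dicts; return None if any step is absent."""
    node = record
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def is_missing(value):
    """Treat None, blank strings, and empty containers as structurally missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, dict)):
        return len(value) == 0
    return False


def percent(count, n):
    """Percentage rounded to one decimal place; 0.0 for an empty group."""
    return round(100.0 * count / n, 1) if n else 0.0


def missingness_rates(records):
    """Return (overall, by_sponsor) dicts of percent-missing per field."""
    total = 0
    overall = defaultdict(int)
    group_n = defaultdict(int)
    group_missing = defaultdict(lambda: defaultdict(int))
    for rec in records:
        # Records without a readable sponsor class fall into UNKNOWN.
        sponsor = dig(rec, SPONSOR_PATH) or "UNKNOWN"
        total += 1
        group_n[sponsor] += 1
        for name, path in FIELDS.items():
            if is_missing(dig(rec, path)):
                overall[name] += 1
                group_missing[sponsor][name] += 1
    overall_pct = {f: percent(overall[f], total) for f in FIELDS}
    by_sponsor = {
        s: {f: percent(group_missing[s][f], n) for f in FIELDS}
        for s, n in group_n.items()
    }
    return overall_pct, by_sponsor


if __name__ == "__main__":
    # Toy records for demonstration; not drawn from the registry.
    demo = [
        {"protocolSection": {
            "sponsorCollaboratorsModule": {"leadSponsor": {"class": "INDUSTRY"}},
            "descriptionModule": {"detailedDescription": "Placeholder description."},
            "contactsLocationsModule": {"locations": [{"city": "Placeholder"}]},
        }},
        {"protocolSection": {
            "sponsorCollaboratorsModule": {"leadSponsor": {"class": "NIH"}},
            "referencesModule": {"references": [{"citation": "Placeholder"}]},
            "ipdSharingStatementModule": {"ipdSharing": "NO"},
        }},
    ]
    overall, by_sponsor = missingness_rates(demo)
    print(overall)
    print(by_sponsor)
```

Because the denominator is every record in the snapshot, these percentages are exact counts over the registry rather than sample estimates. The hiddenness score is not reproduced here because the text does not define it.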

Outside Notes

Type: methods
Primary estimand: Field-level structural missingness across the full registry
App: CT.gov Structural Missingness dashboard
Data: 578,109 ClinicalTrials.gov records from the March 29, 2026 full-registry snapshot
Code: https://github.com/mahmood726-cyber/ctgov-structural-missingness
Version: 1.0.0
Validation: FULL REGISTRY RUN

References

1. ClinicalTrials.gov API v2. National Library of Medicine. Accessed March 29, 2026.
2. Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement. BMJ. 2021;372:n71.
3. Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. Introduction to Meta-Analysis. 2nd ed. Wiley; 2021.

AI Disclosure

This work represents a compiler-generated evidence micro-publication built from structured registry data and deterministic summary code. AI was used as a constrained coding and drafting assistant for interface generation, packaging, and prose refinement, not as an autonomous author. The analytical choices, interpretation, and final outputs were reviewed by the author, who takes responsibility for the content.
