Manuscript In Revision

Calibration-to-Deployment Mismatch in HIV Prevention Trials

How structural censoring biases counterfactual incidence estimates in PURPOSE 1, PURPOSE 2, and other RITA-based PrEP efficacy trials.

01 — The Problem

A Closed-System Assumption Inside an Open-System Population

Cross-sectional HIV incidence estimation using recent-infection testing algorithms (RITAs) — the Kassanjee estimator, extended with delta-method variance by Gao et al. — is the analytic backbone of counterfactual-controlled PrEP efficacy trials. It supplied the background incidence (bHIV) for the pivotal PURPOSE 1 (NCT04994509) and PURPOSE 2 (NCT04925752) trials of long-acting injectable lenacapavir.

The estimator's validity rests on a single assumption: that every individual infected within the recency window [0, T] is observable at screening, that is, alive, non-incarcerated, housed, and presenting for testing with uniform probability. This is a convenience of calibration, not a biological claim. In populations experiencing structural censoring (overdose mortality, incarceration, displacement, carceral disruption of healthcare, intimate partner violence), the assumption fails directionally and quantifiably. The failure mode is not random noise. It is systematic deflation of estimated background incidence, with magnitude correlated with the very structural vulnerabilities that trial designs ostensibly aim to address.

High-burden US MSAs analyzed (AIDSVu 2023): 34
Kassanjee denominator inflation, empirical range: 8.7–27.3%
Max IRR attenuation, highest-severity empirical cohort: 3.1%

02 — The Mechanism of Bias

Survival-Biased Effective MDRI

Let γ(t) denote the instantaneous hazard of structural censoring: the per-unit-time rate of transition into an unobservable state. The corresponding probability of remaining observable t days after infection is S_c(t) = exp(−∫₀ᵗ γ(u) du). The effective mean duration of recent infection, conditional on the population's structural hazard profile, integrates over the recency window the joint probability that an individual infected at time 0 (i) tests recent at time t since infection, P_R(t), and (ii) remains observable at screening, S_c(t):

$$\Omega^{*}(\gamma) = \int_{0}^{T} P_R(t)\, S_c(t)\, dt$$
Distribution-free: requires no parametric form for γ or P_R
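Because the integral makes no parametric assumption, it can be evaluated numerically for any hazard profile. The following is a minimal sketch rather than the manuscript's implementation. The function name `effective_mdri`, the trapezoid rule, the 5000-day horizon, P_0 = 1, and the front-loaded hazard are illustrative assumptions. The exponential form of P_R is used only as a placeholder.

```python
import math
from typing import Callable


def effective_mdri(
    p_recent: Callable[[float], float],
    hazard: Callable[[float], float],
    horizon_days: float,
    n_steps: int = 20_000,
) -> float:
    """Trapezoid-rule estimate of the integral of P_R(t) * S_c(t) over [0, T].

    S_c(t) = exp(-cumulative hazard) is accumulated on the same grid.
    """
    dt = horizon_days / n_steps
    cum_hazard = 0.0
    prev_hazard = hazard(0.0)
    prev_integrand = p_recent(0.0)  # S_c(0) = 1
    total = 0.0
    for i in range(1, n_steps + 1):
        t = i * dt
        h = hazard(t)
        cum_hazard += 0.5 * (prev_hazard + h) * dt
        integrand = p_recent(t) * math.exp(-cum_hazard)
        total += 0.5 * (prev_integrand + integrand) * dt
        prev_hazard, prev_integrand = h, integrand
    return total


# Illustrative inputs (assumptions, not values fitted in the manuscript).
TAU_DAYS = 173.0
HORIZON_DAYS = 5000.0


def p_recent(t: float) -> float:
    """Placeholder recency curve with P_0 = 1."""
    return math.exp(-t / TAU_DAYS)


def no_censoring(t: float) -> float:
    return 0.0


def front_loaded_hazard(t: float) -> float:
    """Hypothetical hazard: elevated for the first 90 days, lower afterwards."""
    return 1e-3 if t < 90.0 else 1e-4


omega = effective_mdri(p_recent, no_censoring, HORIZON_DAYS)
omega_star = effective_mdri(p_recent, front_loaded_hazard, HORIZON_DAYS)
print(f"Omega*/Omega = {omega_star / omega:.3f}")
```

Passing a zero hazard recovers the uncensored Ω, so the ratio of the two calls isolates the survival-bias term for whatever hazard shape is supplied.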

Under a piecewise-constant hazard, taking γ constant over the recency window, and with the standard exponential approximation P_R(t) ≈ P_0·exp(−t/τ), where τ ≈ 173 days for the Sedia LAg-EIA at the conventional ODn < 1.5 threshold, the integral collapses for T ≫ τ to a closed-form correction:

$$\Omega^{*}(\gamma) \approx \frac{\Omega}{1 + \gamma\tau}$$
For γ ∈ [10⁻⁴, 10⁻³] per day and τ = 173 days: Ω*/Ω spans 0.85 to 0.98, a deflation of 2% to 15%
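The closed form can be checked by carrying out the integral directly. The derivation below is a sketch that assumes a constant hazard γ over [0, T] and the exponential recency curve stated above. It also shows the finite-T factor that the approximation drops.

```latex
\begin{align*}
\Omega^{*}(\gamma)
  &= \int_{0}^{T} P_0\, e^{-t/\tau}\, e^{-\gamma t}\, dt
   = \frac{P_0\,\tau}{1+\gamma\tau}
     \left(1 - e^{-(1+\gamma\tau)\,T/\tau}\right), \\
\Omega
  &= \Omega^{*}(0)
   = P_0\,\tau \left(1 - e^{-T/\tau}\right), \\
\frac{\Omega^{*}(\gamma)}{\Omega}
  &= \frac{1}{1+\gamma\tau}\cdot
     \frac{1 - e^{-(1+\gamma\tau)\,T/\tau}}{1 - e^{-T/\tau}}
   \;\xrightarrow{\;T \gg \tau\;}\; \frac{1}{1+\gamma\tau}.
\end{align*}
```

At the endpoints of the stated range, γτ = 0.0173 gives 1/(1 + γτ) ≈ 0.983 and γτ = 0.173 gives ≈ 0.853, matching the 0.85 to 0.98 span.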

Substituting Ω*(γ) into the Gao 2021 cross-sectional incidence estimator yields a reported background incidence that is deflated by the factor Ω*/Ω. The intervention arm is also subject to attenuation, through a different mechanism: infections during longitudinal follow-up are detected only if the participant remains observable through the next scheduled HIV test. With ρ_screen denoting the deflation factor on background incidence at screening and ρ_int the corresponding detection factor in the intervention arm, the reported IRR relates to the true IRR via a joint bias factor:

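$$\mathrm{IRR}_{\text{reported}} = \mathrm{IRR}_{\text{true}} \cdot \frac{\rho_{\text{int}}}{\rho_{\text{screen}}} \equiv \mathrm{IRR}_{\text{true}} \cdot B_{\mathrm{IRR}}(\gamma, r)$$
B_IRR < 1 whenever ρ_int < ρ_screen: interventions appear artificially superior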
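The relation can be made concrete with a short sketch. The function names and all numeric inputs below are hypothetical, chosen only to show the direction of the effect. The manuscript's functional form for ρ_int in terms of γ and r is not reproduced here.

```python
def irr_bias_factor(rho_int: float, rho_screen: float) -> float:
    """B_IRR = rho_int / rho_screen, as in the relation above."""
    for name, value in (("rho_int", rho_int), ("rho_screen", rho_screen)):
        if not 0.0 < value <= 1.0:
            raise ValueError(f"{name} must lie in (0, 1], got {value}")
    return rho_int / rho_screen


def reported_irr(true_irr: float, rho_int: float, rho_screen: float) -> float:
    """IRR that a trial would report given the two detection factors."""
    return true_irr * irr_bias_factor(rho_int, rho_screen)


def debiased_irr(observed_irr: float, rho_int: float, rho_screen: float) -> float:
    """Invert the relation to recover the true IRR from a reported one."""
    return observed_irr / irr_bias_factor(rho_int, rho_screen)


# Hypothetical inputs for illustration only.
RHO_SCREEN = 0.98  # screening-side deflation factor
RHO_INT = 0.95     # intervention-arm detection factor
TRUE_IRR = 0.10

bias = irr_bias_factor(RHO_INT, RHO_SCREEN)
observed = reported_irr(TRUE_IRR, RHO_INT, RHO_SCREEN)
print(f"B_IRR = {bias:.4f}, reported IRR = {observed:.4f}")
```

Because ρ_int < ρ_screen in this example, B_IRR falls below 1 and the reported IRR is smaller than the true one, which is the direction described above.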
03 — The Trial-Design Lock

The 90-Day Eligibility Criterion Is the Selection Mechanism

Phase 3 cross-sectional incidence cohorts in contemporary PrEP trials routinely exclude individuals with prior HIV testing within a specified preceding interval. PURPOSE 1 and PURPOSE 2 use identical language: HIV-1 status unknown at screening and no prior HIV-1 testing within the last 3 months. The conventional justification is assay-calibration integrity — recency-biomarker interpretation may be disrupted by recent immune responses or seroconversion uncertainty.

The structural consequence is different. The Incidence Phase cohort is explicitly sampled on testing engagement: it is constructed to exclude anyone tested for HIV within the preceding 90 days. This selection is not incidental. It is a deliberate feature of trial design, and it operates precisely on the behavioral axis that correlates most strongly with the competing-risk hazard γ.

Eligibility selection

Excludes frequent testers; retains the low-γ tail of the at-risk population

Bimodal testing-behavior partition · 90-day inclusion threshold · Selection amplification factor φ ≈ 1.5× (theoretical lower bound, Supplement §S6)

Realized cohort composition

Departs further toward low-γ tail than the design-phase protocol predicts

Empirical PURPOSE 2 enrollment shift · Bimodal-partition lower bound conservative · True bias magnitudes in deployment populations exceed reported values

Calibration-to-deployment mismatch

Validation-cohort Ω applied to a cohort experiencing γ > 0

Bias is structurally guaranteed, not merely permitted · Direction is systematic under retention schemes where higher-γ cohorts also experience lower retention — the empirically documented pattern in real-world LAI-PrEP deployment
04 — Geographic Findings

34 MSAs, AIDSVu 2023, Late-Diagnosis as Empirical Proxy

We applied the framework to 34 high-burden US metropolitan statistical areas using publicly available AIDSVu 2023 surveillance data, with late-diagnosis percentage as the empirical proxy for testing-avoidance hazard. Site-level γ ranged across two orders of magnitude. Three severity tiers emerged in the empirical distribution, with the Kassanjee correction factor scaling monotonically with late-diagnosis percentage.

Higher-bias cluster (Hartford, New Haven, San Juan, Bridgeport): net bias 3.1%
Medium-bias cluster (Atlanta, Houston, New Orleans, Charleston): net bias 0.8%
Lower-bias cluster (Atlanta, Kansas City, Milwaukee, El Paso): net bias 0.1%

The trial-site overlay matters. Hartford and New Haven are PURPOSE 2 sites; Atlanta and Milwaukee are PURPOSE 2 sites with substantially lower γ; Boston (PrEP4U / MGH) sits at the low extreme. PURPOSE 4 sites — the PWID-focused arm — cluster in the high-γ tail. The bias is therefore not uniform across the trial footprint: it is concentrated precisely in the populations whose protection from HIV the long-acting injectable program is meant to demonstrate.

B_IRR ∈ [0.969, 0.999] across the 34-MSA empirical range
95% CI on B_IRR at empirical midpoint (Supp §S1.4): ±6%
Point estimate at representative midpoint: 0.992, robustly distinguishable from unity

05 — Implications

The Estimator's Well-Posed Regime Has an Operational Envelope

The Kassanjee/Gao cross-sectional incidence estimator produces systematically biased point estimates when applied to populations with elevated structural censoring, and the bias is structurally guaranteed — rather than merely possible — when trial-design eligibility criteria select the Incidence Phase cohort on the same testing-engagement axis that drives the competing-risk hazard. The bias is bounded within the empirical AIDSVu range and reverts directionally outside it, providing an operational-envelope characterization of the estimator's well-posed regime.

Correction does not require new data infrastructure or proprietary trial-level information. It requires explicit modeling of population-specific hazard using public surveillance data — late-diagnosis percentage, AIDSVu MSA tables, or analogous structural-hazard proxies. The framework presented here is one such correction, fully reproducible from public data, and generalizes to any RITA-based trial with analogous eligibility structure.
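As a rough illustration of such a correction, the sketch below rescales a reported background incidence by the inverse of the closed-form factor Ω*/Ω = 1/(1 + γτ). It is not the manuscript's pipeline. The site labels, the γ values, and the reported incidence are hypothetical, the mapping from late-diagnosis percentage to γ is not reproduced, and the false-recent rate is ignored for simplicity.

```python
TAU_DAYS = 173.0  # decay constant quoted in the text for the Sedia LAg-EIA


def mdri_ratio(gamma_per_day: float, tau_days: float = TAU_DAYS) -> float:
    """Closed-form Omega*/Omega = 1 / (1 + gamma * tau)."""
    if gamma_per_day < 0.0:
        raise ValueError("gamma_per_day must be non-negative")
    return 1.0 / (1.0 + gamma_per_day * tau_days)


def corrected_background_incidence(
    reported_incidence: float, gamma_per_day: float
) -> float:
    """Undo the deflation: reported = true * Omega*/Omega."""
    return reported_incidence / mdri_ratio(gamma_per_day)


# Hypothetical site-level hazards (per day) and reported incidence.
site_gamma = {"site_A": 1e-4, "site_B": 5e-4, "site_C": 1e-3}
REPORTED_PER_100PY = 2.5

corrected = {
    site: corrected_background_incidence(REPORTED_PER_100PY, gamma)
    for site, gamma in site_gamma.items()
}
for site, value in corrected.items():
    print(f"{site}: reported {REPORTED_PER_100PY:.2f}, corrected {value:.2f} per 100 PY")
```

The correction grows with γ, so sites with higher structural hazard receive a larger upward adjustment to their background incidence.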

Variance propagation gives ~16% added relative uncertainty on λ̂₀ from γ-estimation uncertainty (σ_γ/γ = 0.3) at typical cross-sectional sample sizes. This is non-trivial but typically not the dominant uncertainty term, which is driven by N_rec.

Sensitivity scenarios in Supplement §S2 demonstrate that the directional claim holds robustly across the empirical AIDSVu range, with a sign flip occurring only at retention values (r > 0.93) outside the realized PURPOSE 2 trial range.

Resources

Explore the Research


Figure 1 — Phase 1c-v2: Site-Level Kassanjee Survival-Bias Correction


(A) Site-level γ across 34 AIDSVu MSAs, overlaid with LEN-program trial footprint and Meyer/Kamitani severity parameterization. (B) Incidence deflation as a function of γ, with severity-scenario anchors. (C) Abstracted severity-sensitivity panel with no city attribution.

Figure S1 — Supplementary detail: severity-tier projection


Figure S2 — Supplementary detail: sensitivity panel

Manuscript Citation

Demidont AC. Calibration-to-Deployment Mismatch in HIV Prevention Trials: How Structural Censoring Biases Counterfactual Incidence Estimates. Manuscript in revision, April 25, 2026. ORCID: 0000-0002-9216-8569.

BibTeX

@unpublished{Demidont2026cdm,
  author = {Demidont, A.C.},
  title = {Calibration-to-Deployment Mismatch in HIV Prevention Trials: How Structural Censoring Biases Counterfactual Incidence Estimates},
  note = {Manuscript in revision},
  year = {2026},
  month = {April},
  institution = {Nyx Dynamics LLC}
}