Skipped

Papers that paperiswrong deliberately did not reproduce. Refusal transparency completes the self-correction quadrant: /validator shows the C1/C2 gates that block unjustified WRONG verdicts; /anchors catches runtime drift; /legal/retractions enumerates historical false positives; and this page lists the papers on which we explicitly did not claim a verdict.

Live counts

  • Skipped (total)
    3
  • Not attempted
    3
  • Out of budget
    0

Skipped papers

Why these were skipped

  • Gated weights. Models behind a HuggingFace license gate (Llama 2, Gemma, etc.) cannot be loaded inside the hermetic Modal sandbox because the platform contract forbids platform-wide HF_TOKEN secrets (PRD §18.X.1).
  • Methodological mismatch. Papers whose reported headline depends on a measurement protocol that the v0.1 platform cannot reproduce honestly (custom fine-tuned checkpoint not released, evaluation on internal benchmarks, etc.) get a not_attempted stub rather than a misleading PARTIAL.
  • Out of budget. The v0.1 automated reproduction budget caps Modal walltime and GPU spend per paper. Models that exceed that cap are queued out, and the row carries an out_of_budget status. A principal can manually unblock a paper by re-running it with an elevated budget; when that re-run lands, it will replace the row.
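
The three skip reasons above can be read as a small decision rule. The Python sketch below is illustrative only and is not the platform's code: PaperPlan, skip_status, live_counts, and both cap values are hypothetical names and numbers introduced here. It also assumes that gated-weight papers are recorded as not_attempted, since this page names only the not_attempted and out_of_budget statuses.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical caps: the real v0.1 budget figures are not stated on this page.
MAX_WALLTIME_S = 3600.0
MAX_GPU_USD = 10.0


@dataclass
class PaperPlan:
    """Hypothetical description of what reproducing one paper would require."""
    needs_gated_weights: bool    # model sits behind a license gate
    protocol_reproducible: bool  # headline protocol can be rerun honestly
    walltime_s: float            # estimated or measured Modal walltime
    gpu_usd: float               # estimated or measured GPU spend


def skip_status(plan: PaperPlan) -> Optional[str]:
    """Return a skip status, or None if the paper should be attempted."""
    # Assumption: gated weights and methodological mismatch share one status.
    if plan.needs_gated_weights or not plan.protocol_reproducible:
        return "not_attempted"
    # Exceeding either cap marks the row as out of budget.
    if plan.walltime_s > MAX_WALLTIME_S or plan.gpu_usd > MAX_GPU_USD:
        return "out_of_budget"
    return None


def live_counts(statuses: Iterable[Optional[str]]) -> dict:
    """Tally skipped rows the way the Live counts list presents them."""
    counts = Counter(s for s in statuses if s is not None)
    return {
        "skipped_total": counts["not_attempted"] + counts["out_of_budget"],
        "not_attempted": counts["not_attempted"],
        "out_of_budget": counts["out_of_budget"],
    }
```

Under these assumptions, the total is always the sum of the two per-status counts, which matches the figures shown under Live counts. A manual re-run with an elevated budget would correspond to calling the same rule with higher caps and replacing the row with whatever the re-run produces.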