Drivers

Every reproduction driver paperiswrong has wired. One row per scripts/run-reproduction-*.ts file, with the paper, the Hugging Face checkpoint, the PROTOCOL_MATCH tier (exact / proxy / unknown) the driver declared, and the status (active, retracted, closed-weights, scaffolding). This view is driver-centric: it complements /api/v1/papers (paper-centric) and /api/v1/verdicts (event-centric).

Live counts

  • Total: 72
  • Active: 65
  • Retracted: 4
  • Other: 3
Protocol-match histogram (drivers): exact 6 · proxy 44 · unknown 15 · unspecified 7
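
As a rough illustration of how counts like these could be tallied from catalogue rows, here is a minimal TypeScript sketch. The `DriverRow` shape, its field names, and `summarizeDrivers` are assumptions made for this example; they are not the platform's actual schema or code. Rows with no declared tier are counted as "unspecified", and closed-weights and scaffolding rows are folded into "other", which is consistent with the figures above.

```typescript
// Hypothetical row shape: field names are illustrative, not the real API schema.
type Tier = "exact" | "proxy" | "unknown";
type Status = "active" | "retracted" | "closed-weights" | "scaffolding";

interface DriverRow {
  driver: string;
  protocolMatch?: Tier; // left out when the driver declared no tier
  status: Status;
}

interface LiveCounts {
  total: number;
  active: number;
  retracted: number;
  other: number; // closed-weights + scaffolding
  histogram: Record<Tier | "unspecified", number>;
}

function summarizeDrivers(rows: DriverRow[]): LiveCounts {
  const counts: LiveCounts = {
    total: rows.length,
    active: 0,
    retracted: 0,
    other: 0,
    histogram: { exact: 0, proxy: 0, unknown: 0, unspecified: 0 },
  };
  for (const row of rows) {
    if (row.status === "active") counts.active += 1;
    else if (row.status === "retracted") counts.retracted += 1;
    else counts.other += 1;
    // A missing tier lands in the "unspecified" bucket.
    counts.histogram[row.protocolMatch ?? "unspecified"] += 1;
  }
  return counts;
}

// Four rows from the catalogue, re-typed by hand into the assumed shape.
const sampleRows: DriverRow[] = [
  { driver: "bert", protocolMatch: "exact", status: "active" },
  { driver: "albert", protocolMatch: "proxy", status: "active" },
  { driver: "codebert", status: "retracted" },
  { driver: "palm2", status: "closed-weights" },
];

console.log(summarizeDrivers(sampleRows));
```

Run over all 72 rows, a tally of this kind should reproduce the totals and the histogram shown above, provided the assumed bucketing matches what the page does.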

Driver counts can diverge from verdict counts. The numbers above are driver-side: they answer “what protocol tier has paperiswrong wired?”. The matching verdict-side counts at /api/v1/validator answer “how many measured verdicts currently land in each tier?”. The gap is an honest signal: either a driver is wired but has not produced a verdict yet, or it produced one under a different tier (e.g. MobileNetV3 declares exact on the ImageNet branch but falls back to proxy on Imagenette in the CI-runnable path).
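
The comparison described here can be sketched as a per-tier subtraction. This is only an illustration: the response shape of /api/v1/validator is not documented on this page, so `TierCounts` and `tierGap` are hypothetical, and the verdict-side numbers are invented placeholders rather than real data.

```typescript
type TierName = "exact" | "proxy" | "unknown";
type TierCounts = Record<TierName, number>;

// Driver-side minus verdict-side, per tier.
// Positive: more drivers declare the tier than verdicts currently land in it.
// Negative: verdicts landed here from drivers that declared another tier.
function tierGap(driverSide: TierCounts, verdictSide: TierCounts): TierCounts {
  return {
    exact: driverSide.exact - verdictSide.exact,
    proxy: driverSide.proxy - verdictSide.proxy,
    unknown: driverSide.unknown - verdictSide.unknown,
  };
}

// Driver-side figures are the histogram on this page.
const driverSideCounts: TierCounts = { exact: 6, proxy: 44, unknown: 15 };
// Verdict-side figures are INVENTED for the example, not real API output.
const verdictSideCounts: TierCounts = { exact: 5, proxy: 45, unknown: 12 };

console.log(tierGap(driverSideCounts, verdictSideCounts));
```

With these placeholder inputs, the exact tier shows one more driver than verdicts and the proxy tier shows one more verdict than drivers, which is the pattern the MobileNetV3 fallback would produce.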

Catalogue

| Driver | Paper | Model | Protocol | Status | Agent version | Source |
| --- | --- | --- | --- | --- | --- | --- |
| albert | ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | textattack/albert-base-v2-MRPC | proxy | ACTIVE | v0.1.0-albert-mrpc-microslice | source ↗ |
| align | Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | kakaobrain/align-base | proxy | ACTIVE | v0.1.0b-align-imagenette-3slice100 | source ↗ |
| bart | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | facebook/bart-large-cnn | exact | ACTIVE | v0.1.0-bart-cnndm-200slice | source ↗ |
| bert | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | textattack/bert-base-uncased-SST-2 | exact | ACTIVE | v0.1.0-bert-sst2-3slice100 | source ↗ |
| bigbird | Big Bird: Transformers for Longer Sequences | google/bigbird-roberta-base | unknown | ACTIVE | v0.1.0-bigbird-wikitext2-3slice6 | source ↗ |
| bitnet | BitNet b1.58 2B4T Technical Report | microsoft/bitnet-b1.58-2B-4T | proxy | ACTIVE | v0.1.0-bitnet-winogrande-microslice | source ↗ |
| blip2 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Salesforce/blip2-opt-2.7b | proxy | ACTIVE | v0.2.0-blip2-flickr30k-beam5-n100 | source ↗ |
| bloom | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | bigscience/bloom-560m | proxy | ACTIVE | v0.1.0-bloom-lambada-microslice | source ↗ |
| clip | Learning Transferable Visual Models From Natural Language Supervision | openai/clip-vit-base-patch16 | proxy | ACTIVE | v0.1.0-clip-cifar10-3slice100 | source ↗ |
| codebert | CodeBERT: A Pre-Trained Model for Programming and Natural Languages | microsoft/codebert-base | | RETRACTED | v0.1.0-codebert-csn-python-mrr-3slice | source ↗ |
| codellama | Code Llama: Open Foundation Models for Code | codellama/CodeLlama-7b-Python-hf | unknown | ACTIVE | v0.1.0-codellama-pythonppl-microslice | source ↗ |
| convnext | A ConvNet for the 2020s | facebook/convnext-tiny-224 | proxy | ACTIVE | v0.1.0-convnext-imagenet-microslice | source ↗ |
| deberta-v2 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | microsoft/deberta-large-mnli | exact | ACTIVE | v0.1.0-deberta-v2-mnli-microslice | source ↗ |
| deberta | DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing | MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli | proxy | ACTIVE | v0.1.0-deberta-mnli-microslice | source ↗ |
| deepseek-coder | DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | deepseek-ai/deepseek-coder-1.3b-base | unknown | ACTIVE | v0.1.0-deepseek-coder-pythonppl-microslice | source ↗ |
| deepseek-r1 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | unknown | ACTIVE | v0.1.0-deepseek-r1-winogrande-microslice | source ↗ |
| dino | Emerging Properties in Self-Supervised Vision Transformers | facebook/dino-vitb16 | proxy | ACTIVE | v0.1.1-dino-imagenette-knn | source ↗ |
| dinov2 | DINOv2: Learning Robust Visual Features without Supervision | facebook/dinov2-base | proxy | ACTIVE | v0.1.1-dinov2-imagenette-knn | source ↗ |
| distilbart | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | sshleifer/distilbart-cnn-12-6 | unknown | ACTIVE | v0.1.0-distilbart-cnndm-200slice | source ↗ |
| distilbert | DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | distilbert-base-uncased-finetuned-sst-2-english | proxy | ACTIVE | v0.1.0-distilbert-sst2-microslice | source ↗ |
| efficientnet | EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks | google/efficientnet-b0 | proxy | ACTIVE | v0.1.0-efficientnet-microslice | source ↗ |
| electra | ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | howey/electra-base-mnli | proxy | ACTIVE | v0.1.0-electra-mnli-microslice | source ↗ |
| falcon | The Falcon Series of Open Language Models | tiiuae/falcon-7b | proxy | ACTIVE | v0.1.0-falcon-hellaswag-microslice | source ↗ |
| flan-t5 | Scaling Instruction-Finetuned Language Models | google/flan-t5-large | proxy | ACTIVE | v0.1.0-flan-t5-mmlu-microslice | source ↗ |
| gemma | Gemma: Open Models Based on Gemini Research and Technology | unsloth/gemma-2b | proxy | ACTIVE | v0.1.0-gemma-hellaswag-microslice | source ↗ |
| internlm2 | InternLM2 Technical Report | internlm/internlm2-chat-1_8b | unknown | ACTIVE | v0.1.0-internlm2-winogrande-microslice | source ↗ |
| llama2 | Llama 2: Open Foundation and Fine-Tuned Chat Models | meta-llama/Llama-2-7b-hf | proxy | ACTIVE | v0.1.0-llama2-hellaswag-microslice | source ↗ |
| lora | LoRA: Low-Rank Adaptation of Large Language Models | FacebookAI/roberta-base | proxy | ACTIVE | v0.1.0-lora-mrpc-microslice | source ↗ |
| mamba | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | state-spaces/mamba-130m-hf | proxy | ACTIVE | v0.1.0-mamba-wikitext2-3slice8 | source ↗ |
| minicpm | MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies | openbmb/MiniCPM-2B-sft-bf16 | proxy | ACTIVE | v0.1.0-minicpm-mmlu5shot-microslice | source ↗ |
| mistral | Mistral 7B | mistralai/Mistral-7B-v0.1 | proxy | ACTIVE | v0.1.0-mistral-hellaswag-microslice | source ↗ |
| mobilebert | MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices | typeform/mobilebert-uncased-mnli | proxy | ACTIVE | v0.1.0-mobilebert-mnli-microslice | source ↗ |
| mobilenet-v3 | Searching for MobileNetV3 | timm/tf_mobilenetv3_large_100.in1k | exact | ACTIVE | v0.1.0-mobilenet-v3-large-microslice | source ↗ |
| olmo | OLMo: Accelerating the Science of Language Models | allenai/OLMo-1B-hf | | RETRACTED | v0.1.1-olmo-not-attempted-stub | source ↗ |
| olmo2 | 2 OLMo 2 Furious | allenai/OLMo-2-1124-7B-Instruct | unknown | ACTIVE | v0.1.0-olmo2-winogrande-microslice | source ↗ |
| olmoe | OLMoE: Open Mixture-of-Experts Language Models | allenai/OLMoE-1B-7B-0125-Instruct | unknown | ACTIVE | v0.1.0-olmoe-winogrande-microslice | source ↗ |
| openelm | OpenELM: An Efficient Language Model Family with Open Training and Inference Framework | apple/OpenELM-270M | proxy | ACTIVE | v0.1.0-openelm-winogrande-microslice | source ↗ |
| opt | OPT: Open Pre-trained Transformer Language Models | facebook/opt-1.3b | proxy | ACTIVE | v0.1.0-opt-lambada-microslice | source ↗ |
| palm2 | PaLM 2 Technical Report | | | CLOSED WEIGHTS | v0.1.0-palm2-not-attempted | source ↗ |
| phi | Textbooks Are All You Need II: phi-1.5 technical report | microsoft/phi-1_5 | proxy | ACTIVE | v0.1.0-phi-winogrande-microslice | source ↗ |
| phi1 | Textbooks Are All You Need | microsoft/phi-1 | | RETRACTED | v0.1.0-phi1-piqa-microslice | source ↗ |
| phi3 | Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone | microsoft/Phi-3-mini-4k-instruct | unknown | ACTIVE | v0.1.0-phi3-winogrande-microslice | source ↗ |
| phi4-mini | Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs | microsoft/Phi-4-mini-instruct | proxy | ACTIVE | v0.1.0-phi4-mini-mmlu5shot-microslice | source ↗ |
| pythia-14 | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | EleutherAI/pythia-1.4b | proxy | ACTIVE | v0.1.0-pythia14-lambada-microslice | source ↗ |
| pythia | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | EleutherAI/pythia-410m | proxy | ACTIVE | v0.1.0-pythia-lambada-microslice | source ↗ |
| qwen2 | Qwen2 Technical Report | Qwen/Qwen2-0.5B | unknown | ACTIVE | v0.1.0-qwen2-lambada-microslice | source ↗ |
| qwen25 | Qwen2.5 Technical Report | Qwen/Qwen2.5-0.5B-Instruct | unknown | ACTIVE | v0.1.0-qwen25-winogrande-microslice | source ↗ |
| qwen3-14b | Qwen3 Technical Report | Qwen/Qwen3-14B-Base | proxy | ACTIVE | v0.1.0-qwen3-14b-mmlu5shot-microslice | source ↗ |
| qwen3-17b | Qwen3 Technical Report | Qwen/Qwen3-1.7B-Base | proxy | ACTIVE | v0.1.0-qwen3-17b-mmlu5shot-microslice | source ↗ |
| qwen3-30b-a3b | Qwen3 Technical Report | Qwen/Qwen3-30B-A3B-Base | proxy | ACTIVE | v0.1.0-qwen3-30b-a3b-mmlu5shot-microslice | source ↗ |
| qwen3-32b | Qwen3 Technical Report | Qwen/Qwen3-32B | proxy | ACTIVE | v0.1.0-qwen3-32b-mmlu5shot-microslice | source ↗ |
| qwen3-4b | Qwen3 Technical Report | Qwen/Qwen3-4B-Base | proxy | ACTIVE | v0.1.0-qwen3-4b-mmlu5shot-microslice | source ↗ |
| qwen3-8b | Qwen3 Technical Report | Qwen/Qwen3-8B-Base | proxy | ACTIVE | v0.1.0-qwen3-8b-mmlu5shot-microslice | source ↗ |
| qwen3 | Qwen3 Technical Report | Qwen/Qwen3-0.6B-Base | proxy | ACTIVE | v0.1.0-qwen3-mmlu5shot-microslice | source ↗ |
| resnet | Deep Residual Learning for Image Recognition | microsoft/resnet-50 | proxy | ACTIVE | v0.1.0-resnet-microslice | source ↗ |
| roberta | RoBERTa: A Robustly Optimized BERT Pretraining Approach | FacebookAI/roberta-large-mnli | proxy | ACTIVE | v0.1.0-roberta-mnli-microslice | source ↗ |
| rwkv4 | RWKV: Reinventing RNNs for the Transformer Era | RWKV/rwkv-4-1b5-pile | exact | ACTIVE | v0.1.0-rwkv4-winogrande-microslice | source ↗ |
| sbert | Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks | sentence-transformers/bert-base-nli-stsb-mean-tokens | proxy | ACTIVE | v0.2.0-sbert-stsb-test-3slice-table2 | source ↗ |
| sd15 | High-Resolution Image Synthesis with Latent Diffusion Models | stabilityai/stable-diffusion-v1-5 | | SCAFFOLDING | v0.1.0-sd15-step-count-microslice | source ↗ |
| sdxl | SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | stabilityai/stable-diffusion-xl-base-1.0 | | SCAFFOLDING | v0.1.0-sdxl-clip-score-microslice | source ↗ |
| smollm2 | SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model | HuggingFaceTB/SmolLM2-1.7B-Instruct | unknown | ACTIVE | v0.1.0-smollm2-winogrande-microslice | source ↗ |
| stablelm2 | Stable LM 2 1.6B Technical Report | stabilityai/stablelm-2-1_6b-chat | unknown | ACTIVE | v0.1.0-stablelm2-winogrande-microslice | source ↗ |
| starcoder | StarCoder: may the source be with you! | bigcode/starcoderbase-1b | unknown | ACTIVE | v0.1.0-starcoder-pythonppl-microslice | source ↗ |
| swinv2 | Swin Transformer V2: Scaling Up Capacity and Resolution | microsoft/swinv2-tiny-patch4-window8-256 | proxy | ACTIVE | v0.1.0-swinv2-imagenet-microslice | source ↗ |
| t5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | valhalla/t5-small-glue-mnli | proxy | ACTIVE | v0.1.0-t5-mnli-microslice | source ↗ |
| tinyllama | TinyLlama: An Open-Source Small Language Model | TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T | proxy | ACTIVE | v0.1.0-tinyllama-hellaswag-microslice | source ↗ |
| vit | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | aaraki/vit-base-patch16-224-in21k-finetuned-cifar10 | proxy | ACTIVE | v0.1.0-vit-cifar10-3slice100 | source ↗ |
| wav2vec2 | wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations | facebook/wav2vec2-base-960h | exact | ACTIVE | v0.1.0-wav2vec2-librispeech-3slice16 | source ↗ |
| whisper | Robust Speech Recognition via Large-Scale Weak Supervision | openai/whisper-tiny.en | proxy | ACTIVE | v0.1.0-whisper-librispeech-3slice16 | source ↗ |
| xlm-r | Unsupervised Cross-lingual Representation Learning at Scale | joeddav/xlm-roberta-large-xnli | | RETRACTED | v0.1.0-xlm-r-xnli-microslice | source ↗ |
| xlnet | XLNet: Generalized Autoregressive Pretraining for Language Understanding | textattack/xlnet-base-cased-MNLI | proxy | ACTIVE | v0.1.0-xlnet-mnli-microslice | source ↗ |
| yi | Yi: Open Foundation Models by 01.AI | 01-ai/Yi-6B | unknown | ACTIVE | v0.1.0-yi-lambada-microslice | source ↗ |

Why a driver list (and not just /papers)

The driver list and the paper list can diverge. A driver counts as wired the moment its file lands at a path matching scripts/run-reproduction-*.ts, which is before any reproduction job has fired. Newly wired drivers may not have a verdict yet; retracted drivers are kept as not_attempted stubs so the structural lint (tests/unit/scripts/validator-wiring-lint.test.ts) still passes. Scaffolding drivers (image generation: Stable Diffusion, SDXL) never publish a scalar verdict at all.
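
To make the file-name convention concrete, the sketch below maps a script path to a driver name. It assumes the driver name is simply the part of the file name between `run-reproduction-` and `.ts`; the page does not confirm that rule, and `driverNameFromPath` is a hypothetical helper, not part of the platform's code.

```typescript
// Assumption: a driver's name is the <name> part of
// scripts/run-reproduction-<name>.ts.
const DRIVER_FILE_PATTERN = /^scripts\/run-reproduction-(.+)\.ts$/;

// Returns the driver name, or null when the path is not a driver script.
function driverNameFromPath(path: string): string | null {
  const match = DRIVER_FILE_PATTERN.exec(path);
  return match?.[1] ?? null;
}

const examplePaths = [
  "scripts/run-reproduction-mobilenet-v3.ts",
  "scripts/run-reproduction-qwen3-30b-a3b.ts",
  "tests/unit/scripts/validator-wiring-lint.test.ts",
];

for (const path of examplePaths) {
  console.log(path, "->", driverNameFromPath(path));
}
```

Under that assumption the first two paths yield `mobilenet-v3` and `qwen3-30b-a3b`, and the lint test path yields `null` because it is not a driver script.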

Surfacing all four states (active, retracted, closed-weights, scaffolding) lets a visitor see the platform's full attempt surface, as distinct from its verdict surface. Combined with /skipped (papers in the corpus that we deliberately did not reproduce), this is the platform's honest answer to “what does paperiswrong cover and what doesn't it cover?”.