Live counts
- Total: 72
- Active: 65
- Retracted: 4
- Other: 3
Driver counts can diverge from verdict counts. The numbers above are driver-side: they answer “what protocol tier has paperiswrong wired?”. The matching verdict-side counts at /api/v1/validator answer “how many measured verdicts currently land in each tier?”. The gap is a real signal, not an error: it means a driver is wired but has not produced a verdict yet, or produced one under a different tier (e.g. MobileNetV3 declares exact on the ImageNet branch but falls back to proxy on Imagenette in the CI-runnable path).
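As a rough illustration of the driver-side versus verdict-side comparison described above, the sketch below tallies tiers on each side and subtracts them. The `Tier` type, the function names, and the sample data are all assumptions made for this example; the real response shape of /api/v1/validator is not shown on this page, so no request is made here.

```typescript
// Hypothetical types: not taken from the paperiswrong codebase.
type Tier = "exact" | "proxy" | "unknown";
type TierCounts = Record<Tier, number>;

// Count how many entries fall in each tier.
function tallyTiers(tiers: Tier[]): TierCounts {
  const counts: TierCounts = { exact: 0, proxy: 0, unknown: 0 };
  for (const tier of tiers) {
    counts[tier] += 1;
  }
  return counts;
}

// Driver-side count minus verdict-side count, per tier.
// A non-zero entry means the declared and the measured tiers disagree.
function tierGap(driverSide: TierCounts, verdictSide: TierCounts): TierCounts {
  return {
    exact: driverSide.exact - verdictSide.exact,
    proxy: driverSide.proxy - verdictSide.proxy,
    unknown: driverSide.unknown - verdictSide.unknown,
  };
}

// Illustrative data only: one driver declares exact, but its verdict landed as proxy.
const declaredTiers = tallyTiers(["exact", "proxy", "proxy"]);
const measuredTiers = tallyTiers(["proxy", "proxy", "proxy"]);
console.log(tierGap(declaredTiers, measuredTiers));
// { exact: 1, proxy: -1, unknown: 0 }
```

A positive entry marks a tier that drivers declare more often than verdicts land in it, which is the MobileNetV3-style fallback case.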
Catalogue
| Driver | Paper | Model | Protocol | Status | Agent version | Source |
|---|---|---|---|---|---|---|
| albert | ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | textattack/albert-base-v2-MRPC | proxy | ACTIVE | v0.1.0-albert-mrpc-microslice | source ↗ |
| align | Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | kakaobrain/align-base | proxy | ACTIVE | v0.1.0b-align-imagenette-3slice100 | source ↗ |
| bart | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | facebook/bart-large-cnn | exact | ACTIVE | v0.1.0-bart-cnndm-200slice | source ↗ |
| bert | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | textattack/bert-base-uncased-SST-2 | exact | ACTIVE | v0.1.0-bert-sst2-3slice100 | source ↗ |
| bigbird | Big Bird: Transformers for Longer Sequences | google/bigbird-roberta-base | unknown | ACTIVE | v0.1.0-bigbird-wikitext2-3slice6 | source ↗ |
| bitnet | BitNet b1.58 2B4T Technical Report | microsoft/bitnet-b1.58-2B-4T | proxy | ACTIVE | v0.1.0-bitnet-winogrande-microslice | source ↗ |
| blip2 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Salesforce/blip2-opt-2.7b | proxy | ACTIVE | v0.2.0-blip2-flickr30k-beam5-n100 | source ↗ |
| bloom | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | bigscience/bloom-560m | proxy | ACTIVE | v0.1.0-bloom-lambada-microslice | source ↗ |
| clip | Learning Transferable Visual Models From Natural Language Supervision | openai/clip-vit-base-patch16 | proxy | ACTIVE | v0.1.0-clip-cifar10-3slice100 | source ↗ |
| codebert | CodeBERT: A Pre-Trained Model for Programming and Natural Languages | microsoft/codebert-base | — | RETRACTED | v0.1.0-codebert-csn-python-mrr-3slice | source ↗ |
| codellama | Code Llama: Open Foundation Models for Code | codellama/CodeLlama-7b-Python-hf | unknown | ACTIVE | v0.1.0-codellama-pythonppl-microslice | source ↗ |
| convnext | A ConvNet for the 2020s | facebook/convnext-tiny-224 | proxy | ACTIVE | v0.1.0-convnext-imagenet-microslice | source ↗ |
| deberta-v2 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | microsoft/deberta-large-mnli | exact | ACTIVE | v0.1.0-deberta-v2-mnli-microslice | source ↗ |
| deberta | DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing | MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli | proxy | ACTIVE | v0.1.0-deberta-mnli-microslice | source ↗ |
| deepseek-coder | DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | deepseek-ai/deepseek-coder-1.3b-base | unknown | ACTIVE | v0.1.0-deepseek-coder-pythonppl-microslice | source ↗ |
| deepseek-r1 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | unknown | ACTIVE | v0.1.0-deepseek-r1-winogrande-microslice | source ↗ |
| dino | Emerging Properties in Self-Supervised Vision Transformers | facebook/dino-vitb16 | proxy | ACTIVE | v0.1.1-dino-imagenette-knn | source ↗ |
| dinov2 | DINOv2: Learning Robust Visual Features without Supervision | facebook/dinov2-base | proxy | ACTIVE | v0.1.1-dinov2-imagenette-knn | source ↗ |
| distilbart | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | sshleifer/distilbart-cnn-12-6 | unknown | ACTIVE | v0.1.0-distilbart-cnndm-200slice | source ↗ |
| distilbert | DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | distilbert-base-uncased-finetuned-sst-2-english | proxy | ACTIVE | v0.1.0-distilbert-sst2-microslice | source ↗ |
| efficientnet | EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks | google/efficientnet-b0 | proxy | ACTIVE | v0.1.0-efficientnet-microslice | source ↗ |
| electra | ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | howey/electra-base-mnli | proxy | ACTIVE | v0.1.0-electra-mnli-microslice | source ↗ |
| falcon | The Falcon Series of Open Language Models | tiiuae/falcon-7b | proxy | ACTIVE | v0.1.0-falcon-hellaswag-microslice | source ↗ |
| flan-t5 | Scaling Instruction-Finetuned Language Models | google/flan-t5-large | proxy | ACTIVE | v0.1.0-flan-t5-mmlu-microslice | source ↗ |
| gemma | Gemma: Open Models Based on Gemini Research and Technology | unsloth/gemma-2b | proxy | ACTIVE | v0.1.0-gemma-hellaswag-microslice | source ↗ |
| internlm2 | InternLM2 Technical Report | internlm/internlm2-chat-1_8b | unknown | ACTIVE | v0.1.0-internlm2-winogrande-microslice | source ↗ |
| llama2 | Llama 2: Open Foundation and Fine-Tuned Chat Models | meta-llama/Llama-2-7b-hf | proxy | ACTIVE | v0.1.0-llama2-hellaswag-microslice | source ↗ |
| lora | LoRA: Low-Rank Adaptation of Large Language Models | FacebookAI/roberta-base | proxy | ACTIVE | v0.1.0-lora-mrpc-microslice | source ↗ |
| mamba | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | state-spaces/mamba-130m-hf | proxy | ACTIVE | v0.1.0-mamba-wikitext2-3slice8 | source ↗ |
| minicpm | MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies | openbmb/MiniCPM-2B-sft-bf16 | proxy | ACTIVE | v0.1.0-minicpm-mmlu5shot-microslice | source ↗ |
| mistral | Mistral 7B | mistralai/Mistral-7B-v0.1 | proxy | ACTIVE | v0.1.0-mistral-hellaswag-microslice | source ↗ |
| mobilebert | MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices | typeform/mobilebert-uncased-mnli | proxy | ACTIVE | v0.1.0-mobilebert-mnli-microslice | source ↗ |
| mobilenet-v3 | Searching for MobileNetV3 | timm/tf_mobilenetv3_large_100.in1k | exact | ACTIVE | v0.1.0-mobilenet-v3-large-microslice | source ↗ |
| olmo | OLMo: Accelerating the Science of Language Models | allenai/OLMo-1B-hf | — | RETRACTED | v0.1.1-olmo-not-attempted-stub | source ↗ |
| olmo2 | 2 OLMo 2 Furious | allenai/OLMo-2-1124-7B-Instruct | unknown | ACTIVE | v0.1.0-olmo2-winogrande-microslice | source ↗ |
| olmoe | OLMoE: Open Mixture-of-Experts Language Models | allenai/OLMoE-1B-7B-0125-Instruct | unknown | ACTIVE | v0.1.0-olmoe-winogrande-microslice | source ↗ |
| openelm | OpenELM: An Efficient Language Model Family with Open Training and Inference Framework | apple/OpenELM-270M | proxy | ACTIVE | v0.1.0-openelm-winogrande-microslice | source ↗ |
| opt | OPT: Open Pre-trained Transformer Language Models | facebook/opt-1.3b | proxy | ACTIVE | v0.1.0-opt-lambada-microslice | source ↗ |
| palm2 | PaLM 2 Technical Report | — | — | CLOSED WEIGHTS | v0.1.0-palm2-not-attempted | source ↗ |
| phi | Textbooks Are All You Need II: phi-1.5 technical report | microsoft/phi-1_5 | proxy | ACTIVE | v0.1.0-phi-winogrande-microslice | source ↗ |
| phi1 | Textbooks Are All You Need | microsoft/phi-1 | — | RETRACTED | v0.1.0-phi1-piqa-microslice | source ↗ |
| phi3 | Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone | microsoft/Phi-3-mini-4k-instruct | unknown | ACTIVE | v0.1.0-phi3-winogrande-microslice | source ↗ |
| phi4-mini | Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs | microsoft/Phi-4-mini-instruct | proxy | ACTIVE | v0.1.0-phi4-mini-mmlu5shot-microslice | source ↗ |
| pythia-14 | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | EleutherAI/pythia-1.4b | proxy | ACTIVE | v0.1.0-pythia14-lambada-microslice | source ↗ |
| pythia | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | EleutherAI/pythia-410m | proxy | ACTIVE | v0.1.0-pythia-lambada-microslice | source ↗ |
| qwen2 | Qwen2 Technical Report | Qwen/Qwen2-0.5B | unknown | ACTIVE | v0.1.0-qwen2-lambada-microslice | source ↗ |
| qwen25 | Qwen2.5 Technical Report | Qwen/Qwen2.5-0.5B-Instruct | unknown | ACTIVE | v0.1.0-qwen25-winogrande-microslice | source ↗ |
| qwen3-14b | Qwen3 Technical Report | Qwen/Qwen3-14B-Base | proxy | ACTIVE | v0.1.0-qwen3-14b-mmlu5shot-microslice | source ↗ |
| qwen3-17b | Qwen3 Technical Report | Qwen/Qwen3-1.7B-Base | proxy | ACTIVE | v0.1.0-qwen3-17b-mmlu5shot-microslice | source ↗ |
| qwen3-30b-a3b | Qwen3 Technical Report | Qwen/Qwen3-30B-A3B-Base | proxy | ACTIVE | v0.1.0-qwen3-30b-a3b-mmlu5shot-microslice | source ↗ |
| qwen3-32b | Qwen3 Technical Report | Qwen/Qwen3-32B | proxy | ACTIVE | v0.1.0-qwen3-32b-mmlu5shot-microslice | source ↗ |
| qwen3-4b | Qwen3 Technical Report | Qwen/Qwen3-4B-Base | proxy | ACTIVE | v0.1.0-qwen3-4b-mmlu5shot-microslice | source ↗ |
| qwen3-8b | Qwen3 Technical Report | Qwen/Qwen3-8B-Base | proxy | ACTIVE | v0.1.0-qwen3-8b-mmlu5shot-microslice | source ↗ |
| qwen3 | Qwen3 Technical Report | Qwen/Qwen3-0.6B-Base | proxy | ACTIVE | v0.1.0-qwen3-mmlu5shot-microslice | source ↗ |
| resnet | Deep Residual Learning for Image Recognition | microsoft/resnet-50 | proxy | ACTIVE | v0.1.0-resnet-microslice | source ↗ |
| roberta | RoBERTa: A Robustly Optimized BERT Pretraining Approach | FacebookAI/roberta-large-mnli | proxy | ACTIVE | v0.1.0-roberta-mnli-microslice | source ↗ |
| rwkv4 | RWKV: Reinventing RNNs for the Transformer Era | RWKV/rwkv-4-1b5-pile | exact | ACTIVE | v0.1.0-rwkv4-winogrande-microslice | source ↗ |
| sbert | Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks | sentence-transformers/bert-base-nli-stsb-mean-tokens | proxy | ACTIVE | v0.2.0-sbert-stsb-test-3slice-table2 | source ↗ |
| sd15 | High-Resolution Image Synthesis with Latent Diffusion Models | stabilityai/stable-diffusion-v1-5 | — | SCAFFOLDING | v0.1.0-sd15-step-count-microslice | source ↗ |
| sdxl | SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | stabilityai/stable-diffusion-xl-base-1.0 | — | SCAFFOLDING | v0.1.0-sdxl-clip-score-microslice | source ↗ |
| smollm2 | SmolLM2: When Smol Goes Big — Data-Centric Training of a Small Language Model | HuggingFaceTB/SmolLM2-1.7B-Instruct | unknown | ACTIVE | v0.1.0-smollm2-winogrande-microslice | source ↗ |
| stablelm2 | Stable LM 2 1.6B Technical Report | stabilityai/stablelm-2-1_6b-chat | unknown | ACTIVE | v0.1.0-stablelm2-winogrande-microslice | source ↗ |
| starcoder | StarCoder: may the source be with you! | bigcode/starcoderbase-1b | unknown | ACTIVE | v0.1.0-starcoder-pythonppl-microslice | source ↗ |
| swinv2 | Swin Transformer V2: Scaling Up Capacity and Resolution | microsoft/swinv2-tiny-patch4-window8-256 | proxy | ACTIVE | v0.1.0-swinv2-imagenet-microslice | source ↗ |
| t5 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | valhalla/t5-small-glue-mnli | proxy | ACTIVE | v0.1.0-t5-mnli-microslice | source ↗ |
| tinyllama | TinyLlama: An Open-Source Small Language Model | TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T | proxy | ACTIVE | v0.1.0-tinyllama-hellaswag-microslice | source ↗ |
| vit | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | aaraki/vit-base-patch16-224-in21k-finetuned-cifar10 | proxy | ACTIVE | v0.1.0-vit-cifar10-3slice100 | source ↗ |
| wav2vec2 | wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations | facebook/wav2vec2-base-960h | exact | ACTIVE | v0.1.0-wav2vec2-librispeech-3slice16 | source ↗ |
| whisper | Robust Speech Recognition via Large-Scale Weak Supervision | openai/whisper-tiny.en | proxy | ACTIVE | v0.1.0-whisper-librispeech-3slice16 | source ↗ |
| xlm-r | Unsupervised Cross-lingual Representation Learning at Scale | joeddav/xlm-roberta-large-xnli | — | RETRACTED | v0.1.0-xlm-r-xnli-microslice | source ↗ |
| xlnet | XLNet: Generalized Autoregressive Pretraining for Language Understanding | textattack/xlnet-base-cased-MNLI | proxy | ACTIVE | v0.1.0-xlnet-mnli-microslice | source ↗ |
| yi | Yi: Open Foundation Models by 01.AI | 01-ai/Yi-6B | unknown | ACTIVE | v0.1.0-yi-lambada-microslice | source ↗ |
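The live counts at the top of the page appear to be a tally of the Status column in this table. The sketch below shows one way to recompute them, assuming that “Other” groups every status that is neither ACTIVE nor RETRACTED (here, CLOSED WEIGHTS and SCAFFOLDING). The `Status` and `LiveCounts` names are hypothetical and not taken from the platform's code.

```typescript
// Status values as they appear in the catalogue table.
type Status = "ACTIVE" | "RETRACTED" | "CLOSED WEIGHTS" | "SCAFFOLDING";

// Hypothetical result shape mirroring the four live counts.
interface LiveCounts {
  total: number;
  active: number;
  retracted: number;
  other: number;
}

// Tally statuses; anything that is not ACTIVE or RETRACTED is counted as "other" (assumption).
function liveCounts(statuses: Status[]): LiveCounts {
  const counts: LiveCounts = { total: 0, active: 0, retracted: 0, other: 0 };
  for (const status of statuses) {
    counts.total += 1;
    if (status === "ACTIVE") {
      counts.active += 1;
    } else if (status === "RETRACTED") {
      counts.retracted += 1;
    } else {
      counts.other += 1;
    }
  }
  return counts;
}

// Status mix read off the table above: 65 active, 4 retracted, 1 closed weights, 2 scaffolding.
const tableStatuses: Status[] = [
  ...new Array<Status>(65).fill("ACTIVE"),
  ...new Array<Status>(4).fill("RETRACTED"),
  "CLOSED WEIGHTS",
  "SCAFFOLDING",
  "SCAFFOLDING",
];
console.log(liveCounts(tableStatuses));
// { total: 72, active: 65, retracted: 4, other: 3 }
```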
Why a driver list (and not just /papers)
The driver list and the paper list can diverge. A driver counts as wired the moment its file lands in scripts/run-reproduction-*.ts, which is before any reproduction job has fired. Newly wired drivers may not have a verdict yet; retracted drivers are kept as not_attempted stubs so the structural lint (tests/unit/scripts/validator-wiring-lint.test.ts) still passes. Scaffolding drivers (the image-generation ones: Stable Diffusion, SDXL) never publish a scalar verdict at all.
Surfacing all four states (active, retracted, closed-weights, scaffolding) lets a visitor see the platform's full attempt surface — distinct from its verdict surface. Combined with /skipped (papers in the corpus that we deliberately did not reproduce), this is the platform's honest answer to “what does paperiswrong cover and what doesn't it cover”.
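The wiring rule above (a driver exists once a file matches scripts/run-reproduction-*.ts) could be sketched as a filename match. This is only an illustration: the `wiredDrivers` helper and the example file names are hypothetical, and the actual lint in tests/unit/scripts/validator-wiring-lint.test.ts may work quite differently.

```typescript
// Matches a base name of the form run-reproduction-<driver>.ts and captures <driver>.
const DRIVER_FILE = /^run-reproduction-(.+)\.ts$/;

// Return the sorted driver names implied by a list of file paths.
// Paths whose base name does not match the pattern are ignored.
function wiredDrivers(paths: string[]): string[] {
  const names: string[] = [];
  for (const path of paths) {
    const base = path.split("/").pop() ?? "";
    const match = DRIVER_FILE.exec(base);
    if (match) {
      names.push(match[1]);
    }
  }
  return names.sort();
}

// Hypothetical file names, used only to exercise the pattern.
const examplePaths = [
  "scripts/run-reproduction-olmo.ts",
  "scripts/run-reproduction-bert.ts",
  "scripts/some-other-helper.ts",
];
console.log(wiredDrivers(examplePaths));
// [ "bert", "olmo" ]
```

Under this reading, a retracted stub such as olmo still shows up as wired, which is consistent with the page listing it even though it has no verdict.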