The Wall of Wrong

Every reproduction verdict, in reverse-chronological order: REPRODUCED, PARTIAL, and WRONG. This is the canonical timeline of our reproduction job, updated daily.

“WRONG” is a technical term: it means that a headline numerical claim in the paper did not reproduce in our reproduction job within the published tolerance. It does not mean the paper is intentionally misleading. Authors have a prominent right of reply. See the methodology page.
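As a rough illustration of the tolerance check described above, here is a minimal sketch in Python. It is not the actual reproduction job: the `Claim` type, the absolute-tolerance comparison, and especially the `partial_factor` band used for PARTIAL are assumptions made for this example, since this page does not define how PARTIAL is decided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A headline number from a paper plus the tolerance it is checked against."""
    claimed: float
    tolerance: float  # assumed to be an absolute tolerance (hypothetical)


def verdict(claim: Claim, measured: float, partial_factor: float = 3.0) -> str:
    """Classify a measured value against a claimed one.

    REPRODUCED: the gap is within the tolerance.
    PARTIAL:    the gap is within partial_factor * tolerance
                (this band is an assumption, not the page's definition).
    WRONG:      anything further away.
    """
    gap = abs(measured - claim.claimed)
    if gap <= claim.tolerance:
        return "REPRODUCED"
    if gap <= partial_factor * claim.tolerance:
        return "PARTIAL"
    return "WRONG"


if __name__ == "__main__":
    claim = Claim(claimed=0.90, tolerance=0.01)
    for measured in (0.905, 0.92, 0.80):
        print(measured, verdict(claim, measured))
```

Under these assumptions a verdict depends only on the numerical gap, which matches the point made above: WRONG describes a number that failed to reproduce, not the authors' intent.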

Sibling corpus views: /drivers (every reproduction script), /skipped (papers we deliberately did not reproduce), /validator (C1/C2 gates), /legal/retractions (historical false positives).

50 verdicts
Friday, May 15
REPRODUCED · conf 0.80
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
· NAACL 2019 · cs.CL
0.9467
Evidence: job #b522a4a7-3bf9-4134-a085-11f88c0242f2 · No reply yet
PARTIAL · conf 0.50
Searching for MobileNetV3
· ICCV 2019 · cs.CV
0.9900
Evidence: job #926e1673-437b-447e-9f44-76339bfc50ad · No reply yet
REPRODUCED · conf 0.85
RoBERTa: A Robustly Optimized BERT Pretraining Approach
· arXiv preprint · cs.CL
0.9053
Evidence: job #074d8c16-4f0f-46e3-a29a-41b99d271943 · No reply yet
REPRODUCED · conf 0.80
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
· NeurIPS 2019 EMC^2 Workshop · cs.CL
0.9150
Evidence: job #48d12fe7-4255-474f-b6de-b5b7e15ad297 · No reply yet
REPRODUCED · conf 0.80
Mistral 7B
· arXiv 2023 · cs.CL
0.7980
Evidence: job #1443722c-e65d-4154-a3ac-03f76b55df10 · No reply yet
REPRODUCED · conf 0.65
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
· COLM 2024 · cs.LG
34.8765
Evidence: job #014fa02f-a714-48b5-a8da-45137749dd8b · No reply yet
PARTIAL · conf 0.55
Stable LM 2 1.6B Technical Report
· arXiv 2024 · cs.CL
0.6064
Evidence: job #0025dbdf-99ba-4d60-ab41-cafb511594a6 · No reply yet
REPRODUCED · conf 0.80
Yi: Open Foundation Models by 01.AI
· arXiv 2024 · cs.CL
0.7077
Evidence: job #55abc116-6e50-473a-85e8-76b3b69e46f1 · No reply yet
PARTIAL · conf 0.55
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
· COLM 2024 · cs.CL
0.2923
Evidence: job #6bce556a-498b-48be-a9e9-49195f54e1d5 · No reply yet
PARTIAL · conf 0.55
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
· arXiv 2024 · cs.CL
0.6948
Evidence: job #58aae606-be57-47bf-a5a7-039fcbee3357 · No reply yet
PARTIAL · conf 0.55
OLMoE: Open Mixture-of-Experts Language Models
· arXiv 2024 · cs.CL
0.6466
Evidence: job #e1b8d2df-e39e-4104-92f1-382cf768c8e6 · No reply yet
PARTIAL · conf 0.55
Qwen2.5 Technical Report
· arXiv 2024 · cs.CL
0.5562
Evidence: job #58824325-3a3e-47f2-8802-305bf66ed9d6 · No reply yet
PARTIAL · conf 0.55
2 OLMo 2 Furious
· arXiv 2024 · cs.CL
0.6627
Evidence: job #60f64123-f4a9-4c58-af04-49f09be8a958 · No reply yet
PARTIAL · conf 0.55
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
· arXiv 2025 · cs.CL
0.5482
Evidence: job #8466b2f1-25b6-44b9-9f36-7fb4df494670 · No reply yet
PARTIAL · conf 0.55
SmolLM2: When Smol Goes Big — Data-Centric Training of a Small Language Model
· arXiv 2025 · cs.CL
0.6345
Evidence: job #372e6c77-520e-4b45-872a-117914aea7c5 · No reply yet
Thursday, May 14
PARTIAL · conf 0.45
Deep Residual Learning for Image Recognition
· CVPR 2016 · cs.CV
0.4200
Evidence: job #665acc4b-36c0-4983-96ad-062abcf151b6 · No reply yet
PARTIAL · conf 0.50
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
· ICML 2019 · cs.LG
0.9859
Evidence: job #cb39b59c-7830-4638-bf46-70351de50eba · No reply yet
REPRODUCED · conf 0.85
XLNet: Generalized Autoregressive Pretraining for Language Understanding
· NeurIPS 2019 · cs.CL
0.8721
Evidence: job #84d63b3c-34f4-4431-8b9f-ee88013406b7 · No reply yet
REPRODUCED · conf 0.80
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
· EMNLP 2019 · cs.CL
0.8496
Evidence: job #268d5d4c-a413-44bc-8b0d-0961853e2026 · No reply yet
REPRODUCED · conf 0.80
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
· ICLR 2020 · cs.CL
0.9067
Evidence: job #afeb8751-d2e2-4715-8894-0212ee77c18c · No reply yet
REPRODUCED · conf 0.80
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
· JMLR 2020 · cs.LG
0.8150
Evidence: job #feb8e705-9ee6-447d-996d-59b11640d120 · No reply yet
PARTIAL · conf 0.60
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
· ACL 2020 · cs.CL
0.4214
Evidence: job #b50b798e-6162-447f-a132-fece39f31b00 · No reply yet
REPRODUCED · conf 0.80
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
· ICLR 2020 · cs.CL
0.8767
Evidence: job #1f3f931f-9111-4219-9174-463a16fae342 · No reply yet
REPRODUCED · conf 0.80
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
· ACL 2020 · cs.CL
0.8383
Evidence: job #ce4160be-1ada-41f5-84ac-5eca01b2bab0 · No reply yet
REPRODUCED · conf 0.85
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
· ICLR 2021 · cs.CL
0.9170
Evidence: job #5b886e85-05d7-42a4-bf85-47e40ba21d64 · No reply yet
PARTIAL · conf 0.50
Big Bird: Transformers for Longer Sequences
· NeurIPS 2020 · cs.LG
153.5130
Evidence: job #54887802-d5bf-438f-80a1-a85abc794423 · No reply yet
REPRODUCED · conf 0.80
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
· ICLR 2021 · cs.CV
0.9767
Evidence: job #6d9acf6e-3ce2-4867-b24e-45dc0c13f5e6 · No reply yet
PARTIAL · conf 0.55
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
· ICML 2021 · cs.CV
1.0000
Evidence: job #6fa6d49c-3d66-4b5c-9f4d-91e41aaa5616 · No reply yet
REPRODUCED · conf 0.75
Learning Transferable Visual Models From Natural Language Supervision
· ICML 2021 · cs.CV
0.8867
Evidence: job #1ef1a9e6-3e00-4ac4-b3d3-fdaa5e554655 · No reply yet
REPRODUCED · conf 0.80
Emerging Properties in Self-Supervised Vision Transformers
· ICCV 2021 · cs.CV
0.9733
Evidence: job #4fd89fa3-455e-44a7-bbfb-e6e9babada69 · No reply yet
REPRODUCED · conf 0.80
LoRA: Low-Rank Adaptation of Large Language Models
· ICLR 2022 · cs.CL
0.8900
Evidence: job #ed4c2625-4c44-4b9d-8120-7355d231a777 · No reply yet
REPRODUCED · conf 0.85
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
· ICLR 2023 · cs.CL
0.9134
Evidence: job #e35cb267-b959-4799-8ff8-1c3cef746b69 · No reply yet
PARTIAL · conf 0.55
Swin Transformer V2: Scaling Up Capacity and Resolution
· CVPR 2022 · cs.CV
0.9076
Evidence: job #053e2375-e645-4658-995f-7a2101e6e23f · No reply yet
PENDING · no conf yet
A ConvNet for the 2020s
· CVPR 2022 · cs.CV
Evidence: job #a074c2d1-2c3a-4f13-9d8d-98e42f30355c · No reply yet
REPRODUCED · conf 0.80
OPT: Open Pre-trained Transformer Language Models
· arXiv 2022 · cs.CL
0.5896
Evidence: job #363402be-d180-4a57-87b7-0d584de3d263 · No reply yet
PARTIAL · conf 0.60
Scaling Instruction-Finetuned Language Models
· arXiv 2022 · cs.LG
0.4056
Evidence: job #edfedfec-812a-45ac-a96e-595fb4318d3e · No reply yet
REPRODUCED · conf 0.80
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
· arXiv 2022 · cs.CL
0.3433
Evidence: job #ef84e4f5-038d-4751-83c7-8ca4bb190a00 · No reply yet
REPRODUCED · conf 0.75
Robust Speech Recognition via Large-Scale Weak Supervision
· arXiv preprint (Whisper) · cs.CL
0.0905
Evidence: job #7d7f7cc6-84a7-4555-8b6d-b8a7ceaad6c9 · No reply yet
PARTIAL · conf 0.60
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
· ICML 2023 · cs.CV
0.3039
Evidence: job #b63b21bb-60a8-4ebf-ad49-cd7b6cce1b2d · No reply yet
PARTIAL · conf 0.60
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
· ICML 2023 · cs.CL
0.6166
Evidence: job #86abe450-d6a2-4e6f-9471-a34fd8a93c6b · No reply yet
REPRODUCED · conf 0.80
DINOv2: Learning Robust Visual Features without Supervision
· TMLR 2024 · cs.CV
0.9967
Evidence: job #a5e105cf-b60f-4476-97ef-f3abcfb79d6d · No reply yet
PARTIAL · conf 0.60
StarCoder: may the source be with you!
· arXiv 2023 · cs.CL
3.3910
Evidence: job #1b31953a-79fd-4fdf-a215-2cc810173690 · No reply yet
PENDING · no conf yet
Llama 2: Open Foundation and Fine-Tuned Chat Models
· arXiv preprint · cs.CL
Evidence: job #2b5541bc-95cf-4d4b-b5b3-46c1f51bfe39 · No reply yet
PARTIAL · conf 0.60
Code Llama: Open Foundation Models for Code
· arXiv 2023 · cs.CL
1.7093
Evidence: job #8e5b0bcf-edb9-49f3-9120-b3ff5ee82535 · No reply yet
PARTIAL · conf 0.60
Textbooks Are All You Need II: phi-1.5 technical report
· arXiv 2023 · cs.CL
0.7129
Evidence: job #815bf082-1823-4700-89eb-058593956a8b · No reply yet
PARTIAL · conf 0.60
The Falcon Series of Open Language Models
· arXiv 2023 · cs.CL
0.7540
Evidence: job #a107ec9c-b38a-41b6-8551-26051b787d2e · No reply yet
PARTIAL · conf 0.60
TinyLlama: An Open-Source Small Language Model
· arXiv 2024 · cs.CL
0.5710
Evidence: job #8a552848-b38a-4fe9-b558-f4d6cfd8eb6a · No reply yet
PARTIAL · conf 0.60
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
· arXiv 2024 · cs.SE
2.3338
Evidence: job #320f79f8-87ac-4496-b210-ba0d8f36131c · No reply yet
PARTIAL · conf 0.60
Gemma: Open Models Based on Gemini Research and Technology
· arXiv 2024 · cs.CL
0.6760
Evidence: job #4dce19e8-ae7a-4477-bd6b-8ec4c03522b4 · No reply yet
REPRODUCED · conf 0.80
Qwen2 Technical Report
· arXiv 2024 · cs.CL
0.5175
Evidence: job #0473085c-4d11-46bd-a8f3-32e846bcbc8f · No reply yet