Driver

deepseek-r1

Reproduction driver for DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. Source file: scripts/run-reproduction-deepseek-r1.ts.

Driver metadata

Slug
deepseek-r1
Status
active
Protocol match
unknown. The driver measures a metric the paper does not directly report. Validator C1 auto-downgrades not_reproduced to pending.
Agent version
v0.1.0-deepseek-r1-winogrande-microslice
Paper
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Model
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
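
The downgrade described under Protocol match can be sketched as follows. This is a minimal illustration, not the validator's actual code: the type names, the `applyC1` function, and every status value other than `unknown`, `not_reproduced`, and `pending` are assumptions introduced for the example.

```typescript
// Hypothetical types; only "unknown", "not_reproduced", and "pending"
// appear on this page. The other values are assumed for illustration.
type ProtocolMatch = "exact" | "unknown";
type Verdict = "reproduced" | "not_reproduced" | "pending";

interface DriverResult {
  slug: string;
  protocolMatch: ProtocolMatch;
  verdict: Verdict;
}

// Assumed shape of rule C1: when the protocol match is unknown, a
// not_reproduced verdict is downgraded to pending. All other results
// pass through unchanged.
function applyC1(result: DriverResult): DriverResult {
  if (result.protocolMatch === "unknown" && result.verdict === "not_reproduced") {
    return { ...result, verdict: "pending" };
  }
  return result;
}

const downgraded = applyC1({
  slug: "deepseek-r1",
  protocolMatch: "unknown",
  verdict: "not_reproduced",
});
console.log(downgraded.verdict);
```

Returning a new object rather than mutating the input keeps the driver's raw result available for logging alongside the validated one.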

Claim citation

This is the exact paper claim the driver compares against before the Verdict Validator allows any public WRONG path.

Location
Table 5 · DeepSeek-R1-Distill-Qwen-1.5B
Metric
MMLU (accuracy)
Reported value
43.9
PDF page
14
Quoted text
43.9
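
As a rough sketch, the citation fields above could be held in a typed record and checked before any comparison. The `ClaimCitation` interface, the helper names, and the tolerance parameter are all hypothetical. The page does not say how the driver or the validator compares a measured value to the reported one.

```typescript
// Hypothetical record mirroring the citation fields listed above.
interface ClaimCitation {
  location: string;
  metric: string;
  reportedValue: number;
  pdfPage: number;
  quotedText: string;
}

const claim: ClaimCitation = {
  location: "Table 5 · DeepSeek-R1-Distill-Qwen-1.5B",
  metric: "MMLU (accuracy)",
  reportedValue: 43.9,
  pdfPage: 14,
  quotedText: "43.9",
};

// Sanity check: the quoted text should literally contain the reported value.
function quoteSupportsValue(c: ClaimCitation): boolean {
  return c.quotedText.includes(String(c.reportedValue));
}

// Assumed comparison: absolute difference within a caller-chosen tolerance
// (in the metric's own units). The real comparison rule is not documented here.
function withinTolerance(measured: number, c: ClaimCitation, tolerance: number): boolean {
  return Math.abs(measured - c.reportedValue) <= tolerance;
}

console.log(quoteSupportsValue(claim));
```

Checking the quote against the reported value first guards against a citation whose number was mistyped during entry.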

Source

Every driver is a TypeScript orchestration script that routes the result through the Verdict Validator. Drivers with runnable public checkpoints also link the hermetic Modal job that loads the model and measures the paper claim.