{"paper":{"arxiv_id":"2401.14196","title":"DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence","abstract":"The rapid development of large language models has revolutionized code intelligence in software development. However, the predominance of closed-source models has restricted extensive research and development. To address this, we introduce the DeepSeek-Coder series, a range of open-source code models with sizes from 1.3B to 33B, trained from scratch on 2 trillion tokens. These models are pre-trained on a high-quality project-level code corpus and employ a fill-in-the-blank task with a 16K window to enhance code generation and infilling. Our extensive evaluations demonstrate that DeepSeek-Coder not only achieves state-of-the-art performance among open-source code models across multiple benchmarks but also surpasses existing closed-source models like Codex and GPT-3.5. Furthermore, DeepSeek-Coder models are under a permissive license that allows for both research and unrestricted commercial use.","primary_category":"cs.SE","venue":"arXiv 2024","published_at":null,"latest_version":1,"withdrawn":false},"latest_version":{"id":"af9613a2-8fd5-45b4-bbe4-0c8a80d34ae6","version":1,"source_url":"https://arxiv.org/abs/2401.14196","rendered_html_url":null,"rendering_engine":null},"verdict":{"id":"93681c20-6703-42af-9faa-14a55fc400b4","kind":"POST","status":"partial","score":2.3338256487751363,"confidence":0.6,"agent_version":"v0.1.0-deepseek-coder-pythonppl-microslice","computed_at":"2026-05-14T23:52:17.154Z","is_current":true,"claim_citation":{"paper_arxiv_id":"2401.14196","section":"Table 2","row":"DeepSeek-Coder-1.3B-Base","column":"HumanEval pass@1","reported_value":34.8,"reported_metric":"pass@1","quoted_text":"34.8","pdf_page":8,"notes":"Table 2 of arXiv:2401.14196 reports DeepSeek-Coder-1.3B-Base HumanEval pass@1 = 34.8. Driver measures Python perplexity (sanity probe), not HumanEval. PROTOCOL_MATCH = `unknown` because the metric measured differs from the metric the paper reports. Validator C1 gate prevents publication of WRONG."},"protocol_match":"unknown"},"verdicts":{"post":{"id":"93681c20-6703-42af-9faa-14a55fc400b4","kind":"POST","status":"partial","score":2.3338256487751363,"confidence":0.6,"agent_version":"v0.1.0-deepseek-coder-pythonppl-microslice","computed_at":"2026-05-14T23:52:17.154Z","is_current":true,"claim_citation":{"paper_arxiv_id":"2401.14196","section":"Table 2","row":"DeepSeek-Coder-1.3B-Base","column":"HumanEval pass@1","reported_value":34.8,"reported_metric":"pass@1","quoted_text":"34.8","pdf_page":8,"notes":"Table 2 of arXiv:2401.14196 reports DeepSeek-Coder-1.3B-Base HumanEval pass@1 = 34.8. Driver measures Python perplexity (sanity probe), not HumanEval. PROTOCOL_MATCH = `unknown` because the metric measured differs from the metric the paper reports. Validator C1 gate prevents publication of WRONG."},"protocol_match":"unknown"},"pre":null}}