Table of Contents
Abstract
1. Introduction
2. Approach
2.1. Overview
2.2. DeepSeek-R1-Zero: Reinforcement Learning on the Base Model
2.3. DeepSeek-R1: Reinforcement Learning with Cold Start
2.4. Distillation: Empower Small Models with Reasoning Capability
3. Experiment
3.…