DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL

(arxiv.org)

1334 points | by gradus_ad 5 days ago

1049 comments