FrontierMath: A benchmark for evaluating advanced mathematical reasoning in AI

(epochai.org)

183 points | by sshroot 5 days ago

No comments yet.