An overview of gradient descent optimization algorithms (2016)

(ruder.io)

132 points | by skidrow 8 days ago

28 comments