Faster sorting algorithms discovered using deep reinforcement learning

Mankowitz, Daniel J.; Michi, Andrea; Zhernov, Anton; Gelmi, Marco; Selvi, Marco; Paduraru, Cosmin; Leurent, Edouard; Iqbal, Shariq; Lespiau, Jean-Baptiste; Ahern, Alex; Köppe, Thomas; Millikin, Kevin; Gaffney, Stephen; Elster, Sophie; Broshear, Jackson; Gamble, Chris; Milan, Kieran; Tung, Robert; Hwang, Minjae; Cemgil, Taylan; Barekatain, Mohammadamin; Li, Yujia; Mandhane, Amol; Hubert, Thomas; Schrittwieser, Julian; Hassabis, Demis; Kohli, Pushmeet; Riedmiller, Martin; Vinyals, Oriol; Silver, David

Faster sorting algorithms discovered using deep reinforcement learning

Daniel J. Mankowitz (), Andrea Michi, Anton Zhernov, Marco Gelmi, Marco Selvi, Cosmin Paduraru, Edouard Leurent, Shariq Iqbal, Jean-Baptiste Lespiau, Alex Ahern, Thomas Köppe, Kevin Millikin, Stephen Gaffney, Sophie Elster, Jackson Broshear, Chris Gamble, Kieran Milan, Robert Tung, Minjae Hwang, Taylan Cemgil, Mohammadamin Barekatain, Yujia Li, Amol Mandhane, Thomas Hubert, Julian Schrittwieser, Demis Hassabis, Pushmeet Kohli, Martin Riedmiller, Oriol Vinyals and David Silver
Additional contact information
Daniel J. Mankowitz: Deepmind
Andrea Michi: Deepmind
Anton Zhernov: Deepmind
Marco Gelmi: Deepmind
Marco Selvi: Deepmind
Cosmin Paduraru: Deepmind
Edouard Leurent: Deepmind
Shariq Iqbal: Deepmind
Jean-Baptiste Lespiau: Deepmind
Alex Ahern: Deepmind
Thomas Köppe: Deepmind
Kevin Millikin: Deepmind
Stephen Gaffney: Deepmind
Sophie Elster: Deepmind
Jackson Broshear: Deepmind
Chris Gamble: Deepmind
Kieran Milan: Deepmind
Robert Tung: Deepmind
Minjae Hwang: Google
Taylan Cemgil: Deepmind
Mohammadamin Barekatain: Deepmind
Yujia Li: Deepmind
Amol Mandhane: Deepmind
Thomas Hubert: Deepmind
Julian Schrittwieser: Deepmind
Demis Hassabis: Deepmind
Pushmeet Kohli: Deepmind
Martin Riedmiller: Deepmind
Oriol Vinyals: Deepmind
David Silver: Deepmind

Nature, 2023, vol. 618, issue 7964, 257-263

Abstract: Abstract Fundamental algorithms such as sorting or hashing are used trillions of times on any given day1. As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past2, making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library3. This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach.

Date: 2023
References: Add references at CitEc
Citations: View citations in EconPapers (2)

Downloads: (external link)
https://www.nature.com/articles/s41586-023-06004-9 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:618:y:2023:i:7964:d:10.1038_s41586-023-06004-9

Ordering information: This journal article can be ordered from
https://www.nature.com/

DOI: 10.1038/s41586-023-06004-9

Access Statistics for this article

Nature is currently edited by Magdalena Skipper

More articles in Nature from Nature
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().