Outracing champion Gran Turismo drivers with deep reinforcement learning
Peter R. Wurman,
Samuel Barrett,
Kenta Kawamoto,
James MacGlashan,
Kaushik Subramanian,
Thomas J. Walsh,
Roberto Capobianco,
Alisa Devlic,
Franziska Eckert,
Florian Fuchs,
Leilani Gilpin,
Piyush Khandelwal,
Varun Kompella,
HaoChih Lin,
Patrick MacAlpine,
Declan Oller,
Takuma Seno,
Craig Sherstan,
Michael D. Thomure,
Houmehr Aghabozorgi,
Leon Barrett,
Rory Douglas,
Dion Whitehead,
Peter Dürr,
Peter Stone,
Michael Spranger and
Hiroaki Kitano
Additional contact information: all of the above authors are affiliated with Sony AI.
Nature, 2022, vol. 602, issue 7896, 223-228
Abstract:
Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits [1]. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world’s best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics. In addition, we construct a reward function that enables the agent to be competitive while adhering to racing’s important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world’s best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.
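As a rough illustration of the reward-design idea mentioned in the abstract (a hypothetical sketch, not the reward published in the paper): a per-step racing reward can combine track progress with penalties for leaving the course or causing contact, so that maximizing return encourages speed while discouraging unsportsmanlike driving. The function names, term structure, and weights below are assumptions made purely for illustration.

```python
# Hypothetical sketch of a composite racing reward (illustrative only; not the
# authors' actual reward function or weights).
from dataclasses import dataclass

@dataclass
class StepInfo:
    progress_m: float        # metres of progress along the track this step
    off_course: bool         # True if the car left the track limits
    at_fault_contact: bool   # True if the car caused a collision this step

def composite_reward(info: StepInfo,
                     w_progress: float = 1.0,
                     off_course_penalty: float = 5.0,
                     contact_penalty: float = 10.0) -> float:
    """Weighted sum of a progress (speed) term and penalty terms; the weights
    here are placeholders, not values reported in the paper."""
    reward = w_progress * info.progress_m
    if info.off_course:
        reward -= off_course_penalty
    if info.at_fault_contact:
        reward -= contact_penalty
    return reward

# Example: a step that gains 12 m of progress but causes at-fault contact.
print(composite_reward(StepInfo(progress_m=12.0, off_course=False,
                                at_fault_contact=True)))  # -> 2.0
```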
Date: 2022
Downloads: https://www.nature.com/articles/s41586-021-04357-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:602:y:2022:i:7896:d:10.1038_s41586-021-04357-7
Ordering information: this journal article can be ordered from https://www.nature.com/
DOI: 10.1038/s41586-021-04357-7
Nature is currently edited by Magdalena Skipper