Neuroprosthetic Decoder Training as Imitation Learning

Merel, Josh; Carlson, David; Paninski, Liam; Cunningham, John P

PLOS Computational Biology, 2016, vol. 12, issue 5, 1-24

Abstract: Neuroprosthetic brain-computer interfaces function via an algorithm that decodes neural activity of the user into movements of an end effector, such as a cursor or robotic arm. In practice, the decoder is often learned by updating its parameters while the user performs a task. When the user’s intention is not directly observable, recent methods have demonstrated value in training the decoder against a surrogate for the user’s intended movement. Here we show that training a decoder in this way is a novel variant of an imitation learning problem, where an oracle or expert is employed for supervised training in lieu of direct observations, which are not available. Specifically, we describe how a generic imitation learning meta-algorithm, dataset aggregation (DAgger), can be adapted to train a generic brain-computer interface. By deriving existing learning algorithms for brain-computer interfaces in this framework, we provide a novel analysis of regret (an important metric of learning efficacy) for brain-computer interfaces. This analysis allows us to characterize the space of algorithmic variants and bounds on their regret rates. Existing approaches for decoder learning have been performed in the cursor control setting, but the available design principles for these decoders are such that it has been impossible to scale them to naturalistic settings. Leveraging our findings, we then offer an algorithm that combines imitation learning with optimal control, which should allow for training of arbitrary effectors for which optimal control can generate goal-oriented control. We demonstrate this novel and general BCI algorithm with simulated neuroprosthetic control of a 26-degree-of-freedom model of an arm, a sophisticated and realistic end effector.

Author Summary: There are various existing methods for rapidly learning a decoder during closed-loop brain-computer interface (BCI) tasks. While many of these methods work well in practice, there is no clear theoretical foundation for parameter learning. We offer a unification of the closed-loop decoder learning setting as an imitation learning problem. This has two major consequences: first, our approach clarifies how to derive “intention-based” algorithms for any BCI setting, most notably more complex settings like control of an arm; and second, this framework allows us to provide theoretical results, building from an existing literature on the regret of related algorithms. After first demonstrating algorithmic performance in simulation on the well-studied setting of a user trying to reach targets by controlling a cursor on a screen, we then simulate a user controlling an arm with many degrees of freedom in order to grasp a wand. Finally, we describe how extensions in the online imitation learning literature can improve BCI in additional settings.
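
The DAgger-style loop described in the abstract can be illustrated with a toy cursor simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the linear neural encoding model, the unit-vector oracle used as the surrogate for intended movement, the ridge-regression decoder, and every name and parameter (`oracle_velocity`, `run_dagger`, `encoding`, `ridge`, and so on) are hypothetical choices made for illustration. It also runs the learned decoder alone in each round, without the expert mixing that general DAgger permits.

```python
import numpy as np

def oracle_velocity(cursor, target):
    """Surrogate for user intent (assumed): unit vector from cursor to target."""
    d = target - cursor
    n = np.linalg.norm(d)
    return d / n if n > 1e-9 else np.zeros_like(d)

def run_dagger(n_iters=5, n_steps=200, n_neurons=20,
               noise=0.1, ridge=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical linear tuning: activity = encoding @ intent + noise.
    encoding = rng.normal(size=(n_neurons, 2))
    decoder = np.zeros((2, n_neurons))  # untrained decoder
    X, Y = [], []                       # aggregated activity and oracle labels

    for _ in range(n_iters):
        cursor = np.zeros(2)
        target = rng.uniform(-1, 1, size=2)
        for _ in range(n_steps):
            # The oracle labels the state actually visited under the
            # current decoder (the key idea of dataset aggregation).
            intent = oracle_velocity(cursor, target)
            activity = encoding @ intent + noise * rng.normal(size=n_neurons)
            X.append(activity)
            Y.append(intent)
            # Closed loop: the cursor moves with the decoded velocity.
            cursor = cursor + 0.05 * (decoder @ activity)
            if np.linalg.norm(target - cursor) < 0.05:
                target = rng.uniform(-1, 1, size=2)
        # Refit on ALL data collected so far (ridge regression).
        Xa, Ya = np.asarray(X), np.asarray(Y)
        gram = Xa.T @ Xa + ridge * np.eye(n_neurons)
        decoder = np.linalg.solve(gram, Xa.T @ Ya).T
    return decoder, encoding

if __name__ == "__main__":
    decoder, encoding = run_dagger()
    # If training worked, decoder @ encoding should be close to the identity.
    print(np.round(decoder @ encoding, 2))
```

In this toy setting the product `decoder @ encoding` indicates how faithfully intended velocity is recovered. The regret analysis in the paper concerns the general framework and is not reproduced by this sketch.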

Date: 2016

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004948 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 04948&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1004948

DOI: 10.1371/journal.pcbi.1004948
