Linear RNNs Provably Learn Linear Dynamical Systems
Lifu Wang,
Tianyu Wang,
Shengwei Yi,
Bo Shen,
Bo Hu and
Xing Cao
Additional contact information
Lifu Wang: China Information Technology Security Evaluation Center (CNITSEC)
Tianyu Wang: China Information Technology Security Evaluation Center (CNITSEC)
Shengwei Yi: China Information Technology Security Evaluation Center (CNITSEC)
Bo Shen: Beijing Jiaotong University
Bo Hu: Beijing Jiaotong University
Xing Cao: Beijing Jiaotong University
Journal of Optimization Theory and Applications, 2024, vol. 203, issue 1, No 19, 488-528
Abstract:
In this paper, we investigate the learning abilities of linear recurrent neural networks (RNNs) trained using gradient descent. We present a theoretical guarantee demonstrating that these linear RNNs can effectively learn any stable linear dynamical system with polynomial complexity. Importantly, our derived generalization error bound is independent of the episode length. For any stable linear system with a transition matrix $C$ characterized by a parameter $\rho_C$ related to the spectral radius, we prove that, despite the non-convexity of the parameter optimization loss, a linear RNN can learn the system with sample and time complexity polynomial in $\frac{1}{1-\rho_C}$, provided that the RNN has sufficient width. Notably, the required width of the hidden layers does not depend on the length of the input sequence. This work provides the first rigorous theoretical foundation for learning linear RNNs. Our findings suggest that linear RNNs are capable of efficiently learning complex dynamical systems, paving the way for further research into the learning capabilities of more general RNN architectures.
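As a concrete illustration of the setting described in the abstract, the sketch below draws a random stable linear dynamical system (its transition matrix rescaled so the spectral radius is below 1), generates input/output trajectories from it, and trains an over-parameterized linear RNN on those trajectories by gradient descent over a squared loss. This is a minimal illustrative example written in JAX, not the paper's construction: the dimensions, initialization, learning rate, and loss are assumptions made here for demonstration only.

```python
# Illustrative sketch (assumptions ours, not the paper's): fit a linear RNN by gradient
# descent to data generated by a stable linear dynamical system x_{t+1} = C x_t + B u_t, y_t = D x_t.
import jax
import jax.numpy as jnp
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth stable linear dynamical system.
n, d_in, d_out, T, n_seqs = 4, 2, 1, 30, 64
C = rng.normal(size=(n, n))
C *= 0.9 / np.max(np.abs(np.linalg.eigvals(C)))  # rescale so the spectral radius is below 1
B = rng.normal(size=(n, d_in))
D = rng.normal(size=(d_out, n))

def rollout(A, B_mat, D_mat, U):
    """Run the linear recurrence h_{t+1} = A h_t + B u_t, y_t = D h_t over one input sequence U."""
    def step(h, u):
        h_next = A @ h + B_mat @ u
        return h_next, D_mat @ h_next
    _, Y = jax.lax.scan(step, jnp.zeros(A.shape[0]), U)
    return Y

# Training data: random input sequences pushed through the true system.
U = jnp.asarray(rng.normal(size=(n_seqs, T, d_in)))
Y = jax.vmap(lambda u: rollout(jnp.asarray(C), jnp.asarray(B), jnp.asarray(D), u))(U)

# Linear RNN student with hidden width m >= n (over-parameterized, echoing the width condition).
m = 32
params = {
    "A": jnp.zeros((m, m)),
    "B": jnp.asarray(rng.normal(size=(m, d_in)) / np.sqrt(m)),
    "D": jnp.asarray(rng.normal(size=(d_out, m)) / np.sqrt(m)),
}

def loss(p):
    preds = jax.vmap(lambda u: rollout(p["A"], p["B"], p["D"], u))(U)
    return jnp.mean((preds - Y) ** 2)

grad_fn = jax.jit(jax.value_and_grad(loss))
learning_rate = 1e-2
for it in range(1000):
    value, grads = grad_fn(params)
    params = {k: params[k] - learning_rate * grads[k] for k in params}
    if it % 200 == 0:
        print(f"iteration {it:4d}  training loss {float(value):.6f}")
```

The hidden width m and the number of gradient steps above are arbitrary; the paper's result concerns how such quantities scale polynomially with $\frac{1}{1-\rho_C}$ and, notably, not with the sequence length.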
Keywords: Non-convex optimization; Machine learning; Linear systems; Recurrent neural networks
Date: 2024
Downloads: (external link)
http://link.springer.com/10.1007/s10957-024-02521-3 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:joptap:v:203:y:2024:i:1:d:10.1007_s10957-024-02521-3
Ordering information: This journal article can be ordered from
http://www.springer. ... cs/journal/10957/PS2
DOI: 10.1007/s10957-024-02521-3
Journal of Optimization Theory and Applications is currently edited by Franco Giannessi and David G. Hull
More articles in Journal of Optimization Theory and Applications from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.