Multitask Learning with Local Attention for Tibetan Speech Recognition

Hui Wang, Fei Gao, Yue Zhao, Li Yang, Jianjian Yue and Huilin Ma

Complexity, 2020, vol. 2020, 1-10

Abstract:

In this paper, we propose incorporating local attention into WaveNet-CTC to improve the performance of Tibetan speech recognition in multitask learning. As the number of tasks increases, for example when Tibetan speech content recognition, dialect identification, and speaker recognition are performed simultaneously, the speech recognition accuracy of a single WaveNet-CTC model decreases. Inspired by the attention mechanism, we introduce local attention to automatically tune the weights of feature frames within a window and to attend differently to context information for multitask learning. The experimental results show that, compared with the baseline model, our method improves speech recognition accuracy for all Tibetan dialects in three-task learning. Furthermore, our method significantly improves the accuracy for the low-resource dialect, by 5.11% compared with the dialect-specific model.
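
The idea of weighting feature frames inside a window can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function, the window size, or how the result feeds WaveNet-CTC, so the dot-product score, the `half_width` parameter, and the function names below are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_attention(frames, half_width):
    """Weight the frames in a window around each time step.

    frames: list of equal-length feature vectors, one per time step.
    half_width: hypothetical parameter; the window for step t covers
        steps t - half_width .. t + half_width, clipped at the edges.
    Returns (contexts, weights): one context vector and one list of
    attention weights per time step.
    """
    contexts, all_weights = [], []
    for t, query in enumerate(frames):
        lo = max(0, t - half_width)
        hi = min(len(frames), t + half_width + 1)
        window = frames[lo:hi]
        # Assumed scoring: dot product between the centre frame and
        # each frame in its window (the paper may score differently).
        scores = [sum(q * k for q, k in zip(query, frame)) for frame in window]
        weights = softmax(scores)
        # Context vector: attention-weighted sum of the window frames.
        context = [
            sum(w * frame[d] for w, frame in zip(weights, window))
            for d in range(len(query))
        ]
        contexts.append(context)
        all_weights.append(weights)
    return contexts, all_weights


# Toy input: four 2-dimensional feature frames.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
contexts, weights = local_attention(frames, half_width=1)
```

In this sketch each time step sees at most `2 * half_width + 1` frames, and the softmax weights within each window sum to one, so frames that score higher against the centre frame contribute more to its context vector.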

Date: 2020

Downloads: (external link)
http://downloads.hindawi.com/journals/8503/2020/8894566.pdf (application/pdf)
http://downloads.hindawi.com/journals/8503/2020/8894566.xml (text/xml)


Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:8894566

DOI: 10.1155/2020/8894566



 
Handle: RePEc:hin:complx:8894566