
An Adaptive Method Based on Multiscale Dilated Convolutional Network for Binaural Speech Source Localization

Lulu Wu, Hong Liu, Bing Yang and Runwei Ding

Complexity, 2020, vol. 2020, 1-7

Abstract:

Most binaural speech source localization models perform poorly in previously unseen noisy and reverberant conditions. This issue is addressed here with a multiscale dilated convolutional neural network (CNN). The time-related cross-correlation function (CCF) and the energy-related interaural level differences (ILD) are preprocessed in separate branches of the dilated convolutional network, so that the multiscale dilated CNN encodes a discriminative representation for each cue. After encoding, the individual interaural representations are fused and mapped to the source direction. Furthermore, to improve parameter adaptation, a novel semiadaptive entropy is proposed to train the network under directional constraints. Experimental results show that the proposed method can adaptively locate speech sources in simulated noisy and reverberant environments.
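
The pipeline described in the abstract (two binaural cues, one multiscale dilated branch per cue, then fusion into a direction estimate) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the frame length, the lag range, the number of ILD bands, the kernel size, the dilation rates (1, 2, 4), the 37 direction classes, and all function names are hypothetical, the weights are random and untrained, and the semiadaptive entropy loss is not reproduced because the abstract does not define it.

```python
import numpy as np


def cross_correlation(left, right, max_lag):
    """Normalized CCF of two equal-length frames for lags -max_lag..max_lag."""
    left = left - left.mean()
    right = right - right.mean()
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2)) + 1e-12
    full = np.correlate(left, right, mode="full")  # zero lag sits at index len(left) - 1
    mid = len(left) - 1
    return full[mid - max_lag: mid + max_lag + 1] / norm


def interaural_level_difference(left, right, n_bands=8):
    """ILD in dB per frequency band (bands are equal splits of the rFFT bins; an assumption)."""
    p_left = np.abs(np.fft.rfft(left)) ** 2
    p_right = np.abs(np.fft.rfft(right)) ** 2
    e_left = np.array([b.sum() for b in np.array_split(p_left, n_bands)])
    e_right = np.array([b.sum() for b in np.array_split(p_right, n_bands)])
    return 10.0 * np.log10((e_left + 1e-12) / (e_right + 1e-12))


def dilated_conv1d(x, kernel, dilation):
    """Zero-padded, same-length 1-D dilated convolution (odd kernel length assumed)."""
    span = dilation * (len(kernel) - 1)  # distance between first and last tap
    xp = np.pad(x, span // 2)
    return np.array([np.dot(kernel, xp[i: i + span + 1: dilation]) for i in range(len(x))])


def multiscale_encode(x, kernels, dilations=(1, 2, 4)):
    """Apply one kernel per dilation rate, then ReLU, and concatenate the scales."""
    feats = [np.maximum(dilated_conv1d(x, k, d), 0.0) for k, d in zip(kernels, dilations)]
    return np.concatenate(feats)


def init_params(n_ccf, n_ild, n_directions, rng, n_scales=3):
    """Random, untrained parameters (hypothetical shapes, for illustration only)."""
    fused_dim = n_scales * (n_ccf + n_ild)
    return {
        "ccf_kernels": rng.standard_normal((n_scales, 3)),
        "ild_kernels": rng.standard_normal((n_scales, 3)),
        "W": 0.1 * rng.standard_normal((n_directions, fused_dim)),
        "b": np.zeros(n_directions),
    }


def localize(left, right, params, max_lag=16):
    """Encode CCF and ILD in separate branches, fuse them, and return a softmax over directions."""
    ccf = cross_correlation(left, right, max_lag)
    ild = interaural_level_difference(left, right)
    fused = np.concatenate([
        multiscale_encode(ccf, params["ccf_kernels"]),
        multiscale_encode(ild, params["ild_kernels"]),
    ])
    logits = params["W"] @ fused + params["b"]
    logits = logits - logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()


# Toy binaural frame: the right channel is an attenuated, 5-sample-delayed copy of the left.
rng = np.random.default_rng(0)
src = rng.standard_normal(517)
left, right = src[5:], 0.5 * src[:512]
params = init_params(n_ccf=33, n_ild=8, n_directions=37, rng=rng)
posterior = localize(left, right, params)
```

With untrained weights the returned posterior carries no directional meaning; the sketch only shows how the time-related and level-related cues are encoded separately at several dilation rates and then fused.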

Date: 2020

Downloads: (external link)
http://downloads.hindawi.com/journals/8503/2020/5819624.pdf (application/pdf)
http://downloads.hindawi.com/journals/8503/2020/5819624.xml (text/xml)

Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:5819624

DOI: 10.1155/2020/5819624

Handle: RePEc:hin:complx:5819624