Phonetic acquisition in cortical dynamics, a computational approach

Dematties, Dario; Rizzi, Silvio; Thiruvathukal, George K; Wainselboim, Alejandro; Zanutto, B Silvano

Phonetic acquisition in cortical dynamics, a computational approach

Dario Dematties, Silvio Rizzi, George K Thiruvathukal, Alejandro Wainselboim and B Silvano Zanutto

PLOS ONE, 2019, vol. 14, issue 6, 1-28

Abstract: Many computational theories have been developed to improve artificial phonetic classification performance from linguistic auditory streams. However, less attention has been given to psycholinguistic data and neurophysiological features recently found in cortical tissue. We focus on a context in which basic linguistic units–such as phonemes–are extracted and robustly classified by humans and other animals from complex acoustic streams in speech data. We are especially motivated by the fact that 8-month-old human infants can accomplish segmentation of words from fluent audio streams based exclusively on the statistical relationships between neighboring speech sounds without any kind of supervision. In this paper, we introduce a biologically inspired and fully unsupervised neurocomputational approach that incorporates key neurophysiological and anatomical cortical properties, including columnar organization, spontaneous micro-columnar formation, adaptation to contextual activations and Sparse Distributed Representations (SDRs) produced by means of partial N-Methyl-D-aspartic acid (NMDA) depolarization. Its feature abstraction capabilities show promising phonetic invariance and generalization attributes. Our model improves the performance of a Support Vector Machine (SVM) classifier for monosyllabic, disyllabic and trisyllabic word classification tasks in the presence of environmental disturbances such as white noise, reverberation, and pitch and voice variations. Furthermore, our approach emphasizes potential self-organizing cortical principles achieving improvement without any kind of optimization guidance which could minimize hypothetical loss functions by means of–for example–backpropagation. Thus, our computational model outperforms multiresolution spectro-temporal auditory feature representations using only the statistical sequential structure immerse in the phonotactic rules of the input stream.

Date: 2019
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0217966 (text/html)
https://journals.plos.org/plosone/article?id=10.13 ... 17966&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0217966

DOI: 10.1371/journal.pone.0217966

Access Statistics for this article

More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone ().