The effects of face mask on speech production and its implication for forensic speaker identification-A cross-linguistic study
Puyang Geng,
Qimeng Lu,
Hong Guo and
Jinhua Zeng
PLOS ONE, 2023, vol. 18, issue 3, 1-18
Abstract:
This study aims to understand the effects of face mask on speech production between Mandarin Chinese and English, and on the automatic classification of mask/no mask speech and individual speakers. A cross-linguistic study on mask speech between Mandarin Chinese and English was then conducted. Continuous speech of the phonetically balanced texts in both Chinese and English versions were recorded from thirty native speakers of Mandarin Chinese (i.e., 15 males and 15 females) with and without wearing a surgical mask. The results of acoustic analyses showed that mask speech exhibited higher F0, intensity, HNR, and lower jitter and shimmer than no mask speech for Mandarin Chinese, whereas higher HNR and lower jitter and shimmer were observed for English mask speech. The results of classification analyses showed that, based on the four supervised learning algorithms (i.e., Linear Discriminant Analysis, Naïve Bayes Classifier, Random Forest, and Support Vector Machine), undesirable performances (i.e., lower than 50%) in classifying the speech with and without a face mask, and highly-variable accuracies (i.e., ranging from 40% to 89.2%) in identifying individual speakers were achieved. These findings imply that the speakers tend to conduct acoustic adjustments to improve their speech intelligibility when wearing surgical mask. However, a cross-linguistic difference in speech strategies to compensate for intelligibility was observed that Mandarin speech was produced with higher F0, intensity, and HNR, while English was produced with higher HNR. Besides, the highly-variable accuracies of speaker identification might suggest that surgical mask would impact the general performance of the accuracy of automatic speaker recognition. In general, therefore, it seems wearing a surgical mask would impact both acoustic-phonetic and automatic speaker recognition approaches to some extent, thus suggesting particular cautions in the real-case practice of forensic speaker identification.
Date: 2023
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0283724 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 83724&type=printable (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0283724
DOI: 10.1371/journal.pone.0283724
Access Statistics for this article
More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone ().