Exploring pathways to comprehension performance in multilanguage smart voice systems: insights from Lasso regression, SEM, PLS-SEM, CNN, and BiLSTM

Entong Gao, Jialu Guo, Xipeng Pang, Danya Bo and Zhe Chen
Additional contact information
Entong Gao: Xueyuan Road No. 37
Jialu Guo: Xueyuan Road No. 37
Xipeng Pang: Xueyuan Road No. 37
Danya Bo: Xueyuan Road No. 37
Zhe Chen: Xueyuan Road No. 37

Palgrave Communications, 2024, vol. 11, issue 1, 1-20

Abstract: Smart voice systems, such as voice assistants and smart speakers, are integral to domains such as smart homes, customer service, healthcare, and smart learning. The effectiveness of these systems relies on user comprehension performance, which is crucial for enhancing user experience. In this study, the primary factors influencing comprehension performance in multilanguage smart voice systems are examined, and the efficacy of various analytical methods, including LASSO regression, SEM, PLS-SEM, CNN, and BiLSTM, in identifying and improving these factors is assessed. Using a diverse dataset from human–computer interaction experiments made publicly available on GitHub, these five methods are applied to discern the impact of environmental and user-specific factors on comprehension. The key findings indicate the following: 1) Noise types and noise sound levels markedly affect comprehension. Noise sound level showed an inverted U-shaped relationship with comprehension (parameter: 0.088), reflecting the effects of both low and high noise levels. Certain rhythmic noises, such as those from clocks (parameter: 0.033), enhance comprehension by fostering a conducive auditory environment. 2) Analytical method comparisons reveal that while LASSO regression (MSE = 0.026), SEM, and PLS-SEM effectively map the linear relationships and pathways affecting comprehension, deep learning approaches such as CNN and BiLSTM (MSE = 0.019) excel at handling complex, multidimensional data, offering superior predictive performance. 3) In a non-native language environment, the evaluation of user comprehension models is notably different from that in native language settings (native R²: 0.545; non-native R²: 0.347). Specifically, in non-native language environments, the variables and mechanisms influencing user comprehension models are clearer, more controllable, and more susceptible to proficiency levels (parameter: 0.164).
This comprehensive study presents a novel comparison of traditional statistical and machine learning methods in analyzing smart voice system interaction across languages. These findings emphasize the significance of tailoring smart voice systems to user diversity in language proficiency, age, and educational background and suggest optimizing these systems under varied environmental conditions to improve comprehension and overall effectiveness. The insights from this study are critical for policymakers and designers aiming to refine the adaptability and user-centric nature of smart voice systems.
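
The abstract compares methods by mean squared error (MSE) and R², and describes an inverted U-shaped effect of noise level. As a minimal sketch of what those quantities mean, the snippet below computes MSE and R² for a hypothetical quadratic response curve. The function names, the curve parameters (`peak`, `curvature`, `top`), and the data values are all illustrative assumptions and are not taken from the study or its dataset.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def inverted_u(noise_level, peak=50.0, curvature=0.0004, top=0.9):
    """Hypothetical quadratic response: a negative squared term gives
    an inverted U that is highest at `peak` and falls off on both sides."""
    return top - curvature * (noise_level - peak) ** 2


# Synthetic, illustrative values only (NOT the study's data).
levels = [30.0, 40.0, 50.0, 60.0, 70.0]
observed = [0.72, 0.88, 0.90, 0.84, 0.76]
predicted = [inverted_u(x) for x in levels]

print("MSE:", round(mse(observed, predicted), 4))
print("R^2:", round(r_squared(observed, predicted), 3))
```

A lower MSE indicates predictions closer to the observed values, and an R² closer to 1 indicates that more of the variance in the observations is accounted for; this is the sense in which the figures quoted in the abstract can be compared.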

Date: 2024

Downloads: (external link)
http://link.springer.com/10.1057/s41599-024-04025-x Abstract (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:pal:palcom:v:11:y:2024:i:1:d:10.1057_s41599-024-04025-x

Ordering information: This journal article can be ordered from
https://www.nature.com/palcomms/about

DOI: 10.1057/s41599-024-04025-x

More articles in Palgrave Communications from Palgrave Macmillan
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:pal:palcom:v:11:y:2024:i:1:d:10.1057_s41599-024-04025-x