Generating interaction gestures in dyadic conversations using a diffusion model
Yuya Okadome, Yazan Alkatshah and Yutaka Nakamura
PLOS ONE, 2025, vol. 20, issue 12, 1-18
Abstract:
As expectations for computer graphics (CG) avatars and conversational robots increase, enhancing dialogue skills via multimodal channels is crucial for achieving fluent interactions with humans. Automatic interaction motion generation is therefore essential for autonomous conversation systems. Natural motion generation, such as appropriate nodding, requires considering the behavior and voice of the conversation partner. However, current models generate motion from audio or text alone, neglecting these interaction factors. In this study, we implemented an interaction diffusion model (IDM) that combines a diffusion approach with feature masking to generate interaction behaviors for dyadic conversation. IDM models both participants, using masks to determine which features serve as conditional inputs. This allows it to accommodate settings such as missing features and motion forecasting without retraining. The experimental results suggest that the model generates human-like behaviors during conversation within 30 ms.
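The masking idea described in the abstract can be pictured with a minimal sketch (not the authors' implementation): a denoising network sees both participants' noisy features plus a binary mask that marks which features are given as conditions and which are to be generated, so changing the mask alone covers missing-feature and forecasting settings without retraining. All names (IDMDenoiser, feat_dim, hidden) and shapes below are illustrative assumptions.

    import torch
    import torch.nn as nn

    class IDMDenoiser(nn.Module):
        def __init__(self, feat_dim=64, hidden=256):
            super().__init__()
            # Simple scalar-timestep embedding (kept minimal for the sketch).
            self.time_mlp = nn.Sequential(
                nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, hidden))
            # Input: noisy features of both participants, masked conditions,
            # and the timestep embedding; output: predicted noise for both.
            self.net = nn.Sequential(
                nn.Linear(2 * feat_dim + 2 * feat_dim + hidden, hidden),
                nn.SiLU(),
                nn.Linear(hidden, 2 * feat_dim))

        def forward(self, x_noisy, cond, mask, t):
            # x_noisy: (B, 2*feat_dim) noisy motion features of speakers A and B
            # cond:    (B, 2*feat_dim) observed features; zeros where unavailable
            # mask:    (B, 2*feat_dim) 1 = given as condition, 0 = to be generated
            # t:       (B, 1) diffusion timestep in [0, 1]
            temb = self.time_mlp(t)
            h = torch.cat([x_noisy, cond * mask, temb], dim=-1)
            return self.net(h)

    # At sampling time the same trained network handles different tasks by
    # changing only the mask, e.g. generate B's motion given A's features.
    model = IDMDenoiser()
    B, D = 4, 64
    x = torch.randn(B, 2 * D)        # current noisy sample
    cond = torch.randn(B, 2 * D)     # observed features (A known, B unknown)
    mask = torch.cat([torch.ones(B, D), torch.zeros(B, D)], dim=-1)
    t = torch.full((B, 1), 0.5)
    eps_hat = model(x, cond, mask, t)  # predicted noise, shape (B, 2*feat_dim)
    print(eps_hat.shape)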
Date: 2025
Downloads:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0339579 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 39579&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0339579
DOI: 10.1371/journal.pone.0339579