Parallel Dense Video Caption Generation with Multi-Modal Features
Xuefei Huang,
Ka-Hou Chan,
Wei Ke and
Hao Sheng
Additional contact information
Xuefei Huang: Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China
Ka-Hou Chan: Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China
Wei Ke: Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China
Hao Sheng: Faculty of Applied Sciences, Macao Polytechnic University, Macau 999078, China
Mathematics, 2023, vol. 11, issue 17, 1-16
Abstract:
Dense video captioning aims to generate detailed natural-language descriptions for an original video, which requires deep analysis and mining of semantic captions to identify the events in the video. Existing methods typically follow a localisation-then-captioning sequence within given frame sequences, so caption generation depends heavily on which objects have been detected. This work proposes a parallel dense video captioning method that simultaneously addresses the mutual constraint between event proposals and captions. A deformable Transformer framework is also introduced to reduce or eliminate the manually set threshold hyperparameters that such methods require. An information transfer station is added as a representation organisation; it receives the hidden features extracted from a frame and implicitly generates multiple event proposals. The proposed method adopts a long short-term memory (LSTM) network with deformable attention as the main layer for caption generation. Experimental results on the ActivityNet Captions dataset show that the proposed method achieves competitive results, outperforming other methods in this area to a certain degree.
Keywords: dense video caption; video captioning; multimodal feature fusion; feature extraction; neural network
JEL-codes: C
Date: 2023
Downloads:
https://www.mdpi.com/2227-7390/11/17/3685/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/17/3685/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:17:p:3685-:d:1226113
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.