OM-VST: A video action recognition model based on optimized downsampling module combined with multi-scale feature fusion

Xiaozhong Geng, Cheng Chen, Ping Yu, Baijin Liu, Weixin Hu, Qipeng Liang and Xintong Zhang

PLOS ONE, 2025, vol. 20, issue 3, 1-20

Abstract: Video classification, an essential task in computer vision, aims to automatically identify and label video content. However, current mainstream video classification models face two significant challenges in practice. First, classification accuracy is limited, mainly because of the complexity and diversity of video data, including subtle differences between categories, background interference, and illumination variations. Second, the models have too many trainable parameters, which lengthens training time and increases energy consumption. To address these problems, we propose the OM-Video Swin Transformer (OM-VST) model. Built on the Video Swin Transformer (VST), it adds a multi-scale feature fusion module and an optimized downsampling module to improve the model’s ability to perceive and characterize feature information. To verify its performance, we compared OM-VST with mainstream video classification models, such as VST, SlowFast, and TSM, on a public dataset. The results show that OM-VST improves accuracy by 2.81% while reducing the number of parameters by 54.7%.

Date: 2025

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0318884 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 18884&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0318884

DOI: 10.1371/journal.pone.0318884

More articles in PLOS ONE from Public Library of Science
Page updated 2025-05-10
Handle: RePEc:plo:pone00:0318884