DAS-Net: A Dual-Attention Synergistic Network with Triple-Spatial and Multi-Scale Temporal Modeling for Dairy Cow Feeding Behavior Detection
Xuwen Li,
Ronghua Gao,
Qifeng Li,
Rong Wang,
Luyu Ding,
Pengfei Ma,
Xiaohan Yang and
Xinxin Ding
Additional contact information
Xuwen Li: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Ronghua Gao: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Qifeng Li: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Rong Wang: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Luyu Ding: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Pengfei Ma: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Xiaohan Yang: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Xinxin Ding: Research Center of Information Technology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Agriculture, 2025, vol. 15, issue 17, 1-24
Abstract:
The feeding behavior of dairy cows is a complex temporal sequence of actions such as head lowering, sniffing, arching, eating, head raising, and chewing, and recognizing it precisely is crucial for refined livestock management. Existing 2D convolution-based models extract features effectively from individual frames but lack temporal modeling capabilities. Conversely, 3D convolutional networks, owing to their high computational complexity, achieve only limited recognition accuracy in high-density feeding scenarios. To address this, this paper proposes DAS-Net, a spatio-temporal fusion network with a collaborative dual-branch architecture: a 2D branch uses a triple-attention module to enhance the extraction of key spatial features, while a 3D branch built on multi-branch dilated convolution integrates a 3D multi-scale attention mechanism to achieve efficient long-term temporal modeling. On our Spatio-Temporal Dairy Feeding Dataset (STDF Dataset), which contains 403 video clips and 10,478 annotated frames across seven behavior categories, the model achieves an average recognition accuracy of 56.83% over all action types, an improvement of 3.61 percentage points over the original model. The recognition accuracy of the eating action rises to 94.78%. The method offers a new approach to recognizing dairy cow feeding behavior and can provide technical support for developing intelligent feeding systems in real dairy farms.
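The idea behind multi-branch dilated convolution for temporal modeling can be illustrated with a minimal sketch. This is not the authors' implementation: it works on a 1D sequence of per-frame scalar features rather than video tensors, omits the attention modules, and the kernel weights, the dilation rates (1, 2, 4), and the averaging fusion are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Zero-padded ('same') 1D convolution along time with the given dilation."""
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            src = t + (i - center) * dilation  # frame index this tap reads
            if 0 <= src < len(x):
                acc += w * x[src]
        out.append(acc)
    return out


def multi_branch_temporal(
    x: Sequence[float],
    kernel: Sequence[float] = (1.0, 1.0, 1.0),   # hypothetical weights
    dilations: Sequence[int] = (1, 2, 4),        # hypothetical dilation rates
) -> List[float]:
    """Run one dilated branch per rate and average the branch outputs per frame."""
    branches = [dilated_conv1d(x, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of frames spanned by a single dilated kernel."""
    return dilation * (kernel_size - 1) + 1


# An impulse at frame 8 of a 17-frame clip shows which frames the branches reach.
clip = [0.0] * 17
clip[8] = 1.0
fused = multi_branch_temporal(clip)
touched = [t for t, v in enumerate(fused) if v != 0.0]
```

With a kernel of size 3, the branch with dilation 4 spans 9 frames while using the same number of weights as the dilation-1 branch, which is why stacking branches with different rates is a common way to widen temporal context cheaply.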
Keywords: dairy cow feeding behavior; spatio-temporal action detection; temporal modeling; dual-branch network
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2025
Downloads:
https://www.mdpi.com/2077-0472/15/17/1903/pdf (application/pdf)
https://www.mdpi.com/2077-0472/15/17/1903/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:15:y:2025:i:17:p:1903-:d:1744633
Agriculture is currently edited by Ms. Leda Xuan
Bibliographic data for series maintained by MDPI Indexing Manager.