Deep Learning in Multimodal Fusion for Sustainable Plant Care: A Comprehensive Review

Zhi-Xiang Yang, Yusi Li, Rui-Feng Wang, Pingfan Hu and Wen-Hao Su
Additional contact information
Zhi-Xiang Yang: China Agricultural University, Qinghua East Road No. 17, Haidian, Beijing 100083, China
Yusi Li: China Agricultural University, Qinghua East Road No. 17, Haidian, Beijing 100083, China
Rui-Feng Wang: China Agricultural University, Qinghua East Road No. 17, Haidian, Beijing 100083, China
Pingfan Hu: Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, TX 77843-3122, USA
Wen-Hao Su: China Agricultural University, Qinghua East Road No. 17, Haidian, Beijing 100083, China

Sustainability, 2025, vol. 17, issue 12, 1-33

Abstract: With the advancement of Agriculture 4.0 and the ongoing transition toward sustainable and intelligent agricultural systems, deep learning-based multimodal fusion technologies have emerged as a driving force for crop monitoring, plant management, and resource conservation. This article systematically reviews research progress from three perspectives: technical frameworks, application scenarios, and sustainability-driven challenges. At the technical framework level, it outlines an integrated system encompassing data acquisition, feature fusion, and decision optimization, thereby covering the full pipeline of perception, analysis, and decision making essential for sustainable practices. Regarding application scenarios, it focuses on three major tasks—disease diagnosis, maturity and yield prediction, and weed identification—evaluating how deep learning-driven multisource data integration enhances precision and efficiency in sustainable farming operations. It further discusses the efficient translation of detection outcomes into eco-friendly field practices through agricultural navigation systems, harvesting and plant protection robots, and intelligent resource management strategies based on feedback-driven monitoring. In addressing challenges and future directions, the article highlights key bottlenecks such as data heterogeneity, real-time processing limitations, and insufficient model generalization, and proposes potential solutions including cross-modal generative models and federated learning to support more resilient, sustainable agricultural systems. This work offers a comprehensive three-dimensional analysis across technology, application, and sustainability challenges, providing theoretical insights and practical guidance for the intelligent and sustainable transformation of modern agriculture through multimodal fusion.
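As a reading aid only, the sketch below is not taken from the reviewed article; it illustrates the kind of feature-level multimodal fusion the abstract describes, in which an image branch and a field-sensor branch are encoded separately, concatenated, and passed to a shared decision head. The disease-classification framing, layer sizes, and modality choices are assumptions made purely for illustration (PyTorch).

# Illustrative sketch only (not from the reviewed article): a minimal
# feature-level multimodal fusion network combining an RGB image branch
# with a low-dimensional sensor/spectral vector for a hypothetical
# plant-disease classification task. All sizes are assumed.
import torch
import torch.nn as nn


class MultimodalFusionNet(nn.Module):
    def __init__(self, num_classes: int = 4, sensor_dim: int = 8):
        super().__init__()
        # Image branch: small CNN encoder producing a 64-d feature vector.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 64), nn.ReLU(),
        )
        # Sensor branch: MLP encoder for e.g. temperature/humidity/NDVI readings.
        self.sensor_encoder = nn.Sequential(
            nn.Linear(sensor_dim, 32), nn.ReLU(),
            nn.Linear(32, 64), nn.ReLU(),
        )
        # Fusion head: concatenate modality features, then classify.
        self.classifier = nn.Sequential(
            nn.Linear(64 + 64, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, image: torch.Tensor, sensors: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.image_encoder(image), self.sensor_encoder(sensors)], dim=1)
        return self.classifier(fused)


if __name__ == "__main__":
    model = MultimodalFusionNet()
    logits = model(torch.randn(2, 3, 128, 128), torch.randn(2, 8))
    print(logits.shape)  # torch.Size([2, 4])

The review also discusses data-level and decision-level fusion; this sketch shows only the feature-level (intermediate) variant.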

Keywords: multimodal fusion; crop monitoring and management; agricultural machinery; deep learning; sustainable agriculture
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2071-1050/17/12/5255/pdf (application/pdf)
https://www.mdpi.com/2071-1050/17/12/5255/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.


Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:17:y:2025:i:12:p:5255-:d:1673581


Sustainability is currently edited by Ms. Alexandra Wu

More articles in Sustainability from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Handle: RePEc:gam:jsusta:v:17:y:2025:i:12:p:5255-:d:1673581