A Novel Two-Stage Approach for Automatic Extraction and Multi-View Generation of Litchis
Yuanhong Li,
Jing Wang,
Ming Liang,
Haoyu Song,
Jianhong Liao and
Yubin Lan
Additional contact information
Yuanhong Li: College of Electronic Engineering (College of Artificial Intelligence), South China Agricultural University, Guangzhou 510642, China
Jing Wang: College of Electronic Engineering (College of Artificial Intelligence), South China Agricultural University, Guangzhou 510642, China
Ming Liang: College of Electronic Engineering (College of Artificial Intelligence), South China Agricultural University, Guangzhou 510642, China
Haoyu Song: College of Electronic Engineering (College of Artificial Intelligence), South China Agricultural University, Guangzhou 510642, China
Jianhong Liao: College of Electronic Engineering (College of Artificial Intelligence), South China Agricultural University, Guangzhou 510642, China
Yubin Lan: College of Electronic Engineering (College of Artificial Intelligence), South China Agricultural University, Guangzhou 510642, China
Agriculture, 2024, vol. 14, issue 7, 1-23
Abstract:
Obtaining consistent multi-view images of litchis is crucial for various litchi-related studies, such as data augmentation and 3D reconstruction. This paper proposes a two-stage model that integrates the Mask2Former semantic segmentation network with the Wonder3D multi-view generation network. This integration aims to accurately segment and extract litchis from complex backgrounds and to generate consistent multi-view images of previously unseen litchis. In the first stage, the Mask2Former model is used to predict litchi masks, enabling the extraction of litchis from complex backgrounds. To further enhance the accuracy of litchi branch extraction, we propose a novel method that combines the predicted masks with morphological operations and the HSV color space. This approach ensures accurate extraction of litchi branches even when the semantic segmentation model’s predictions are imprecise. In the second stage, the segmented and extracted litchi images are passed as input to the Wonder3D network to generate multi-view images of the litchis. A comparison of different semantic segmentation and multi-view synthesis networks showed that Mask2Former and Wonder3D performed best. The Mask2Former network achieved a mean Intersection over Union (mIoU) of 79.79% and a mean pixel accuracy (mPA) of 85.82%. The Wonder3D network achieved a peak signal-to-noise ratio (PSNR) of 18.89 dB, a structural similarity index (SSIM) of 0.8199, and a learned perceptual image patch similarity (LPIPS) of 0.114. Combining the Mask2Former model with the Wonder3D network increased PSNR and SSIM by 0.21 dB and 0.0121, respectively, and decreased LPIPS by 0.064 compared with using the Wonder3D model alone. The proposed two-stage model therefore achieves automatic extraction and multi-view generation of litchis with high accuracy.
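The abstract does not spell out how the predicted masks, morphological operations, and HSV color space are combined in the first stage. The snippet below is a minimal sketch of that kind of mask-refinement step, assuming OpenCV and NumPy; the function name and the HSV bounds for branch pixels are illustrative placeholders, not values or code from the paper.

```python
# Minimal sketch (not the authors' code): refine a predicted litchi-branch mask
# by combining it with morphological operations and an HSV colour gate.
# Assumes OpenCV + NumPy; the HSV thresholds below are hypothetical and would need tuning.
import cv2
import numpy as np

def refine_branch_mask(rgb_image: np.ndarray, predicted_mask: np.ndarray) -> np.ndarray:
    """rgb_image: HxWx3 uint8 RGB; predicted_mask: HxW uint8 (0 = background, 255 = branch)."""
    # 1. Morphological closing then opening to fill small holes and remove speckles
    #    left by an imprecise semantic-segmentation prediction.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    cleaned = cv2.morphologyEx(predicted_mask, cv2.MORPH_CLOSE, kernel)
    cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_OPEN, kernel)

    # 2. HSV colour gate: keep only pixels whose colour is plausible for a branch.
    hsv = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2HSV)
    lower = np.array([5, 40, 40], dtype=np.uint8)     # placeholder lower HSV bound
    upper = np.array([30, 255, 220], dtype=np.uint8)  # placeholder upper HSV bound
    color_gate = cv2.inRange(hsv, lower, upper)

    # 3. Intersect the cleaned prediction with the colour gate, then close once more
    #    so thin branch segments stay connected.
    refined = cv2.bitwise_and(cleaned, color_gate)
    refined = cv2.morphologyEx(refined, cv2.MORPH_CLOSE, kernel)
    return refined

# Example use: cut the litchi foreground out of the original image before
# passing it to the multi-view generation stage.
# foreground = cv2.bitwise_and(image, image, mask=refine_branch_mask(image, mask))
```

Intersecting the cleaned prediction with a colour gate is one way a colour prior can compensate for segmentation errors, which matches the abstract's claim that branch extraction remains accurate even when the network's predictions are imprecise.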
Keywords: litchi; litchi branches; semantic segmentation; multi-view generation; two-stage model
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2024
References: View complete reference list from CitEc
Downloads: (external link)
https://www.mdpi.com/2077-0472/14/7/1046/pdf (application/pdf)
https://www.mdpi.com/2077-0472/14/7/1046/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:14:y:2024:i:7:p:1046-:d:1425701