Multi-step depth enhancement refine network with multi-view stereo

Ding, Yuxuan; Li, Kefeng; Zhang, Guangyuan; Zhu, Zhenfang; Wang, Peng; Wang, Zhenfei; Fu, Chen; Li, Guangchen; Pan, Ke

Multi-step depth enhancement refine network with multi-view stereo

Yuxuan Ding, Kefeng Li, Guangyuan Zhang, Zhenfang Zhu, Peng Wang, Zhenfei Wang, Chen Fu, Guangchen Li and Ke Pan

PLOS ONE, 2025, vol. 20, issue 2, 1-17

Abstract: This paper introduces an innovative multi-view stereo matching network—the Multi-Step Depth Enhancement Refine Network (MSDER-MVS), aimed at improving the accuracy and computational efficiency of high-resolution 3D reconstruction. The MSDER-MVS network leverages the potent capabilities of modern deep learning in conjunction with the geometric intuition of traditional 3D reconstruction techniques, with a particular focus on optimizing the quality of the depth map and the efficiency of the reconstruction process.Our key innovations include a dual-branch fusion structure and a Feature Pyramid Network (FPN) to effectively extract and integrate multi-scale features. With this approach, we construct depth maps progressively from coarse to fine, continuously improving depth prediction accuracy at each refinement stage. For cost volume construction, we employ a variance-based metric to integrate information from multiple perspectives, optimizing the consistency of the estimates. Moreover, we introduce a differentiable depth optimization process that iteratively enhances the quality of depth estimation using residuals and the Jacobian matrix, without the need for additional learnable parameters. This innovation significantly increases the network’s convergence rate and the fineness of depth prediction.Extensive experiments on the standard DTU dataset (Aanas H, 2016) show that MSDER-MVS surpasses current advanced methods in accuracy, completeness, and overall performance metrics. Particularly in scenarios rich in detail, our method more precisely recovers surface details and textures, demonstrating its effectiveness and superiority for practical applications.Overall, the MSDER-MVS network offers a robust solution for precise and efficient 3D scene reconstruction. Looking forward, we aim to extend this approach to more complex environments and larger-scale datasets, further enhancing the model’s generalization and real-time processing capabilities, and promoting the widespread deployment of multi-view stereo matching technology in practical applications.

Date: 2025
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0314418 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 14418&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0314418

DOI: 10.1371/journal.pone.0314418

Access Statistics for this article

More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone ().