
Empirically calibrated simulations reveal the limits of phenotypic clustering algorithms for biodiversity assessment in data-scarce crops

Abdel Kader Naino Jika

PLOS ONE, 2025, vol. 20, issue 12, 1-14

Abstract: Clustering algorithms are widely used for phenotypic characterization and germplasm management, particularly in data-scarce crops such as neglected and underutilized species (NUS) that lack genomic resources. However, their performance under biologically realistic conditions remains poorly understood. Standard clustering methods commonly applied in crop research often assume distinct, isotropic, and homogeneous clusters, assumptions rarely satisfied in real-world phenotypic datasets. We developed a flexible and empirically calibrated simulation framework, using phenotypic data from West African fonio (Digitaria exilis), to benchmark the performance of eleven clustering algorithms under both idealized and realistic scenarios. Our simulations integrated heterogeneous trait distributions (normal, gamma), strong inter-trait correlations (up to r = –0.84), heteroscedasticity, and moderate population structure (mean Pst = 0.16 ± 0.001, achieved through iterative calibration). Each scenario was replicated 100 times, with clustering accuracy evaluated using external (ARI, NMI) and internal (Silhouette, Davies–Bouldin) validation metrics under standardized conditions. The results revealed consistently poor algorithm performance under realistic conditions (e.g., ARI
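
The abstract names the Adjusted Rand Index (ARI) as one of its external validation metrics. As an illustration only, the following is a minimal sketch of the standard ARI formula in plain Python. It is not the authors' code, and the function name `adjusted_rand_index` is a hypothetical helper introduced here. The sketch assumes that the true group labels and the predicted cluster labels are given as two equal-length sequences.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard (Hubert-Arabie) ARI between two partitions of the same items.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: the partitions trivially agree

    # Contingency counts: items sharing a (true, predicted) label pair.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of item pairs placed together in both, in true, and in predicted.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)       # upper bound on agreement

    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both partitions
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    print(adjusted_rand_index(truth, [1, 1, 0, 0]))  # same partition, relabeled
    print(adjusted_rand_index(truth, [0, 1, 0, 1]))  # every true group split
```

ARI equals 1 when two partitions are identical up to relabeling, is close to 0 for chance-level agreement, and can be negative when agreement is worse than chance.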

Date: 2025

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0329254 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 29254&type=printable (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0329254

DOI: 10.1371/journal.pone.0329254

More articles in PLOS ONE from Public Library of Science
Page updated 2025-12-21
Handle: RePEc:plo:pone00:0329254