Integrative Classification Using Structural Equation Modeling of Homeostasis

Hong-Bin Fang, Hengzhen Huang, Ao Yuan, Ruzong Fan and Ming T. Tan
Additional contact information
Hong-Bin Fang: Georgetown University Medical Center
Hengzhen Huang: Guangxi Normal University
Ao Yuan: Georgetown University Medical Center
Ruzong Fan: Georgetown University Medical Center
Ming T. Tan: Georgetown University Medical Center

Statistics in Biosciences, 2024, vol. 16, issue 3, No 10, 742-760

Abstract: We consider binary classification in the high-dimensional setting, where the number of features is huge, and the number of observations is limited. We focus on the setting where features in one group have certain correlation structures that are not present in the other group. This is particularly relevant in early detection of diseases where subjects develop from a normal or homeostatic state to a diseased condition. Linear discriminant analysis (with a link function) and classification based on regularized regression or machine learning have been used as methods for this problem and related variable selection. However, most methods do not account for the correlation structures of variables within groups. While the diseased group may demonstrate abundant diversity and no clear structure, achieving higher accuracy in classification requires considering the correlation structures in the control group with homeostasis. In this paper, we develop a structural equation modeling approach to characterize the correlation structures of homeostasis, and the parameters are estimated using only the data from one group. The structural equation models are not applicable to the data from the other group, and the classification specificity and sensitivity are determined by choosing the confidence intervals of the estimated parameters. We use a real multi-platform genomics dataset to illustrate the methods, and we demonstrate that our approach performs well compared to statistical learning methods such as regularized logistic regression models.
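
Illustrative note: the general idea in the abstract (learn the correlation structure from the control group only, then flag subjects whose features do not fit it) can be sketched in a few lines. This is not the authors' structural equation model. As a simplified stand-in it uses a squared Mahalanobis distance under the control-group covariance, and a quantile cutoff on control scores plays the role that the choice of confidence intervals plays in the paper. The data are simulated, and all function names, the ridge constant, and the 0.95 target are assumptions made for illustration.

```python
import numpy as np

def fit_control_model(X_control, ridge=1e-3):
    """Estimate mean and precision matrix from control subjects only."""
    mu = X_control.mean(axis=0)
    cov = np.cov(X_control, rowvar=False)
    cov = cov + ridge * np.eye(cov.shape[0])  # small ridge for stability (assumption)
    return mu, np.linalg.inv(cov)

def deviation_score(X, mu, precision):
    """Squared Mahalanobis distance of each row from the control model."""
    d = X - mu
    return np.einsum("ij,jk,ik->i", d, precision, d)

def fit_threshold(control_scores, target_specificity=0.95):
    """Cutoff below which the target fraction of control scores falls."""
    return np.quantile(control_scores, target_specificity)

# Simulated data (hypothetical): controls share a common factor, a crude
# stand-in for homeostasis; cases have the same marginal variance but no
# shared structure.
rng = np.random.default_rng(0)
n, p = 200, 5
shared = rng.normal(size=(n, 1))
controls = shared + 0.3 * rng.normal(size=(n, p))
cases = np.sqrt(1.09) * rng.normal(size=(n, p))

mu, precision = fit_control_model(controls)
control_scores = deviation_score(controls, mu, precision)
threshold = fit_threshold(control_scores)

# A subject is flagged as "not homeostatic" when its score exceeds the cutoff.
specificity = 1.0 - np.mean(control_scores > threshold)
sensitivity = np.mean(deviation_score(cases, mu, precision) > threshold)
print(f"specificity={specificity:.3f}, sensitivity={sensitivity:.3f}")
```

Raising the target specificity widens the accepted region for controls and lowers sensitivity, which mirrors the trade-off the abstract describes. The specificity here is computed in-sample; a held-out control set would give a more honest estimate.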

Keywords: Cancer diagnosis; Classification; Covariance structure; Integrative analysis; Genomic data; Structural equation model
Date: 2024

Downloads: (external link)
http://link.springer.com/10.1007/s12561-024-09418-9 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:stabio:v:16:y:2024:i:3:d:10.1007_s12561-024-09418-9

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/12561

DOI: 10.1007/s12561-024-09418-9

Statistics in Biosciences is currently edited by Hongyu Zhao and Xihong Lin

More articles in Statistics in Biosciences from Springer, International Chinese Statistical Association
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:stabio:v:16:y:2024:i:3:d:10.1007_s12561-024-09418-9