The clustering-based case-based reasoning for imbalanced business failure prediction: a hybrid approach through integrating unsupervised process with supervised process
Hui Li,
Jun-Ling Yu,
Le-An Yu and
Jie Sun
Authors registered in the RePEc Author Service: Lean Yu
International Journal of Systems Science, 2014, vol. 45, issue 5, 1225-1241
Abstract:
Case-based reasoning (CBR) is one of the main methods in business forecasting; it predicts well and can explain its results. In business failure prediction (BFP), failed enterprises are far fewer than non-failed ones, yet the loss when an enterprise fails is huge. It is therefore necessary to develop methods, trained on imbalanced samples, that forecast well for this small proportion of failed enterprises while also achieving high total accuracy. Commonly used methods, built on the assumption of balanced samples, do not predict the minority well on imbalanced samples consisting of minority (failed) and majority (non-failed) enterprises. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to make the retrieval of information from both the minority and the majority in CBR more efficient. In CBCBR, case classes are first generated by hierarchical clustering of the stored experienced cases, and each class centre is calculated by integrating the information of the cases in the same clustered class. When predicting the label of a target case, its nearest clustered case class is first retrieved by ranking the similarities between the target case and each clustered case class centre. Then the nearest neighbours of the target case within that case class are retrieved. Finally, the labels of these nearest experienced cases are used in the prediction. In an empirical experiment with two imbalanced samples from China, the performance of CBCBR was compared with that of classical CBR, a support vector machine, logistic regression and multivariate discriminant analysis.
The results show that, compared with the other four methods, CBCBR performed significantly better in terms of sensitivity in identifying the minority samples while also achieving high total accuracy. The proposed approach makes CBR useful in imbalanced forecasting.
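The two-stage retrieval described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the linkage rule, the similarity measure, the number of clusters, the number of neighbours or how the retrieved labels are combined. Centroid linkage, Euclidean distance as an inverse proxy for similarity, the mean of the member cases as the class centre, majority voting, the function names and the toy data are all assumptions made here.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Assumed distance; a smaller distance stands in for a higher similarity.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    # Assumed class centre: feature-wise mean of the cases in a class.
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def hierarchical_clusters(cases, n_clusters):
    # Agglomerative clustering with centroid linkage (an assumption).
    # Returns a list of clusters, each a list of case indices.
    clusters = [[i] for i in range(len(cases))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            ci = centroid([cases[m] for m in clusters[i]])
            for j in range(i + 1, len(clusters)):
                cj = centroid([cases[m] for m in clusters[j]])
                d = euclidean(ci, cj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


def cbcbr_predict(cases, labels, target, n_clusters=2, k=3):
    # Step 1: cluster the stored cases and compute each class centre.
    clusters = hierarchical_clusters(cases, n_clusters)
    centres = [centroid([cases[i] for i in c]) for c in clusters]
    # Step 2: retrieve the case class whose centre is nearest to the target.
    nearest = min(range(len(clusters)),
                  key=lambda c: euclidean(target, centres[c]))
    # Step 3: retrieve the k nearest neighbours inside that class only.
    members = sorted(clusters[nearest],
                     key=lambda i: euclidean(target, cases[i]))
    # Step 4: combine neighbour labels (majority vote is an assumption).
    votes = Counter(labels[i] for i in members[:k])
    return votes.most_common(1)[0][0]


# Made-up toy sample: label 1 = failed (minority), 0 = non-failed (majority).
toy_cases = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
             [0.1, 0.1], [2.0, 2.0], [2.1, 1.9], [1.9, 2.2]]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 0]

prediction_far = cbcbr_predict(toy_cases, toy_labels, [2.05, 1.95])
prediction_near = cbcbr_predict(toy_cases, toy_labels, [0.15, 0.15])
```

Because neighbours are drawn only from the retrieved case class, the two minority cases in the toy sample are not outvoted by the six majority cases located elsewhere in the feature space. That restriction is the intuition the sketch is meant to convey.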
Date: 2014
Citations: 3 (in EconPapers)
Downloads: (external link)
http://hdl.handle.net/10.1080/00207721.2012.748105 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:tsysxx:v:45:y:2014:i:5:p:1225-1241
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/TSYS20
DOI: 10.1080/00207721.2012.748105
International Journal of Systems Science is currently edited by Visakan Kadirkamanathan
Bibliographic data for series maintained by Chris Longhurst.