Performance Analysis of Statistical and Supervised Learning Techniques in Stock Data Mining

Sharma, Manik; Sharma, Samriti; Singh, Gurvinder

Manik Sharma, Samriti Sharma and Gurvinder Singh
Additional contact information
Manik Sharma: Department of Computer Science and Applications, DAV University, Jalandhar 144401, India
Samriti Sharma: Department of Computer Science, Guru Nanak Dev University, Amritsar 143001, India
Gurvinder Singh: Department of Computer Science, Guru Nanak Dev University, Amritsar 143001, India

Data, 2018, vol. 3, issue 4, 1-16

Abstract: Nowadays, an overwhelming amount of stock data is available, which is only of use if it is properly examined and mined. In this paper, the last twelve years of ICICI Bank’s stock data have been extensively examined using statistical and supervised learning techniques. This study may be of great interest to those who wish to mine or study the stock data of banks or any other financial organization. Different statistical measures have been computed to explore the nature, range, distribution, and deviation of the data. These descriptive measures include the mean, variance, skewness, kurtosis, p-value, A-squared, and 95% confidence interval for the mean of ICICI Bank’s stock data. Moreover, the daily percentage changes occurring over the last 12 years have been recorded and examined. Additionally, the intraday stock status has been mined using ten different classifiers. The performance of the classifiers has been evaluated on the basis of various parameters such as accuracy, misclassification rate, precision, recall, specificity, and sensitivity. Based upon these parameters, the predictive results obtained using logistic regression are more acceptable than the outcomes of the other classifiers, whereas naïve Bayes, C4.5, random forest, linear discriminant, and cubic support vector machine (SVM) merely act as random guessing machines. The outstanding performance of logistic regression has been validated using TOPSIS (technique for order preference by similarity to ideal solution) and WSA (weighted sum approach).
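
The evaluation criteria and the TOPSIS/WSA ranking named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the confusion-matrix counts, the classifier labels "A", "B", and "C", the equal criterion weights, and the min-max normalization used for the weighted sum are all assumptions made here for illustration and do not come from the paper.

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    """Derive the evaluation parameters from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "misclassification_rate": 1 - accuracy,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # recall is the same quantity as sensitivity
        "specificity": tn / (tn + fp),
    }

def topsis(matrix, weights, benefit):
    """Closeness of each row (alternative) to the ideal solution, in [0, 1]."""
    n = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    cols = list(zip(*v))
    # Ideal = best value per criterion; anti-ideal = worst value per criterion.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos, d_neg = math.dist(row, ideal), math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores

def weighted_sum(matrix, weights, benefit):
    """Weighted sum of min-max normalized criteria (cost criteria inverted)."""
    cols = list(zip(*matrix))
    scores = []
    for row in matrix:
        total = 0.0
        for j, x in enumerate(row):
            lo, hi = min(cols[j]), max(cols[j])
            norm = (x - lo) / (hi - lo) if hi > lo else 1.0
            total += weights[j] * (norm if benefit[j] else 1 - norm)
        scores.append(total)
    return scores

# Hypothetical counts (tp, fp, tn, fn) for three unnamed classifiers.
counts = {"A": (60, 20, 80, 40), "B": (50, 50, 50, 50), "C": (55, 35, 65, 45)}
criteria = ["accuracy", "misclassification_rate", "precision", "recall", "specificity"]
benefit = [True, False, True, True, True]   # misclassification rate is a cost
weights = [0.2] * len(criteria)             # equal weights: an assumption

metrics = {name: confusion_metrics(*c) for name, c in counts.items()}
names = list(metrics)
matrix = [[metrics[n][c] for c in criteria] for n in names]

topsis_scores = dict(zip(names, topsis(matrix, weights, benefit)))
wsa_scores = dict(zip(names, weighted_sum(matrix, weights, benefit)))
best_topsis = max(topsis_scores, key=topsis_scores.get)
best_wsa = max(wsa_scores, key=wsa_scores.get)
print(best_topsis, best_wsa)
```

Both methods reduce several criteria to one score per classifier, so an agreement between the two rankings is a simple consistency check on which classifier performs best overall. The sketch assumes no criterion column is all zeros and that the alternatives are not all identical; either case would need a guard against division by zero.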

Keywords: stock forecasting; naïve Bayes; C4.5; random forest; logistic regression; support vector machine
JEL-codes: C8 C80 C81 C82 C83
Date: 2018

Downloads: (external link)
https://www.mdpi.com/2306-5729/3/4/54/pdf (application/pdf)
https://www.mdpi.com/2306-5729/3/4/54/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:3:y:2018:i:4:p:54-:d:185259

Data is currently edited by Ms. Becky Zhang

Bibliographic data for series maintained by MDPI Indexing Manager.