Stream-Based Extreme Learning Machine Approach for Big Data Problems
Euler Guimarães Horta,
Cristiano Leite de Castro and
Antônio Pádua Braga
Mathematical Problems in Engineering, 2015, vol. 2015, 1-17
Abstract:
Big Data problems demand data models with abilities to handle time-varying, massive, and high dimensional data. In this context, Active Learning emerges as an attractive technique for the development of high performance models using few data. The importance of Active Learning for Big Data becomes more evident when labeling cost is high and data is presented to the learner via data streams. This paper presents a novel Active Learning method based on Extreme Learning Machines (ELMs) and Hebbian Learning. Linearization of input data by a large size ELM hidden layer turns our method little sensitive to parameter setting. Overfitting is inherently controlled via the Hebbian Learning crosstalk term. We also demonstrate that a simple convergence test can be used as an effective labeling criterion since it points out to the amount of labels necessary for learning. The proposed method has inherent properties that make it highly attractive to handle Big Data: incremental learning via data streams, elimination of redundant patterns, and learning from a reduced informative training set. Experimental results have shown that our method is competitive with some large-margin Active Learning strategies and also with a linear SVM.
Date: 2015
References: Add references at CitEc
Citations:
Downloads: (external link)
http://downloads.hindawi.com/journals/MPE/2015/126452.pdf (application/pdf)
http://downloads.hindawi.com/journals/MPE/2015/126452.xml (text/xml)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:hin:jnlmpe:126452
DOI: 10.1155/2015/126452
Access Statistics for this article
More articles in Mathematical Problems in Engineering from Hindawi
Bibliographic data for series maintained by Mohamed Abdelhakeem ().