Institutional sector classifier, a machine learning approach

Paolo Massaro, Ilaria Vannini and Oliver Giudice
Additional contact information
Paolo Massaro: Bank of Italy
Ilaria Vannini: Bank of Italy
Oliver Giudice: Bank of Italy

No 548, Questioni di Economia e Finanza (Occasional Papers) from Bank of Italy, Economic Research and International Relations Area

Abstract: We implement machine learning techniques to obtain an automatic classification by sector of economic activity of the Italian companies recorded in the Bank of Italy Entities Register. To this end, we first extract a sample of correctly classified corporations from the universe of Italian companies. Second, we select a set of features that are related to the sector of economic activity code and use them in supervised models to predict that code. We choose a multi-step approach based on the hierarchical structure of the sector classification. Because of the imbalance in the target classes, at each step, we first apply two resampling procedures – random oversampling and the Synthetic Minority Over-sampling Technique – to get a more balanced training set. Then, we fit Gradient Boosting and Support Vector Machine models. Overall, our multi-step classifier yields very reliable predictions of the sector code. This approach can be employed to make the whole classification process more efficient by reducing the area of manual intervention.
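
The multi-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, covers random oversampling but not SMOTE, and replaces the Gradient Boosting and Support Vector Machine models with a simple nearest-centroid stand-in. All function names, the two-level label layout and the toy data (labels "A", "B", "A1", "A2", "B1") are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter, defaultdict


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


class NearestCentroid:
    """Stand-in model (an assumption, NOT Gradient Boosting or SVM)."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, row):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[c]))
        return min(self.centroids, key=sq_dist)


def fit_multistep(X, y):
    """y holds (top_level, sub_level) pairs. Step 1 fits a model on the top
    level; step 2 fits one model per top-level class on its sub-levels.
    The training set is rebalanced before every fit."""
    top_labels = [t for t, _ in y]
    top = NearestCentroid().fit(*random_oversample(X, top_labels))
    subs = {}
    for t in set(top_labels):
        Xs = [x for x, (tt, _) in zip(X, y) if tt == t]
        ys = [s for tt, s in y if tt == t]
        subs[t] = NearestCentroid().fit(*random_oversample(Xs, ys))
    return top, subs


def predict_multistep(model, row):
    """Predict the top level first, then the sub-level within it."""
    top, subs = model
    t = top.predict_one(row)
    return t, subs[t].predict_one(row)


# Toy, imbalanced data with hypothetical two-level labels.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
     [1.0, 1.0], [5.0, 5.0], [5.2, 4.9]]
y = [("A", "A1"), ("A", "A1"), ("A", "A1"), ("A", "A1"),
     ("A", "A2"), ("B", "B1"), ("B", "B1")]

model = fit_multistep(X, y)
print(predict_multistep(model, [0.9, 1.1]))
print(predict_multistep(model, [5.1, 5.0]))
```

Duplicating rows barely moves a class centroid, so rebalancing has little effect on this stand-in; the step is shown only to mark where it sits in the pipeline, before each fit at each level of the hierarchy.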

Keywords: machine learning; entities register; classification by institutional sector
JEL-codes: C18 C81 G21
Date: 2020-03
New Economics Papers: this item is included in nep-big and nep-cmp

Page updated 2020-12-05
Handle: RePEc:bdi:opques:qef_548_20