Topic modelling applied on innovation studies of Flemish companies

Annelien Crijns, Victor Vanhullebusch, Manon Reusens, Michael Reusens and Bart Baesens

Journal of Business Analytics, 2023, vol. 6, issue 4, 243-254

Abstract: Mapping innovation in companies for the purpose of official statistics is usually done through business surveys. However, this traditional approach has several drawbacks, such as a lack of responses, response bias, low frequency, and high costs. As an alternative, text-based models trained on web-scraped text from company websites have been developed to complement or replace traditional business surveys. This paper uses web scraping and text-based models to map business innovation in Flanders, with a focus on identifying different types of innovation through topic modelling. More specifically, the scraped web texts are used to identify innovative economic sectors or topics, and to classify firms into these topics, using Top2Vec and Lbl2Vec. We conclude that the two models can be successfully combined to discover topics (or sectors) and to classify companies into these topics, which yields an additional parameter for mapping innovation in different regions.
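
The classification step described in the abstract can be illustrated with a minimal sketch. This is not the paper's pipeline and it does not use the Top2Vec or Lbl2Vec libraries: it swaps in plain bag-of-words vectors and cosine similarity to show the general idea of assigning a company text to the topic whose keyword description it most resembles. All topic names, keywords, company names, and texts below are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical topics, each described by a handful of keywords.
topic_keywords = {
    "biotech": "biotech vaccine clinical laboratory",
    "software": "software cloud platform data",
}

# Hypothetical scraped website texts, one per company.
company_texts = {
    "company_a": "We build a cloud software platform for data analytics.",
    "company_b": "Our laboratory develops vaccine candidates for clinical trials.",
}


def classify(text, topics):
    """Return the best-matching topic and the score for every topic."""
    doc = bag_of_words(text)
    scores = {name: cosine(doc, bag_of_words(kw)) for name, kw in topics.items()}
    return max(scores, key=scores.get), scores


# Assign each company to its most similar topic.
assignments = {
    name: classify(text, topic_keywords)[0]
    for name, text in company_texts.items()
}
print(assignments)
```

In practice the count vectors would be replaced by learned document and word embeddings, and the topic keywords would come from a topic discovery step rather than being written by hand.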

Date: 2023

Downloads: (external link)
http://hdl.handle.net/10.1080/2573234X.2023.2186274 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:taf:tjbaxx:v:6:y:2023:i:4:p:243-254

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/tjba20

DOI: 10.1080/2573234X.2023.2186274

Journal of Business Analytics is currently edited by Dursun Delen

Bibliographic data for series maintained by Chris Longhurst.

 
Page updated 2025-03-20
Handle: RePEc:taf:tjbaxx:v:6:y:2023:i:4:p:243-254