Data Science in Strategy: Machine learning and text analysis in the study of firm growth

Daan Kolkman and Arjen van Witteloostuijn
Additional contact information
Daan Kolkman: Technical University Eindhoven
Arjen van Witteloostuijn: Vrije Universiteit Amsterdam

No 19-066/VI, Tinbergen Institute Discussion Papers from Tinbergen Institute

Abstract: This study examines the applicability of modern Data Science techniques in the domain of Strategy. We apply novel techniques from the fields of machine learning and text analysis, proceeding in two steps. First, we compare different machine learning techniques with traditional regression methods in terms of their goodness-of-fit, using a dataset of 168,055 firms that includes only basic demographic and financial information. The novel methods fare three to four times better, with the random forest technique achieving the best goodness-of-fit. Second, based on 8,163 informative websites of Dutch SMEs, we construct four additional proxies for personality and strategy variables. Including our four text-analyzed variables adds about 2.5 per cent to the R². Together, our pair of contributions provides evidence for the large potential of applying modern Data Science techniques in Strategy research. We reflect on the potential contribution of modern Data Science techniques from the perspective of the common critique that machine learning offers increased predictive accuracy at the expense of explanatory insight. In particular, we argue and illustrate why and how machine learning can be a productive element in the abductive theory-building cycle.

JEL-codes: L1
Date: 2019-09-20
New Economics Papers: this item is included in nep-big, nep-cmp, nep-ent and nep-sbm
Citations: 2 (in EconPapers)

Downloads: (external link)
https://papers.tinbergen.nl/19066.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:tin:wpaper:20190066

Bibliographic data for series maintained by Tinbergen Office, +31 (0)10-4088900.

 
Page updated 2025-04-01
Handle: RePEc:tin:wpaper:20190066