
On the Validity of Using Webpage Texts to Identify the Target Population of a Survey: An Application to Detect Online Platforms

Piet Daas, Wolter Hassink and Bart Klijs
Additional contact information
Piet Daas: Eindhoven University of Technology
Wolter Hassink: Utrecht University
Bart Klijs: Statistics Netherlands

No 15941, IZA Discussion Papers from Institute of Labor Economics (IZA)

Abstract: A statistical classification model was developed to identify online platform organizations from the texts on their websites. The model was then applied to all organizations with a website in the Dutch Business Register to identify (potential) platform organizations. The model's empirical outcomes were plausible in terms of the words and the bimodal distribution of fitted probabilities, but the results indicated that the number of platform organizations was overestimated. Next, the external validity of the outcomes was investigated through a survey conducted among the organizations that the classification model had identified as platform organizations. The survey responses confirmed a substantial number of type-I errors (false positives). They also revealed a positive association between the fitted probability from the text-based classification model and the organization's response to the survey question on being an online platform organization. The survey results indicated that the text-based classification model can be used to obtain a subpopulation of potential platform organizations from the entire population of businesses with a website.

Keywords: online platform organizations; external validation; type-I error; machine learning; web pages
JEL-codes: C81 C83 D20 D83 L20
Pages: 32
Date: 2023-02
New Economics Papers: this item is included in nep-big, nep-cmp and nep-pay

Published in: Journal of Official Statistics, 2024, 40 (1), 190-211

Downloads: (external link)
https://docs.iza.org/dp15941.pdf (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:iza:izadps:dp15941

Ordering information: This working paper can be ordered from
IZA, Margard Ody, P.O. Box 7240, D-53072 Bonn, Germany


Bibliographic data for series maintained by Holger Hinte.

 
Page updated 2025-03-30
Handle: RePEc:iza:izadps:dp15941