Estimating Weights for Web-Scraped Data in Consumer Price Indices
Ayoubkhani Daniel and Thomas Heledd
Additional contact information
Ayoubkhani Daniel: Office for National Statistics, Government Buildings, Cardiff Road, Newport, NP10 8XG, UK.
Thomas Heledd: Office for National Statistics, Government Buildings, Cardiff Road, Newport, NP10 8XG, UK.
Journal of Official Statistics, 2022, vol. 38, issue 1, 5-21
Abstract:
In recent years, there has been much interest among national statistical agencies in using web-scraped data in consumer price indices, potentially supplementing or replacing manually collected price quotes. Yet one challenge that has received very little attention to date is the estimation of expenditure weights in the absence of quantity information, which would enable the construction of weighted item-level price indices. In this article we propose the novel approach of predicting sales quantities from their ranks (for example, when products are sorted ‘by popularity’ on consumer websites) via appropriate statistical distributions. Using historical transactional data supplied by a UK retailer for two consumer items, we assessed the out-of-sample accuracy of the Pareto, log-normal and truncated log-normal distributions, finding that the last of these resulted in an index series that most closely approximated an expenditure-weighted benchmark. Our results demonstrate the value of supplementing web-scraped price quotes with a simple set of retailer-supplied summary statistics relating to quantities, allowing statistical agencies to realise the benefits of freely available internet data whilst placing minimal burden on retailers. However, further research would need to be undertaken before the approach could be implemented in the compilation of official price indices.
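The rank-to-quantity idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: that the retailer supplies only the mean and standard deviation of sales quantities, that a (non-truncated) log-normal distribution is fitted by moment matching, and that a product's popularity rank is mapped to a quantile with a mid-point plotting position. The function names, prices and summary statistics are hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_params(mean, sd):
    """Moment-match a log-normal to a given mean and standard deviation."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def quantities_from_ranks(n_products, mean, sd):
    """Predict a sales quantity for each popularity rank (1 = best seller).

    Assumption: rank r out of n corresponds to the quantile at
    probability 1 - (r - 0.5) / n of the fitted log-normal.
    """
    mu, sigma = lognormal_params(mean, sd)
    std_normal = NormalDist()
    quantities = []
    for rank in range(1, n_products + 1):
        p = 1.0 - (rank - 0.5) / n_products
        quantities.append(math.exp(mu + sigma * std_normal.inv_cdf(p)))
    return quantities


def expenditure_weights(prices, quantities):
    """Turn prices and predicted quantities into expenditure shares."""
    spend = [p * q for p, q in zip(prices, quantities)]
    total = sum(spend)
    return [s / total for s in spend]


# Hypothetical scraped prices, listed in 'sort by popularity' order.
prices = [2.50, 3.00, 1.80, 4.20, 2.10]
# Hypothetical retailer-supplied summary statistics for quantities sold.
quantities = quantities_from_ranks(len(prices), mean=100.0, sd=150.0)
weights = expenditure_weights(prices, quantities)
```

The resulting weights sum to one and could feed a weighted item-level index. Swapping in a Pareto or truncated log-normal distribution would only change the quantile step inside quantities_from_ranks.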
Keywords: Index numbers; price index; alternative data sources; web scraping; expenditure weights
Date: 2022
Downloads: https://doi.org/10.2478/jos-2022-0002 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:vrs:offsta:v:38:y:2022:i:1:p:5-21:n:13
DOI: 10.2478/jos-2022-0002
Journal of Official Statistics is currently edited by Annica Isaksson and Ingegerd Jansson
Bibliographic data for series maintained by Peter Golla.