WEB SCRAPING AND NLP FOR PRICE PREDICTION: THE HIDDEN SIGNALS IN HR DISCOURSE
Iancu Cristina,
Ciuverca Alexandra-Cristina-Daniela and
Oprea Simona-Vasilica
Additional contact information
Iancu Cristina: DEPARTMENT OF ECONOMIC INFORMATICS AND CYBERNETICS, BUCHAREST UNIVERSITY OF ECONOMIC STUDIES, ROMANIA; DOCTORAL SCHOOL OF ECONOMIC INFORMATICS, BUCHAREST UNIVERSITY OF ECONOMIC STUDIES, ROMANIA
Ciuverca Alexandra-Cristina-Daniela: DEPARTMENT OF ECONOMIC INFORMATICS AND CYBERNETICS, BUCHAREST UNIVERSITY OF ECONOMIC STUDIES, ROMANIA; DOCTORAL SCHOOL OF ECONOMIC INFORMATICS, BUCHAREST UNIVERSITY OF ECONOMIC STUDIES, ROMANIA
Oprea Simona-Vasilica: DEPARTMENT OF ECONOMIC INFORMATICS AND CYBERNETICS, BUCHAREST UNIVERSITY OF ECONOMIC STUDIES, ROMANIA; DOCTORAL SCHOOL OF ECONOMIC INFORMATICS, BUCHAREST UNIVERSITY OF ECONOMIC STUDIES, ROMANIA
Annals - Economy Series, 2025, vol. 4, 51-74
Abstract:
This study explores the intersection between human resources (HR) discourse and macroeconomic indicators using web scraping, natural language processing (NLP), and machine learning techniques. By extracting job postings from Romanian platforms and correlating them with national economic indicators, we identify hidden signals embedded in hiring trends. We aim to demonstrate how HR data can anticipate shifts in prices, salaries, and economic performance in the age of AI by uncovering predictive signals in this discourse. Using large-scale web scraping from major Romanian job platforms (eJobs and BestJobs), we compiled a dataset of IT and non-IT job advertisements across multiple industries. Textual and structural features of these listings were analyzed using NLP methods including Named Entity Recognition (NER), Latent Dirichlet Allocation (LDA) and BERTopic for topic modeling, and BERT-based embeddings for semantic classification. To contextualize these insights within broader economic trends, macroeconomic indicators such as the Harmonised Index of Consumer Prices (HICP) and industry-specific productivity data were integrated. Visualization techniques such as t-SNE, PCA, and heatmaps were applied to capture correlations and trends across industries and timeframes. The results indicate strong associations between the linguistic features of job postings and key economic indicators. Notably, industries with higher semantic complexity in job descriptions tended to exhibit elevated labor costs and stronger consumer price growth. These findings suggest that HR discourse can serve as a reliable early indicator of economic developments, offering valuable insights for both workforce planning and price prediction when combined with relevant macroeconomic data.
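As a rough illustration of the kind of association the abstract describes, the sketch below correlates a text-derived feature of job postings with a price indicator per industry. It is not the authors' pipeline: the postings and price-growth figures are invented toy values, the industry labels are hypothetical, and the type-token ratio is only an assumed stand-in for the "semantic complexity" measure, which the abstract does not define.

```python
from math import sqrt

# Hypothetical toy data, NOT taken from the study.
postings = {
    "it": ["senior python developer building machine learning pipelines",
           "cloud engineer kubernetes terraform monitoring automation"],
    "retail": ["store assistant store assistant needed for store",
               "cashier needed cashier for store"],
    "logistics": ["warehouse operator forklift licence required",
                  "driver needed for delivery routes driver"],
}
# Made-up annual price growth (percent) per industry, for illustration only.
price_growth = {"it": 6.1, "retail": 3.2, "logistics": 4.4}


def type_token_ratio(texts):
    """Share of distinct tokens among all tokens (assumed complexity proxy)."""
    tokens = " ".join(texts).lower().split()
    return len(set(tokens)) / len(tokens)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


industries = sorted(postings)
complexity = [type_token_ratio(postings[name]) for name in industries]
growth = [price_growth[name] for name in industries]
r = pearson(complexity, growth)
print(f"Pearson r between complexity proxy and price growth: {r:.3f}")
```

With real data, the toy dictionaries would be replaced by scraped listings and official HICP series, and a richer feature (for example, embedding-based) would likely replace the type-token ratio.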
Keywords: Natural Language Processing; Web Scraping; Labor Market Analysis; Job Market Signals
Date: 2025
Downloads:
https://www.utgjiu.ro/revista/ec/pdf/2025-04/06_Iancu.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:cbu:jrnlec:y:2025:v:4:p:51-74
Series: Annals - Economy Series, Constantin Brancusi University, Faculty of Economics