The Impact of Alternative Data on Default Probability: Analyzing the Italian E-commerce Sector with NLP and Network Structures

Brian Daniel Bernhardt, Chiara Marciano and Mario Rosario Guarracino
Additional contact information
Brian Daniel Bernhardt: University of Cassino and Southern Lazio
Chiara Marciano: University of Cassino and Southern Lazio
Mario Rosario Guarracino: University of Cassino and Southern Lazio

SN Operations Research Forum, 2025, vol. 6, issue 2, 1-30

Abstract: E-commerce is a key sector in the Italian economy, with online companies becoming some of the largest and most profitable businesses. However, this growth comes with increased risk exposure. This study aims to investigate the relationship between alternative data (contextual factors, Text-Driven Data Enrichment) and the probability of default for Italian e-commerce companies. To date, no studies have examined how these alternative data affect the default probability within this sector. To address this gap, the ongoing research analyzes a dataset of Italian companies, focusing on sector-specific indicators. In the Italian e-commerce market, companies are identified by a unique code that indicates the sales channel but not the types of products marketed. To overcome this limitation, a natural language processing (NLP) model is applied to the es of these companies, allowing us to identify the types of goods or services sold by each company. These enriched data provide a more comprehensive understanding and serve as the foundation for a classification model that uses standard interpretable algorithms for default prediction. Additionally, network structures identified by companies’ similarities are used to extract new insights, which can support decision-making processes for stakeholders at different levels of the supply chain. The model’s performance is evaluated in various scenarios, comparing results with and without the inclusion of alternative data. Key performance metrics are analyzed to demonstrate how integrating alternative data enhances default prediction models.
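
The idea of deriving a network from companies' similarities can be illustrated with a minimal sketch. It is not the authors' implementation: the toy embedding vectors, the company names, the cosine-similarity measure, and the 0.8 threshold are all assumptions chosen for illustration, since the abstract does not specify them. The sketch links two companies when their text embeddings are sufficiently similar and then counts each company's links, the kind of network-derived feature that could be passed to a classifier alongside conventional indicators.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_network(embeddings, threshold=0.8):
    """Return undirected edges (a, b), a < b, for pairs whose similarity meets the threshold."""
    names = sorted(embeddings)
    edges = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                edges.add((a, b))
    return edges


def degree(edges):
    """Count how many edges touch each node; isolated nodes are absent from the result."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg


# Hypothetical embeddings: in practice these would come from an NLP model
# applied to each company's text, not from hand-written numbers.
toy_embeddings = {
    "shop_a": [0.9, 0.1, 0.0],
    "shop_b": [0.8, 0.2, 0.1],
    "shop_c": [0.0, 0.1, 0.9],
}

edges = similarity_network(toy_embeddings, threshold=0.8)
deg = degree(edges)
```

Here `shop_a` and `shop_b` end up connected while `shop_c` stays isolated. The threshold controls how dense the resulting network is, so in any real analysis it would have to be tuned rather than fixed in advance.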

Keywords: Text embedding; Supervised learning; Probability of default; E-commerce sector; Alternative data
Date: 2025
Downloads: (external link)
http://link.springer.com/10.1007/s43069-025-00442-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:snopef:v:6:y:2025:i:2:d:10.1007_s43069-025-00442-z

Ordering information: This journal article can be ordered from
https://www.springer.com/journal/43069

DOI: 10.1007/s43069-025-00442-z

SN Operations Research Forum is currently edited by Marco Lübbecke

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-04-09
Handle: RePEc:spr:snopef:v:6:y:2025:i:2:d:10.1007_s43069-025-00442-z