Precision Without Labels: Detecting Cross-Applicants in Mortgage Data Using Unsupervised Learning

Hadi Elzayn, Simon Freyaldenhoven and Minchul Shin

No 25-25, Working Papers from Federal Reserve Bank of Philadelphia

Abstract: We develop a clustering-based algorithm to detect loan applicants who submit multiple applications (“cross-applicants”) in a loan-level dataset without personal identifiers. A key innovation of our approach is a novel evaluation method that does not require labeled training data, allowing us to optimize the tuning parameters of our machine learning algorithm. By applying this methodology to Home Mortgage Disclosure Act (HMDA) data, we create a unique dataset that consolidates mortgage applications to the individual applicant level across the United States. Our preferred specification identifies cross-applicants with 92.3% precision.

Pages: 17
Date: 2025-09-02
New Economics Papers: this item is included in nep-cmp
Downloads: (external link)
https://www.philadelphiafed.org/-/media/FRBP/Asset ... ers/2025/wp25-25.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:fip:fedpwp:101559

DOI: 10.21799/frbp.wp.2025.25

Page updated 2025-09-30
Handle: RePEc:fip:fedpwp:101559