Constructing Applicants from Loan-Level Data: A Case Study of Mortgage Applications
Hadi Elzayn,
Simon Freyaldenhoven and
Minchul Shin
No 25-05, Working Papers from Federal Reserve Bank of Philadelphia
Abstract:
We develop a clustering-based algorithm to detect loan applicants who submit multiple applications (“cross-applicants”) in a loan-level dataset without personal identifiers. A key innovation of our approach is a novel evaluation method that does not require labeled training data, allowing us to optimize the tuning parameters of our machine learning algorithm. By applying this methodology to Home Mortgage Disclosure Act (HMDA) data, we create a unique dataset that consolidates mortgage applications to the individual applicant level across the United States. Our preferred specification identifies cross-applicants with 93 percent precision.
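The idea of consolidating applications to the applicant level, and of scoring the result by precision, can be illustrated with a minimal sketch. This is not the authors' algorithm: the record fields, the `amount_tolerance` parameter, the blocking rule (same census tract and income), and the function names are all hypothetical choices made for illustration. The sketch also scores precision against a small hand-labeled set of true pairs, whereas the paper's evaluation method is described as not requiring labeled data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records (not HMDA data):
# (application_id, census_tract, income_thousands, loan_amount_thousands)
applications = [
    ("a1", "42101000100", 85, 300),
    ("a2", "42101000100", 85, 305),
    ("a3", "42101000100", 85, 500),
    ("a4", "34007600100", 120, 410),
    ("a5", "34007600100", 120, 410),
]


def cluster_applications(records, amount_tolerance=10):
    """Link applications that share tract and income and whose loan
    amounts differ by at most amount_tolerance (single linkage)."""
    parent = {rec[0]: rec[0] for rec in records}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Blocking: only compare applications with identical tract and income.
    blocks = defaultdict(list)
    for rec in records:
        blocks[(rec[1], rec[2])].append(rec)

    for block in blocks.values():
        for r1, r2 in combinations(block, 2):
            if abs(r1[3] - r2[3]) <= amount_tolerance:
                parent[find(r1[0])] = find(r2[0])

    clusters = defaultdict(set)
    for rec in records:
        clusters[find(rec[0])].add(rec[0])
    return sorted(sorted(members) for members in clusters.values())


def pairwise_precision(clusters, true_pairs):
    """Share of predicted same-applicant pairs that are in true_pairs."""
    predicted = {frozenset(p) for c in clusters for p in combinations(c, 2)}
    if not predicted:
        return 0.0
    return len(predicted & true_pairs) / len(predicted)


clusters = cluster_applications(applications)
# Assumed ground truth for the toy data: only a1 and a2 are the same applicant.
precision = pairwise_precision(clusters, {frozenset({"a1", "a2"})})
print(clusters, precision)
```

In this toy setup, `amount_tolerance` plays the role of a tuning parameter: loosening it links more applications and typically lowers precision, while tightening it does the opposite.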
Keywords: clustering; mortgage applications; HMDA
JEL-codes: C38 C63 C81 G21 R21
Pages: 29
Date: 2025-02-04
New Economics Papers: this item is included in nep-big and nep-cmp
Downloads: (external link)
https://www.philadelphiafed.org/-/media/FRBP/Asset ... ers/2025/wp25-05.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:fip:fedpwp:99499
DOI: 10.21799/frbp.wp.2025.05
Bibliographic data for series maintained by Beth Paul.