NoLBERT: A No Lookahead(back) Foundational Language Model

Ali Kakhbod and Peiyao Li

Papers from arXiv.org

Abstract: We present NoLBERT, a lightweight, timestamped foundational language model for empirical research -- particularly for forecasting in economics, finance, and the social sciences. By pretraining exclusively on text from 1976 to 1995, NoLBERT avoids both lookback and lookahead biases (information leakage) that can undermine econometric inference. It exceeds domain-specific baselines on NLP benchmarks while maintaining temporal consistency. Applied to patent texts, NoLBERT enables the construction of firm-level innovation networks and shows that gains in innovation centrality predict higher long-run profit growth.
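
To make the temporal cutoff concrete, the sketch below filters a corpus by document timestamp so that nothing written after the pretraining window can leak in. This is an illustrative Python sketch based on the abstract, not the authors' pipeline; the record layout and field names are assumptions.

    # Hypothetical corpus records: each holds a text and the date it was written.
    from datetime import date

    TRAIN_START = date(1976, 1, 1)   # pretraining window taken from the abstract
    TRAIN_END = date(1995, 12, 31)

    def in_training_window(doc: dict) -> bool:
        """Keep a document only if its timestamp falls inside 1976-1995."""
        return TRAIN_START <= doc["timestamp"] <= TRAIN_END  # "timestamp" is an assumed field

    corpus = [
        {"text": "patent filed 1980", "timestamp": date(1980, 5, 1)},
        {"text": "news from 2001", "timestamp": date(2001, 3, 9)},  # dropped: after cutoff
    ]
    pretraining_corpus = [d for d in corpus if in_training_window(d)]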

Date: 2025-09, Revised 2025-11
New Economics Papers: this item is included in nep-net (Network Economics)

Published in NeurIPS 2025 (GenAI in Finance)

Downloads: http://arxiv.org/pdf/2509.01110 (latest version, PDF)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2509.01110

Handle: RePEc:arx:papers:2509.01110