EconPapers    

Scaling Point-in-Time Language Models

Bryan T. Kelly, Semyon Malamud, Johannes Schwab and Teng Andrea Xu
Additional contact information
Bryan T. Kelly: Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)
Semyon Malamud: Ecole Polytechnique Federale de Lausanne; Centre for Economic Policy Research (CEPR); Swiss Finance Institute
Johannes Schwab: École Polytechnique Fédérale de Lausanne (EPFL)
Teng Andrea Xu: AQR Capital Management, LLC; École Polytechnique Fédérale de Lausanne (EPFL)

No 26-37, Swiss Finance Institute Research Paper Series from Swiss Finance Institute

Abstract: Large language models trained on unrestricted internet corpora inevitably embed information from the future, introducing lookahead bias that compromises the validity of backtests and causal inference in finance and the social sciences. Point-in-time language models, trained exclusively on text available up to each calendar date, eliminate this leakage by construction, but existing efforts typically produce models that lag substantially behind their unconstrained counterparts. We show that this performance gap can be substantially narrowed through scale. Training decoder-only transformers with up to 4 billion parameters on 1 trillion chronologically filtered tokens from FineWeb, we construct a sequence of monthly model checkpoints spanning 2013-2024. Across a range of common-sense reasoning and language understanding benchmarks, our models approach the performance of leading open-weight models of comparable size (e.g., Gemma-3-4B and LLaMA-7B) trained on temporally unrestricted data, although a performance gap remains on several tasks. Instruction fine-tuning via LoRA further improves downstream usability. We release the complete pipeline, including dataset construction, training infrastructure, and evaluation code, to enable reproducible point-in-time language modeling and to support research applications that require strict temporal validity. Models are available on Hugging Face and code is available on GitHub.
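The point-in-time construction the abstract describes can be illustrated in a few lines: each monthly checkpoint may only be trained on documents timestamped strictly before its cutoff date, so later information cannot leak into a backtest by construction. The sketch below is an assumption-laden illustration of that principle, not the paper's actual pipeline; the document schema, field names, and example dates are invented.

```python
from datetime import date

def point_in_time_corpus(documents, cutoff):
    """Keep only documents available strictly before the cutoff date.

    Illustrative only: assumes each document is a dict with a `date` field;
    the paper's FineWeb filtering pipeline is more involved.
    """
    return [doc for doc in documents if doc["date"] < cutoff]

# Hypothetical corpus with publication timestamps.
corpus = [
    {"text": "pre-crisis commentary", "date": date(2013, 5, 1)},
    {"text": "election coverage", "date": date(2016, 11, 9)},
    {"text": "pandemic report", "date": date(2020, 3, 15)},
]

# Training set for a checkpoint "as of" January 2016: the two later
# documents are excluded, eliminating lookahead bias for that date.
train_2016 = point_in_time_corpus(corpus, date(2016, 1, 1))
```

Repeating this filter at each month-end yields the kind of chronological checkpoint sequence (2013-2024) the paper constructs.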

Pages: 24 pages
Date: 2026-04

Downloads: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6681860 (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:chf:rpseri:rp2637

Handle: RePEc:chf:rpseri:rp2637