Patent Text and Long-Run Innovation Dynamics: The Critical Role of Model Selection

Ina Ganguli, Jeffrey Lin, Vitaly Meursault and Nicholas Reynolds

No 32934, NBER Working Papers from National Bureau of Economic Research, Inc

Abstract: As distorted maps may mislead, Natural Language Processing (NLP) models may misrepresent. How do we know which NLP model to trust? We provide comprehensive guidance for selecting and applying NLP representations of patent text. We develop novel validation tasks to evaluate several leading NLP models. These tasks assess how well candidate models align with both expert and non-expert judgments of patent similarity. State-of-the-art language models significantly outperform traditional approaches such as TF-IDF. Using our validated representations, we measure a secular decline in contemporaneous patent similarity: inventors are “spreading out” over an expanding knowledge frontier. This finding is corroborated by declining rates of multiple invention from newly-digitized historical patent interference records. In contrast, selecting another single representation without validating alternatives yields an ambiguous or even opposing trend. Thus, our framework addresses a fundamental challenge of selecting among different black-box NLP models that produce varying economic measurements. To facilitate future research, we plan to provide our validation task data and embeddings for all US patents from 1836–2023.

JEL-codes: C81 L19 O31
Date: 2024-09
New Economics Papers: this item is included in nep-big, nep-gro, nep-ino, nep-ipr and nep-tid
Note: DAE PR

Downloads: (external link)
http://www.nber.org/papers/w32934.pdf (application/pdf)
Access to the full text is generally limited to series subscribers; however, free access is provided if the top-level domain of the client browser is in a developing country or transition economy. More information about subscriptions and free access is available at http://www.nber.org/wwphelp.html. Free access is also available for older working papers.

Persistent link: https://EconPapers.repec.org/RePEc:nbr:nberwo:32934

Ordering information: This working paper can be ordered from
http://www.nber.org/papers/w32934
A paper copy is available by mail.

More papers in NBER Working Papers from National Bureau of Economic Research, Inc, 1050 Massachusetts Avenue, Cambridge, MA 02138, U.S.A.

 
Page updated 2025-03-19
Handle: RePEc:nbr:nberwo:32934