A dataset of scientific citations in U.S. patent Office Actions

Kyle Higham, Hannah Kotula, Emma Scharfmann, Steve Gong and Gaétan de Rassenfosse
Additional contact information
Kyle Higham: Motu Economic and Public Policy Research
Hannah Kotula: Motu Economic and Public Policy Research
Emma Scharfmann: University of California, Berkeley
Steve Gong: Google
Gaétan de Rassenfosse: Ecole polytechnique fédérale de Lausanne

Working Papers from Chair of Science, Technology, and Innovation Policy

Abstract: We present a curated dataset of about 850,000 citations extracted from Office Actions issued by examiners at the United States Patent and Trademark Office. These references, historically underused due to accessibility challenges, provide a granular view into the patent examination process and complement traditional front-page citation data. We classify each citation into one of 14 categories and focus on the 265,000 references to scientific literature, which we parse, clean, and disambiguate using machine learning and external bibliographic services. To enhance reusability, disambiguated records are linked to OpenAlex, a comprehensive research metadata platform. The dataset enables new research on examiner behavior, science–technology linkages, and the construction of citation-based metrics. All data and code are openly available to facilitate reuse across disciplines.

Keywords: citation; patent; office actions; open data; non-patent literature; NPL
JEL-codes: C81 D83 K29 O34
Pages: 12
Date: 2026-02

Downloads: (external link)
https://cdm-repec.epfl.ch/iip-wpaper/WP31.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:iip:wpaper:31

Page updated 2026-02-23
Handle: RePEc:iip:wpaper:31