The Missing 15 Percent of Patent Citations
Cyril Verluise,
Gabriele Cristelli,
Kyle Higham and
Gaétan de Rassenfosse
Additional contact information
Cyril Verluise: Collège de France
Working Papers from Chair of Science, Technology, and Innovation Policy
Abstract:
Patent citations are one of the most commonly used metrics in the innovation literature. Leading uses of patent-to-patent citations are associated with the quantification of inventions’ quality and the measurement of knowledge flows. Due to their widespread availability, scholars have exploited citations listed on the front page of patent documents. Citations appearing in the full text of patent documents have been neglected. We apply modern machine learning methods to extract these citations from the text of USPTO patent documents. Overall, we are able to recover an additional 15 percent of patent citations that could not be found using only front-page data. We show that in-text citations bring a different type of information compared to front-page citations. They exhibit higher text similarity to the citing patents and alter the ranking of patent importance. The dataset is available at patcit.io (CC-BY-4).
Keywords: Citation; Patent; Open data
JEL-codes: C81 O30
Pages: 62
Date: 2020-12
New Economics Papers: this item is included in nep-big, nep-cmp, nep-ino and nep-ipr
Downloads:
https://cdm-repec.epfl.ch/iip-wpaper/WP13.pdf
Related works:
Working Paper: The Missing 15 Percent of Patent Citations (2020) 
Persistent link: https://EconPapers.repec.org/RePEc:iip:wpaper:13