How To Kill Inventors: Testing The Massacrator© Algorithm For Inventor Disambiguation
Francesco Lissoni and Gianluca Tarasconi (KiTES, Università Bocconi)
Cahiers du GREThA from Groupe de Recherche en Economie Théorique et Appliquée (GREThA)
Inventor disambiguation is an increasingly important issue for users of patent data. We propose and test a number of refinements to the Massacrator© algorithm, originally proposed by Lissoni et al. (2006) and now applied to APE-INV, a free-access database funded by the European Science Foundation. Following Raffo and Lhuillery (2009), we describe disambiguation as a three-step process: cleaning and parsing, matching, and filtering. By means of sensitivity analysis based on Monte Carlo simulations, we show how various filtering criteria can be manipulated to obtain optimal combinations of precision and recall (type I and type II errors). We also show how these different combinations generate different results for applications to studies on inventors' productivity, mobility, and networking. The filtering criteria based upon information on inventors' addresses are sensitive to data quality, while those based upon information on co-inventorship networks are always effective. Details on data access and data quality improvement via feedback collection are also discussed.
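The three-step process and the precision/recall trade-off mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the Massacrator© implementation: the toy records, the function names (`clean`, `match`, `filter_pairs`, `precision_recall`), the two filtering criteria (same city, shared co-inventor), and the `min_score` threshold are all assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy records; record 4 is a homonym of the inventor in records 1-3.
records = {
    1: {"name": "Rossi, Mario", "city": "Milan", "coinventors": {"Bianchi"}},
    2: {"name": "Mario Rossi", "city": "Milan", "coinventors": {"Verdi"}},
    3: {"name": "ROSSI MARIO", "city": "Turin", "coinventors": {"Bianchi"}},
    4: {"name": "Mario Rossi", "city": "Rome", "coinventors": {"Neri"}},
}
# Assumed ground truth: the record pairs that really belong to the same person.
truth = {(1, 2), (1, 3), (2, 3)}

def clean(name):
    """Cleaning and parsing: lowercase, drop punctuation, sort name tokens."""
    chars = (c if c.isalnum() or c.isspace() else " " for c in name.lower())
    return " ".join(sorted("".join(chars).split()))

def match(recs):
    """Matching: pair every two records whose cleaned names coincide."""
    return {
        (a, b)
        for a, b in combinations(sorted(recs), 2)
        if clean(recs[a]["name"]) == clean(recs[b]["name"])
    }

def filter_pairs(pairs, recs, min_score=1):
    """Filtering: keep a matched pair only if it meets enough criteria."""
    kept = set()
    for a, b in pairs:
        same_city = recs[a]["city"] == recs[b]["city"]
        shared_coinventor = bool(recs[a]["coinventors"] & recs[b]["coinventors"])
        if int(same_city) + int(shared_coinventor) >= min_score:
            kept.add((a, b))
    return kept

def precision_recall(predicted, actual):
    """Precision and recall over record pairs (1.0 by convention if empty)."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall

if __name__ == "__main__":
    candidates = match(records)
    for threshold in (0, 1, 2):
        kept = filter_pairs(candidates, records, min_score=threshold)
        print(threshold, precision_recall(kept, truth))
```

In this toy setting, raising `min_score` makes the filter stricter, which tends to raise precision at the cost of recall. The paper's sensitivity analysis explores that kind of trade-off across its actual filtering criteria.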
Keywords: patent data; inventors; name disambiguation
JEL-codes: C15 C81 O34
New Economics Papers: this item is included in nep-cmp, nep-ino, nep-ipr and nep-pr~
Journal Article: How to kill inventors: testing the Massacrator© algorithm for inventor disambiguation (2014)
Working Paper: How to kill inventors: testing the Massacrator© algorithm for inventor disambiguation (2014)
Persistent link: https://EconPapers.repec.org/RePEc:grt:wpegrt:2012-29