Matching of PATSTAT applications to AIDA firms: discussion of the methodology and results
Francesca Lotti and
No 166, Questioni di Economia e Finanza (Occasional Papers) from Bank of Italy, Economic Research and International Relations Area
This paper is a brief methodological note on the matching of Italian firms in the AIDA database with applicants at the European Patent Office from the PATSTAT database. The need to match data on patent applications with balance-sheet information stems from the importance of patent statistics as a source of information on the innovative performance of firms. Starting from recent efforts to match applicants in PATSTAT with firms in the Bureau van Dijk databases (ORBIS, AMADEUS, FAME), we added an improved cleaning routine to maximize exact matches, followed by approximate matching based on multiple combinations of similarity scores. Starting with 272,475 firms, we matched 49,369 EPO applications in the period 1977-2009. The matching covers 68 percent of EPO applications by Italian firms for the entire period and 89 percent for 2000-2009. Finally, we describe the time, sector, size, geographical location and technology distribution of the matched applications.
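The two-stage procedure summarized in the abstract (name cleaning to maximize exact matches, then approximate matching on a combination of similarity scores) can be illustrated with a minimal Python sketch. It is not the authors' routine: the legal-form list, the two similarity measures (a character-level ratio from `difflib` and a token-level Jaccard index), the thresholds, and the firm names and identifiers are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumption: a small, illustrative set of Italian legal-form tokens to drop.
LEGAL_FORMS = {"SPA", "SRL", "SNC", "SAS"}


def clean_name(name):
    """Harmonize a name: uppercase, strip punctuation, drop legal forms."""
    name = name.upper().replace(".", "")      # "S.p.A." becomes "SPA"
    name = re.sub(r"[^A-Z0-9 ]", " ", name)   # other punctuation becomes a space
    tokens = [t for t in name.split() if t not in LEGAL_FORMS]
    return " ".join(tokens)


def char_similarity(a, b):
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def token_jaccard(a, b):
    """Share of distinct words the two names have in common."""
    ta, tb = set(a.split()), set(b.split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match_applicants(applicants, firms, char_min=0.85, token_min=0.5):
    """Map each applicant name to (firm_id, match_type) where a match exists.

    Stage 1: exact match on cleaned names.
    Stage 2: approximate match, accepted only if BOTH scores pass their
    (hypothetical) thresholds; the best character score wins.
    """
    firm_index = {}
    for firm_id, firm_name in firms.items():
        firm_index.setdefault(clean_name(firm_name), firm_id)

    matches = {}
    for applicant in applicants:
        cleaned = clean_name(applicant)
        if cleaned in firm_index:
            matches[applicant] = (firm_index[cleaned], "exact")
            continue
        best_id, best_score = None, 0.0
        for firm_clean, firm_id in firm_index.items():
            c = char_similarity(cleaned, firm_clean)
            t = token_jaccard(cleaned, firm_clean)
            if c >= char_min and t >= token_min and c > best_score:
                best_id, best_score = firm_id, c
        if best_id is not None:
            matches[applicant] = (best_id, "approximate")
    return matches


# Hypothetical firm register and applicant names, for illustration only.
firms = {"F1": "Rossi Meccanica S.p.A.", "F2": "Bianchi Motor Holding SRL"}
applicants = ["ROSSI MECCANICA SPA", "Bianchi Motor Holdng S.r.l.", "Verdi Chimica"]
print(match_applicants(applicants, firms))
```

Requiring both scores to clear a threshold is one simple way to combine them; it guards against pairs that share many characters but few whole words. At the scale reported in the abstract, comparing every applicant with every firm would be costly, so a practical implementation would likely restrict candidate pairs first, for example by location or first token.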
Keywords: names harmonization; patents; approximate matching; PATSTAT; AIDA
JEL-codes: C81 O31 O34
New Economics Papers: this item is included in nep-ino, nep-ipr and nep-pr~
Persistent link: https://EconPapers.repec.org/RePEc:bdi:opques:qef_166_13