Machine learning and natural language processing on the patent corpus: Data, tools, and new measures

Benjamin Balsmeier, Mohamad Assaf, Tyler Chesebro, Gabe Fierro, Kevin Johnson, Scott Johnson, Guan‐Cheng Li, Sonja Lück, Doug O'Reagan, Bill Yeh, Guangzheng Zang and Lee Fleming

Journal of Economics & Management Strategy, 2018, vol. 27, issue 3, 535-553

Abstract: Drawing upon recent advances in machine learning and natural language processing, we introduce new tools that automatically ingest, parse, disambiguate, and build an updated database using U.S. patent data. The tools identify unique inventor, assignee, and location entities mentioned on each granted U.S. patent from 1976 to 2016. We describe data flow, algorithms, user interfaces, descriptive statistics, and a novelty measure based on the first appearance of a word in the patent corpus. We illustrate an automated coinventor network mapping tool and visualize trends in patenting over the last 40 years. Data and documentation can be found at https://console.cloud.google.com/launcher/partners/patents-public-data.
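
The novelty measure mentioned in the abstract (first appearance of a word in the patent corpus) can be illustrated with a minimal sketch. The abstract does not specify tokenization, ordering of patents that share a date, or any filtering of terms, so everything below is an assumption: the function name `count_new_words`, the lowercase alphabetic tokenizer, the tie-break on patent identifier, and the toy corpus are all hypothetical and are not taken from the authors' tools.

```python
import re


def count_new_words(patents):
    """Count, for each patent, the words that no earlier patent has used.

    `patents` is an iterable of (patent_id, iso_date, text) tuples.
    Assumptions (not from the paper): words are lowercase runs of letters,
    and patents with the same date are ordered by patent_id.
    """
    seen = set()      # every word observed in earlier patents
    novelty = {}      # patent_id -> number of first-appearance words
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    for patent_id, _date, text in sorted(patents, key=lambda p: (p[1], p[0])):
        words = set(re.findall(r"[a-z]+", text.lower()))
        novelty[patent_id] = len(words - seen)
        seen |= words
    return novelty


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    ("P1", "1976-01-06", "A method for heating water"),
    ("P2", "1980-05-13", "A method for heating water using a laser"),
    ("P3", "1990-09-04", "A laser method"),
]

print(count_new_words(toy_corpus))
```

In this toy corpus the earliest patent introduces all five of its words, the second adds only "using" and "laser", and the third adds none. A real corpus would likely need stop-word handling and stemming, which this sketch leaves out.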

Date: 2018
Citations in EconPapers: 31

Downloads: (external link)
https://doi.org/10.1111/jems.12259

Persistent link: https://EconPapers.repec.org/RePEc:bla:jemstr:v:27:y:2018:i:3:p:535-553

Page updated 2025-03-19
Handle: RePEc:bla:jemstr:v:27:y:2018:i:3:p:535-553