DeepPatent: patent classification with convolutional neural networks and word embedding

Shaobo Li, Jie Hu, Yuxin Cui and Jianjun Hu
Additional contact information
Shaobo Li: Guizhou University
Jie Hu: Guizhou University
Yuxin Cui: University of South Carolina
Jianjun Hu: Guizhou University

Scientometrics, 2018, vol. 117, issue 2, 721-744

Abstract: Patent classification is an essential task in patent information management and patent knowledge mining. However, this task is still largely done manually due to the unsatisfactory performance of current algorithms. Recently, deep learning methods such as convolutional neural networks (CNN) have led to great progress in image processing, voice recognition, and speech recognition, but they have yet to be applied to patent classification. We proposed DeepPatent, a deep learning algorithm for patent classification based on CNN and word vector embedding. We evaluated the algorithm on the standard patent classification benchmark dataset CLEF-IP and compared it with other algorithms in the CLEF-IP competition. Experiments showed that DeepPatent, with automatic feature extraction, achieved a classification precision of 83.98%, outperforming all existing algorithms that used the same information for training. It also outperformed the state-of-the-art patent classifier, which reaches a precision of 83.50% but relies on 4000 characters from the description section and extensive feature engineering, whereas DeepPatent used only the title and abstract. DeepPatent was further tested on USPTO-2M, a patent classification benchmark data set that we contributed, containing 2,000,147 records in 637 categories at the subclass level, obtained by cleaning 2,679,443 raw USA utility patent documents. On this data set, our algorithm achieved a precision of 73.88%.
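
The abstract describes a classifier that feeds word embeddings of a patent's title and abstract into a CNN. The sketch below illustrates that general pipeline only (embedding lookup, 1D convolution over token windows, ReLU, max-over-time pooling, softmax); it is not the authors' architecture. The class name `TinyTextCNN`, the filter widths, the dimensions, the toy vocabulary, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights (untrained, illustrative)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyTextCNN:
    """Hypothetical text CNN: embed -> 1D conv -> ReLU -> max-over-time -> softmax."""

    def __init__(self, vocab, embed_dim=8, widths=(2, 3),
                 filters_per_width=4, num_classes=3, seed=0):
        rng = random.Random(seed)
        self.vocab = {tok: i for i, tok in enumerate(vocab)}
        self.embed_dim = embed_dim
        self.widths = widths
        # The extra last row is a shared vector for out-of-vocabulary tokens.
        self.embeddings = make_matrix(len(vocab) + 1, embed_dim, rng)
        # One filter bank per window width; each filter spans width * embed_dim inputs.
        self.filters = {w: make_matrix(filters_per_width, w * embed_dim, rng)
                        for w in widths}
        self.out_weights = make_matrix(num_classes,
                                       filters_per_width * len(widths), rng)

    def embed(self, tokens):
        unk = len(self.vocab)
        return [self.embeddings[self.vocab.get(t, unk)] for t in tokens]

    def features(self, tokens):
        vectors = self.embed(tokens)
        # Pad short inputs with zero vectors so every width has at least one window.
        while len(vectors) < max(self.widths):
            vectors.append([0.0] * self.embed_dim)
        pooled = []
        for w in self.widths:
            # Each window is the concatenation of w consecutive token vectors.
            windows = [sum(vectors[i:i + w], [])
                       for i in range(len(vectors) - w + 1)]
            for filt in self.filters[w]:
                activations = [max(0.0, sum(a * b for a, b in zip(filt, win)))
                               for win in windows]
                pooled.append(max(activations))  # max-over-time pooling
        return pooled

    def predict_proba(self, tokens):
        feats = self.features(tokens)
        logits = [sum(w * f for w, f in zip(row, feats))
                  for row in self.out_weights]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Toy usage: three made-up classes and a five-word vocabulary.
model = TinyTextCNN(vocab=["method", "for", "classifying", "patent", "documents"])
probs = model.predict_proba("a method for classifying patent documents".split())
```

With untrained weights the probabilities are close to uniform. A real system would learn the embeddings, filters, and output layer from labeled patents, and would use one output per classification category rather than three.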

Keywords: Patent classification; Text classification; Convolutional neural network; Machine learning; Word embedding; 94-02
JEL-codes: Y
Date: 2018

Scientometrics is currently edited by Wolfgang Glänzel

More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla.

Page updated 2019-04-09
Handle: RePEc:spr:scient:v:117:y:2018:i:2:d:10.1007_s11192-018-2905-5