Automatic tagging with existing and novel tags

Junhui Wang, Xiaotong Shen, Yiwen Sun and Annie Qu

Biometrika, 2017, vol. 104, issue 2, 273-290

Abstract: Automatic tagging by key words and phrases is important in multi-label classification of a document. In this paper, we first introduce a tagging loss to measure the discrepancy between predicted and actual tag sets, which is expressed in terms of a sum of weighted pairwise margins between two tags by their degree of similarity. We then construct a regularized empirical loss to incorporate linguistic knowledge, and identify a tagger maximizing the separations between the pairwise margins. One salient feature of the proposed method is its capability to identify novel tags absent from a training sample by using their similarity to existing tags. Computationally, the proposed method is implemented by an alternating direction method of multipliers, integrated with a difference convex algorithm. This permits scalable computation. We show that the method achieves accurate tagging, and that it compares favourably with existing methods. Finally, we apply the proposed method to tagging a Reuters news dataset.

Keywords: Alternating direction method of multipliers; Large margin; Multi-label classification; Scalability; Social bookmarking system; Text mining
Date: 2017

Downloads: (external link)
http://hdl.handle.net/10.1093/biomet/asx016 (application/pdf)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:oup:biomet:v:104:y:2017:i:2:p:273-290.

Ordering information: This journal article can be ordered from
https://academic.oup.com/journals

Biometrika is currently edited by Paul Fearnhead

More articles in Biometrika from Biometrika Trust Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK.
Bibliographic data for series maintained by Oxford University Press.

 
Page updated 2025-03-19
Handle: RePEc:oup:biomet:v:104:y:2017:i:2:p:273-290.