Unsupervised Segmentation of Bibliographic Elements with Latent Permutations
Tomonari Masada: Nagasaki University, Japan
International Journal of Organizational and Collective Intelligence (IJOCI), 2011, vol. 2, issue 2, 49-62
This paper introduces a new approach to large-scale unsupervised segmentation of bibliographic elements. The problem is to segment a citation, given as an untagged sequence of word tokens, into subsequences so that each subsequence corresponds to a different bibliographic element (e.g., authors, paper title, journal name, or publication year). Each bibliographic element should be referred to by contiguous word tokens; this requirement is called the contiguity constraint. The paper meets this constraint by using generalized Mallows models, which Chen, Branavan, Barzilay, and Karger (2009) applied effectively to document structure learning. However, that method works for this problem only after modification, so the paper proposes strategies to make it applicable.
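To make the role of the permutation concrete, here is a minimal Python sketch of the standard inversion-count representation used by generalized Mallows models, followed by a contiguous labeling of tokens. It is an illustration under stated assumptions, not the method of the paper: the modifications the abstract mentions are not reproduced, and the names `sample_inversion_counts`, `permutation_from_inversions`, `label_tokens`, the dispersion values `rho`, and the per-element token counts are all hypothetical.

```python
import math
import random

# Hypothetical element inventory, taken from the examples in the abstract.
ELEMENTS = ["authors", "title", "journal", "year"]


def sample_inversion_counts(n_elements, rho, rng):
    """Draw inversion counts v_0..v_{n-2} of a generalized Mallows model.

    v_j ranges over 0..(n-1-j) with P(v_j = r) proportional to
    exp(-rho[j] * r). A larger rho[j] concentrates mass on v_j = 0,
    which corresponds to the canonical (identity) order.
    """
    counts = []
    for j in range(n_elements - 1):
        max_v = n_elements - 1 - j
        weights = [math.exp(-rho[j] * r) for r in range(max_v + 1)]
        counts.append(rng.choices(range(max_v + 1), weights=weights)[0])
    return counts


def permutation_from_inversions(counts):
    """Rebuild the ordering in which item j has counts[j] larger items before it."""
    n = len(counts) + 1
    order = []
    # Insert items from largest to smallest; every item already placed is
    # larger, so inserting at index v puts exactly v larger items in front.
    for item in range(n - 1, -1, -1):
        v = counts[item] if item < n - 1 else 0
        order.insert(v, item)
    return order


def label_tokens(order, lengths):
    """Assign one element label per token so that each element is contiguous."""
    labels = []
    for element in order:
        labels.extend([element] * lengths[element])
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    rho = [1.0, 1.0, 1.0]  # assumed dispersion parameters
    counts = sample_inversion_counts(len(ELEMENTS), rho, rng)
    order = permutation_from_inversions(counts)
    lengths = [2, 5, 3, 1]  # assumed token count per element
    print([ELEMENTS[i] for i in label_tokens(order, lengths)])
```

Because the labels are generated by walking the sampled ordering once, every element occupies a single contiguous run of tokens, which is the contiguity constraint described above. An all-zero inversion vector yields the canonical order of `ELEMENTS`.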
Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 4018/joci.2011040104 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:igg:joci00:v:2:y:2011:i:2:p:49-62