
An NMF-framework for Unifying Posterior Probabilistic Clustering and Probabilistic Latent Semantic Indexing

Zhong-Yuan Zhang, Tao Li, Chris Ding and Jie Tang

Communications in Statistics - Theory and Methods, 2014, vol. 43, issue 19, 4011-4024

Abstract: In document clustering, a document may be assigned to multiple clusters, and the probabilities of a document belonging to different clusters are directly normalized. We propose a new Posterior Probabilistic Clustering (PPC) model that has this normalization property. The clustering model is based on Nonnegative Matrix Factorization (NMF) and is flexible: if we instead use class conditional probability normalization, the model reduces to Probabilistic Latent Semantic Indexing (PLSI). Systematic comparison and evaluation indicate that PPC is competitive with other state-of-the-art clustering methods. Furthermore, the results of PPC are sparser and more orthogonal, both of which are highly desirable.
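
The two normalizations contrasted in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical nonnegative document-by-cluster factor `G` and shows only the constraints themselves: rows summing to one (posterior style, a probability of each cluster given the document) versus columns summing to one (class conditional style, as in PLSI). The matrix values and function names are illustrative assumptions, and the sketch does not reproduce the paper's PPC update rules, which this record does not give.

```python
def normalize_rows(matrix):
    """Scale each row to sum to 1: posterior style, P(cluster | document)."""
    result = []
    for row in matrix:
        total = sum(row)
        # Leave an all-zero row unchanged to avoid dividing by zero.
        result.append([v / total for v in row] if total > 0 else list(row))
    return result


def normalize_columns(matrix):
    """Scale each column to sum to 1: class conditional style, P(document | cluster)."""
    col_totals = [sum(col) for col in zip(*matrix)]
    return [
        [v / t if t > 0 else v for v, t in zip(row, col_totals)]
        for row in matrix
    ]


# Hypothetical nonnegative factor: 3 documents (rows) by 2 clusters (columns).
G = [
    [2.0, 2.0],
    [1.0, 3.0],
    [1.0, 0.0],
]

posterior = normalize_rows(G)             # each document's row sums to 1
class_conditional = normalize_columns(G)  # each cluster's column sums to 1

for name, m in (("posterior", posterior), ("class conditional", class_conditional)):
    print(name, m)
```

Under the row normalization each document carries a full probability distribution over clusters, which is the property the abstract attributes to PPC; under the column normalization each cluster carries a distribution over documents instead.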

Date: 2014

Downloads: (external link)
http://hdl.handle.net/10.1080/03610926.2012.714034 (text/html)
Access to full text is restricted to subscribers.


Persistent link: https://EconPapers.repec.org/RePEc:taf:lstaxx:v:43:y:2014:i:19:p:4011-4024

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/lsta20

DOI: 10.1080/03610926.2012.714034


Communications in Statistics - Theory and Methods is currently edited by Debbie Iscoe

Bibliographic data for series maintained by Chris Longhurst.

 
Page updated 2025-03-20
Handle: RePEc:taf:lstaxx:v:43:y:2014:i:19:p:4011-4024