Iterative Denoising for Cross-Corpus Discovery

Carey E. Priebe, David J. Marchette, Youngser Park, Edward J. Wegman, Jeffrey L. Solka, Diego A. Socolinsky, Damianos Karakos, Ken W. Church, Roland Guglielmi, Ronald R. Coifman, Dekang Lin, Dennis M. Healy, Marc Q. Jacobs and Anna Tsao
Additional contact information
Carey E. Priebe: AlgoTek, Inc.
Youngser Park: Johns Hopkins U.
Edward J. Wegman: AlgoTek, Inc.
Diego A. Socolinsky: AlgoTek, Inc.
Damianos Karakos: Johns Hopkins U.
Ken W. Church: AlgoTek, Inc.
Roland Guglielmi: AlgoTek, Inc.
Ronald R. Coifman: AlgoTek, Inc.
Dekang Lin: AlgoTek, Inc.
Dennis M. Healy: DARPA
Marc Q. Jacobs: AlgoTek, Inc.
Anna Tsao: AlgoTek, Inc.

A chapter in COMPSTAT 2004 — Proceedings in Computational Statistics, 2004, pp 381-392 from Springer

Abstract: We consider the problem of statistical pattern recognition in a heterogeneous, high-dimensional setting. In particular, we consider the search for meaningful cross-category associations in a heterogeneous text document corpus. Our approach involves “iterative denoising”, that is, iteratively extracting (corpus-dependent) features and partitioning the document collection into sub-corpora. We present an anecdote wherein this methodology discovers a meaningful cross-category association in a heterogeneous collection of scientific documents.
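
The loop the abstract describes (re-extract features from the current sub-corpus, partition, recurse) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: TF-IDF weights stand in for the corpus-dependent features, a split around the least-similar pair of documents stands in for the partitioning step, and the names `tfidf_features`, `split_in_two`, `iterative_denoising`, `min_size` and `max_depth` are hypothetical.

```python
import math
from collections import Counter


def tfidf_features(docs):
    """Unit-length TF-IDF vectors computed from `docs` alone (assumed feature choice)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # A term present in every document of this sub-corpus gets weight 0,
        # so the features change whenever the sub-corpus changes.
        vec = {t: c * math.log((1 + n) / (1 + df[t])) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def split_in_two(vectors):
    """Assign each document to the nearer of the two least-similar documents."""
    n = len(vectors)
    a, b = min(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda p: cosine(vectors[p[0]], vectors[p[1]]),
    )
    left, right = [], []
    for i, v in enumerate(vectors):
        nearer_a = cosine(v, vectors[a]) >= cosine(v, vectors[b])
        (left if nearer_a else right).append(i)
    return left, right


def iterative_denoising(docs, ids=None, min_size=2, max_depth=3):
    """Return leaf sub-corpora as lists of indices into `docs`."""
    if ids is None:
        ids = list(range(len(docs)))
    if len(ids) < max(2, 2 * min_size) or max_depth == 0:
        return [ids]
    # Features are recomputed from the current sub-corpus only.
    vectors = tfidf_features([docs[i] for i in ids])
    left, right = split_in_two(vectors)
    if len(left) < min_size or len(right) < min_size:
        return [ids]
    leaves = []
    for part in (left, right):
        sub_ids = [ids[k] for k in part]
        leaves += iterative_denoising(docs, sub_ids, min_size, max_depth - 1)
    return leaves


# Toy tokenized corpus (made up for this sketch).
example_docs = [
    "wavelet transform signal compression".split(),
    "wavelet basis signal denoising".split(),
    "gene expression protein sequence".split(),
    "gene protein folding sequence".split(),
]
leaves = iterative_denoising(example_docs)
print(leaves)
```

The point of the sketch is only the control flow: features are recomputed inside each recursive call, so a term that dominates the full collection can lose its weight within a sub-corpus where every document shares it.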

Keywords: Text document processing; statistical pattern recognition; dimensionality reduction
Date: 2004

Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-7908-2656-2_31

Ordering information: This item can be ordered from
http://www.springer.com/9783790826562

DOI: 10.1007/978-3-7908-2656-2_31

Page updated 2026-06-01
Handle: RePEc:spr:sprchp:978-3-7908-2656-2_31