EconPapers    
Economics at your fingertips  
 

New event detection and topic tracking in Turkish

Fazli Can, Seyit Kocberber, Ozgur Baglioglu, Suleyman Kardas, H. Cagdas Ocalan and Erkan Uyar

Journal of the American Society for Information Science and Technology, 2010, vol. 61, issue 4, 802-819

Abstract: Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These problems focus on finding the first stories of new events and identifying all subsequent stories on a certain topic defined by a small number of sample stories. In this work, we introduce the first large‐scale TDT test collection for Turkish, and investigate the NED and TT problems in this language. We present our test‐collection‐construction approach, which is inspired by the TDT research initiative. We show that in TDT for Turkish with some similarity measures, a simple word truncation stemming method can compete with a lemmatizer‐based stemming approach. Our findings show that contrary to our earlier observations on Turkish information retrieval, in NED word stopping has an impact on effectiveness. We demonstrate that the confidence scores of two different similarity measures can be combined in a straightforward manner for higher effectiveness. The influence of several similarity measures on effectiveness also is investigated. We show that it is possible to deploy TT applications in Turkish that can be used in operational settings.

Date: 2010
References: Add references at CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
https://doi.org/10.1002/asi.21264

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:61:y:2010:i:4:p:802-819

Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1532-2890

Access Statistics for this article

More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery ().

 
Page updated 2025-03-19
Handle: RePEc:bla:jamist:v:61:y:2010:i:4:p:802-819