Automatic Generation of Association Thesaurus Based on Domain-Specific Text Collection
Aliya Nugumanova,
Dinara Issabaeva and
Yerzhan Baiburin
Additional contact information
Aliya Nugumanova: East Kazakhstan State Technical University
Dinara Issabaeva: Kumash Nurgaliev College
Yerzhan Baiburin: East Kazakhstan State Technical University
No 201861, Proceedings of International Academic Conferences from International Institute of Social and Economic Sciences
Abstract:
This work examines a distributional approach to the automatic generation of associative thesauri for a specific domain. The distributional approach is based on the assumption that an associative link between terms of a domain is determined by the statistics of their co-occurrence in thematically related discourses. Its advantage is that it works from raw material (for example, a collection of documents from the domain) and requires no additional knowledge about the domain. The approach relies only on the mathematical apparatus of statistics and takes into account neither lexical nor semantic information, which is why it can cover an extensive lexical space of terms. However, this also leads to its main shortcoming: it produces an excessive number of "unnecessary" links among words, links that are of little informational value from a utilitarian point of view. To address this problem, the work proposes a combined approach that brings together methods of distributional statistics, latent semantic analysis and graph theory.
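As a rough illustration of the statistical part of this idea, the following Python sketch links two terms when a chi-square test on their document-level co-occurrence suggests a positive association, and stores the resulting links as a graph. It is not the authors' implementation: the function names, the whitespace tokenization, the toy documents and the 3.84 threshold (the 5% critical value for one degree of freedom) are all assumptions made for this example, and the latent semantic analysis step is omitted entirely.

```python
from collections import defaultdict
from itertools import combinations


def chi_square(n11, n10, n01, n00):
    """Pearson chi-square statistic for a 2x2 contingency table."""
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def association_graph(documents, threshold=3.84):
    """Build a term graph from document-level co-occurrence counts.

    Two terms are linked when they are positively associated and the
    chi-square statistic exceeds `threshold` (an assumed cut-off).
    """
    # Assumption: naive lowercase whitespace tokenization.
    doc_sets = [set(doc.lower().split()) for doc in documents]
    n_docs = len(doc_sets)

    doc_freq = defaultdict(int)   # term -> number of documents containing it
    co_freq = defaultdict(int)    # (term_a, term_b) -> documents containing both
    for terms in doc_sets:
        for term in terms:
            doc_freq[term] += 1
        for a, b in combinations(sorted(terms), 2):
            co_freq[(a, b)] += 1

    graph = defaultdict(dict)
    for (a, b), n11 in co_freq.items():
        n10 = doc_freq[a] - n11          # a without b
        n01 = doc_freq[b] - n11          # b without a
        n00 = n_docs - n11 - n10 - n01   # neither term
        if n11 * n00 <= n10 * n01:
            continue                     # skip non-positive associations
        score = chi_square(n11, n10, n01, n00)
        if score > threshold:
            graph[a][b] = score
            graph[b][a] = score
    return dict(graph)


# Toy collection, invented for illustration only.
docs = [
    "neural network training",
    "neural network layers",
    "neural network weights",
    "market price inflation",
    "market price demand",
    "market price supply",
]
graph = association_graph(docs)
for term in sorted(graph):
    print(term, sorted(graph[term]))
```

In this toy collection only the pairs that always appear together pass the threshold, while terms that co-occur once are filtered out. On a real domain corpus far more pairs would pass, which is the excess of links that the abstract proposes to reduce with latent semantic analysis and graph methods.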
Keywords: LSA; thesaurus; chi-square test; graph
JEL-codes: C80
Pages: 10 pages
Date: 2014-06
Published in Proceedings of the 10th International Academic Conference, Vienna, Jun 2014, pages 529-538
Downloads: (external link)
https://iises.net/proceedings/10th-international-a ... id=2&iid=68&rid=1861 First version, 2014
Persistent link: https://EconPapers.repec.org/RePEc:sek:iacpro:0201861
Bibliographic data for series maintained by Klara Cermakova.