A complement to lexical query’s search-term selection for emerging technologies: the case of “big data”
Santiago Ruiz-Navas (ruiz.s.aa@m.titech.ac.jp) and
Kumiko Miyazaki (miyazaki.k.ae@m.titech.ac.jp)
Additional contact information
Santiago Ruiz-Navas: Tokyo Institute of Technology
Kumiko Miyazaki: Tokyo Institute of Technology
Scientometrics, 2018, vol. 117, issue 1, No 9, 162 pages
Abstract:
Obtaining document sets to study emerging technologies is challenging. Researchers studying emerging technologies use lexical queries, e.g., core, expanded and evolutionary, to face this challenge. Creating lexical queries requires the selection of search-terms. Manual, automatic and semi-automatic techniques can be implemented to select search-terms. The currently reported processes for selecting search-terms can be complemented by addressing two issues. The first is the lack of a systematic process for selecting search-terms from previous literature, and the second is the evaluation of candidate search-terms’ document retrieval interdependence. We propose two steps to complement the process of selecting search-terms to create lexical queries to study emerging technologies. The first step consists of a process to systematically select search-terms from previous literature. The second is an evaluation of search-terms’ document retrieval interdependence, for which we propose the Significance of Interception Ratio (SIR). We tested our proposed steps using as a reference the big-data lexical query proposed by Huang et al. (Scientometrics 105:2005–2022, 2015). The test results show that the proposed steps can complement the current automatic methods to select search-terms. The first step increased the recall of the reference lexical query by around 24%. This increase was possible because of the addition of 37 search-terms and the elimination of three search-terms from the reference lexical query. In the second step (application of the SIR), five search-terms from the reference lexical query were optimized, showing a slight complementary ability when selecting search-terms.
Keywords: Big-data; Emerging technologies; Science reproducibility; Lexical query expansion; Search-terms selection; 97E30; 91B99
JEL-codes: C60 C80 O32 Q55
Date: 2018
Downloads: (external link)
http://link.springer.com/10.1007/s11192-018-2857-9 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:117:y:2018:i:1:d:10.1007_s11192-018-2857-9
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192
DOI: 10.1007/s11192-018-2857-9
Scientometrics is currently edited by Wolfgang Glänzel
More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla (sonal.shukla@springer.com) and Springer Nature Abstracting and Indexing (indexing@springernature.com).