Towards the automation of address identification

Fernanda Morillo, Javier Aparicio, Borja González-Albo and Luz Moreno
Additional contact information
Fernanda Morillo: Spanish National Research Council (CSIC)
Javier Aparicio: Spanish National Research Council (CSIC)
Borja González-Albo: Spanish National Research Council (CSIC)
Luz Moreno: Spanish National Research Council (CSIC)

Scientometrics, 2013, vol. 94, issue 1, No 12, 207-224

Abstract: A new semi-automatic method is presented to standardize or codify addresses in order to produce bibliometric indicators from bibliographic databases. The hypothesis is that this method normalizes authors’ addresses reliably and is easy and quick to apply. To test it, a set of previously hand-coded data is used to verify its reliability: 136,821 Spanish documents (2006–2008) downloaded from the Web of Science database. Unique addresses from this set were selected to produce a list of keywords representing various institutional sectors. Once the list of terms is obtained, the addresses are standardized with it and the result is compared with the hand-coded data. The association between the two systems (automatic and hand-coding) is analyzed by calculating recall and precision, together with some directional and symmetric statistical measures. The outcome shows good agreement between both methods. Although these results are quite general, this overview of institutional sectors is a good starting point for a second stage aimed at selecting particular centers. The system is novel in that it does not rely on pre-existing master lists or tables, and it contributes to the automation of tasks. The validity of the hypothesis is confirmed not only by the statistical measures but also by the fact that obtaining general and detailed scientific output becomes less time-consuming, and will become even less so as the resulting master tables are reused for the same kind of data. The same method could be applied to any country and/or database by creating a new master list that takes their specific characteristics into account.
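The keyword-matching and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sector names, the keyword prefixes in `SECTOR_KEYWORDS`, the first-match rule, and the sample addresses are all hypothetical, chosen only to show how an automatic assignment might be compared with hand-coded labels through precision and recall.

```python
import re
import unicodedata

# Hypothetical keyword prefixes per institutional sector (not the paper's master list).
SECTOR_KEYWORDS = {
    "university": ("univ",),
    "hospital": ("hosp", "clin"),
    "research_council": ("csic",),
}


def tokenize(address):
    """Lowercase an address, strip accents, and split it into alphanumeric tokens."""
    decomposed = unicodedata.normalize("NFKD", address)
    ascii_like = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.findall(r"[a-z0-9]+", ascii_like.lower())


def classify(address):
    """Return the first sector whose keyword prefix matches any token (assumed rule)."""
    tokens = tokenize(address)
    for sector, prefixes in SECTOR_KEYWORDS.items():
        if any(tok.startswith(p) for tok in tokens for p in prefixes):
            return sector
    return "unknown"


def precision_recall(predicted, hand_coded, sector):
    """Precision and recall of the automatic labels for one sector."""
    pairs = list(zip(predicted, hand_coded))
    tp = sum(p == sector and h == sector for p, h in pairs)
    fp = sum(p == sector and h != sector for p, h in pairs)
    fn = sum(p != sector and h == sector for p, h in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented sample: (address, hand-coded sector).
SAMPLE = [
    ("Univ Complutense Madrid, Fac Med, Madrid, Spain", "university"),
    ("Hosp Clín Barcelona, Barcelona, Spain", "hospital"),
    ("CSIC, Inst Quim Fis, Madrid, Spain", "research_council"),
    ("Hosp Univ La Paz, Madrid, Spain", "hospital"),
]

predicted = [classify(addr) for addr, _ in SAMPLE]
hand_coded = [label for _, label in SAMPLE]

for sector in SECTOR_KEYWORDS:
    print(sector, precision_recall(predicted, hand_coded, sector))
```

In this toy sample the university hospital address matches the `univ` prefix first and is assigned to the wrong sector, which lowers precision for one sector and recall for another. Ambiguities of this kind are what a comparison against hand-coded data is meant to expose.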

Keywords: Address identification; Data mining; Automatic standardization; Performance evaluation; Bibliographic databases
Date: 2013
Citations: 4 (in EconPapers)

Downloads: (external link)
http://link.springer.com/10.1007/s11192-012-0733-6 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:94:y:2013:i:1:d:10.1007_s11192-012-0733-6

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192

DOI: 10.1007/s11192-012-0733-6

Scientometrics is currently edited by Wolfgang Glänzel

More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:scient:v:94:y:2013:i:1:d:10.1007_s11192-012-0733-6