SiSOB Data Extraction and Codification: A Tool to Analyse Scientific Careers
Aldo Geuna, Rodrigo Kataishi, Manuel Toselli, Eduardo Guzmán, Cornelia Lawson, Ana Fernandez-Zubieta and Beatriz Barros
Additional contact information
Manuel Toselli: Department of Economics and Statistics Cognetti De Martiis, University of Turin, Italy and BRICK, Collegio Carlo Alberto, Moncalieri (Turin), Italy
Eduardo Guzmán: Department of Languages and Computer Science, University of Malaga, Spain
Ana Fernandez-Zubieta: Institute for Advanced Social Studies - Spanish Council for Scientific Research
Beatriz Barros: Department of Languages and Computer Science, University of Malaga, Spain
Authors registered in the RePEc Author Service: Ana Fernández Zubieta
SPRU Working Paper Series from SPRU - Science Policy Research Unit, University of Sussex Business School
Abstract:
This paper describes the methodology and software tool used to build a database on the careers and productivity of academics from public information available on the Internet, and provides a first analysis of the data collected for a sample of 360 US scientists funded by the National Institutes of Health (NIH) and 291 UK scientists funded by the Biotechnology and Biological Sciences Research Council (BBSRC). The tool’s structured outputs can be used either for econometric research or for data representation in policy analysis. The methodology and software tool are validated on a sample of US and UK biomedical scientists, but can be applied to any country where scientists’ CVs are available in English. We provide an overview of the motivations for constructing the database, and of the data crawling and data mining techniques used to transform webpage-based information and CV information into a relational database. We describe the database and the effectiveness of our algorithms, and provide suggestions for further improvements. The software is released as free software under the GNU General Public License; the aim is to make it available to the community of social scientists and economists interested in analysing scientific production and scientific careers, in the hope that they will develop the tool further.
Keywords: Information retrieval; Extraction and data integration; Academic careers; Research productivity; Mobility of Research Scientists
JEL-codes: C81 C88 I23 O31
Date: 2015-01
Downloads: (external link)
https://www.sussex.ac.uk/webteam/gateway/file.php? ... una-etal.pdf&site=25
Related works:
Journal Article: SiSOB data extraction and codification: A tool to analyze scientific careers (2015) 
Working Paper: SiSOB Data Extraction and Codification: A tool to analyse scientific careers (2015) 
Persistent link: https://EconPapers.repec.org/RePEc:sru:ssewps:2015-03