Four Datasets Derived from an Archive of Personal Homepages (1995–2009)

Sean C. Rife
Sean C. Rife: Department of Psychology, Murray State University, Murray, KY 42071, USA

Data, 2017, vol. 2, issue 2, 1-6

Abstract: While data from social media are easily accessible, understanding how individuals expressed themselves on the Internet in its initial years of public availability (the mid-to-late 1990s) has proved difficult. In this data deposit, I describe how archival data from Geocities homepages were retrieved and processed to remove non-text data, then further refined to create separate datasets, each of which provides unique insights into modes of personal expression on the early Internet. The present paper describes four datasets, all of which were derived from a larger collection of personal websites: (1) a large corpus of raw text data from Geocities personal homepages, (2) a linguistic analysis of basic psychological properties of the same Geocities pages, using an open-source implementation of the Linguistic Inquiry Word Count (LIWC), (3) a dataset of links between homepages (suitable for network analysis), and (4) a manifest dataset summarizing the size and last update date for each file in the dataset. Data from over 378,000 Geocities pages are included. In addition to providing a detailed description of how these datasets were created, I describe how they might be utilized in future research.
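
Illustrative sketch (not taken from the paper): the abstract mentions removing non-text data from archived pages and recording links between homepages. The Python below shows one way such a step could look, using only the standard library. The class name PageExtractor, the function extract_page, and the sample URL are hypothetical and are not the author's actual pipeline.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collects visible text and outgoing link targets from one HTML page.

    Hypothetical helper for illustration; not the paper's code.
    """

    SKIP_TAGS = {"script", "style"}  # contents of these tags are not prose

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_page(html):
    """Return (plain_text, list_of_link_targets) for one HTML document."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


# Made-up sample page; the URL is a placeholder, not a record from the dataset.
sample = (
    "<html><head><style>b{color:red}</style></head><body>"
    "<h1>Welcome</h1>"
    '<p>My <a href="http://www.geocities.com/Area51/1234/">friend</a></p>'
    "<script>var x=1;</script></body></html>"
)
text, links = extract_page(sample)
```

Under these assumptions, the text output would feed a raw-text corpus and the link list would supply edges for a network dataset; real archived pages would likely need extra handling for character encodings and malformed markup.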

Keywords: Internet; linguistics; online culture; Linguistic Inquiry Word Count (LIWC); corpora; homepages; cyberpsychology; network analysis
JEL-codes: C8 C80 C81 C82 C83
Date: 2017

Downloads:
https://www.mdpi.com/2306-5729/2/2/19/pdf (application/pdf)
https://www.mdpi.com/2306-5729/2/2/19/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:2:y:2017:i:2:p:19-:d:101326

Data is currently edited by Ms. Cecilia Yang

Page updated 2025-03-24
Handle: RePEc:gam:jdataj:v:2:y:2017:i:2:p:19-:d:101326