Detecting Binge Drinking and Alcohol-Related Risky Behaviours from Twitter’s Users: An Exploratory Content- and Topology-Based Analysis
Cristina Crocamo,
Marco Viviani,
Francesco Bartoli,
Giuseppe Carrà and
Gabriella Pasi
Additional contact information
Cristina Crocamo: Department of Medicine and Surgery, University of Milano-Bicocca, 20126 Milan, Italy
Marco Viviani: Department of Informatics, Systems, and Communication, University of Milano-Bicocca, 20126 Milan, Italy
Francesco Bartoli: Department of Medicine and Surgery, University of Milano-Bicocca, 20126 Milan, Italy
Giuseppe Carrà: Department of Medicine and Surgery, University of Milano-Bicocca, 20126 Milan, Italy
Gabriella Pasi: Department of Informatics, Systems, and Communication, University of Milano-Bicocca, 20126 Milan, Italy
IJERPH, 2020, vol. 17, issue 5, 1-20
Abstract:
Binge Drinking (BD) is a common risky behaviour that people rarely report to healthcare professionals, although personal communications about alcohol-related behaviours are not uncommon on social media. Following a data-driven approach focused on User-Generated Content, we aimed to detect potential binge drinkers by investigating their language and shared topics. First, we gathered Twitter threads mentioning BD and alcohol-related behaviours, using unequivocal keywords identified by experts from previous evidence on BD. Subsequently, a random sample of the gathered tweets was manually labelled, and two supervised learning classifiers were trained on both linguistic and metadata features to distinguish tweets of genuine unique users from those of media, bot, and commercial accounts. Based on this classification, approximately 55% of the 1 million alcohol-related tweets collected were automatically identified as belonging to non-genuine users. A third classifier was then trained on a subset of manually labelled tweets, drawn from those previously identified as belonging to genuine accounts, to automatically identify potential binge drinkers based only on linguistic features. On average, users classified as binge drinkers were quite similar to the standard genuine Twitter users in our sample. Nonetheless, the analysis of social media content from genuine users reporting risky behaviours remains a promising source for informed preventive programs.
Keywords: binge drinking; vulnerability; risky health behaviour; user-generated content; social media analytics; data science; supervised machine learning
JEL-codes: I I1 I3 Q Q5
Date: 2020
Downloads:
https://www.mdpi.com/1660-4601/17/5/1510/pdf (application/pdf)
https://www.mdpi.com/1660-4601/17/5/1510/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:17:y:2020:i:5:p:1510-:d:325427
IJERPH is currently edited by Ms. Jenna Liu