Stereotypical gender actions can be extracted from web text
Amaç Herdağdelen and Marco Baroni
Journal of the American Society for Information Science and Technology, 2011, vol. 62, issue 9, 1741-1749
Abstract:
We extracted gender‐specific actions from text corpora and Twitter and compared them with people's stereotypical expectations. We used Open Mind Common Sense (OMCS), a common sense knowledge repository, to focus on actions that are pertinent to common sense and to everyday human life. We used the gender information of Twitter users and web‐corpus‐based pronoun/name gender heuristics to compute the gender bias of the actions. With high recall, we obtained a Spearman correlation of 0.47 between corpus‐based predictions and a human gold standard, and an area under the ROC curve of 0.76 when predicting the polarity of the gold standard. We conclude that it is feasible to use natural text (and a Twitter‐derived corpus in particular) to augment common sense repositories with the stereotypical gender expectations of actions. We also present a dataset of 441 common sense actions with human judges' ratings of whether each action is typically/slightly masculine/feminine (or neutral), and a larger dataset of 21,442 actions automatically rated by the methods we investigate in this study.
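The evaluation described in the abstract (a bias score per action, a Spearman correlation against human ratings, and an area under the ROC curve for polarity) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the log-ratio bias formula, the add-one smoothing, the -2 to +2 coding of the rating scale, the placeholder action names, and the toy counts are not taken from the paper.

```python
from math import log


def gender_bias(male_count, female_count, smoothing=1.0):
    """Hypothetical bias score: > 0 leans masculine, < 0 leans feminine."""
    return log((male_count + smoothing) / (female_count + smoothing))


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def roc_auc(scores, labels):
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): (male-attributed count, female-attributed count) per action.
counts = {"action_a": (40, 10), "action_b": (12, 30),
          "action_c": (20, 20), "action_d": (5, 45)}
# Toy human ratings, assumed coding: -2 typically feminine ... +2 typically masculine.
human = {"action_a": 2, "action_b": -1, "action_c": 0, "action_d": -2}

names = sorted(counts)
predicted = [gender_bias(*counts[name]) for name in names]
gold = [human[name] for name in names]
rho = spearman(predicted, gold)

# Polarity task: drop neutral actions, then treat "masculine" as the positive class.
polar = [(p, g > 0) for p, g in zip(predicted, gold) if g != 0]
auc = roc_auc([p for p, _ in polar], [is_masc for _, is_masc in polar])
print(f"Spearman rho = {rho:.2f}, AUC = {auc:.2f}")
```

The AUC here uses the pairwise-comparison definition, which avoids building an explicit ROC curve and keeps the sketch free of third-party dependencies.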
Date: 2011
https://doi.org/10.1002/asi.21579
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:62:y:2011:i:9:p:1741-1749
More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology