Knowing what you get when seeking semantic similarity: exploring classic NLP method biases

Saint-Charles, Johanne; Mongeau, Pierre; Renaud-Desjardins, Louis

Chapter 3 in Handbook of Social Computing, 2024, pp 27-46 from Edward Elgar Publishing

Abstract: Various Natural Language Processing (NLP) methods are called upon to establish similarity between texts in the context of socio-semantic studies. This chapter addresses the methodological diversity in the field by asking to what extent classical NLP methods converge in their identification of similarity between various texts. We compare the results of four NLP methods that are well known and often used in the social sciences and humanities (Jaccard, LDA, LSA and TF–IDF) on corpora with different characteristics. Results show that each of these methods has specific biases and that they cannot be substituted for one another. Our observations invite social sciences and humanities scholars to consider new criteria for selecting an NLP method suited to their research objectives.
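
To make the abstract's comparison concrete, the following is a minimal sketch of two of the four methods it names, Jaccard similarity and TF–IDF with cosine similarity, applied to the same pair of texts. It is not the chapter's implementation: the toy corpus, the whitespace tokenizer, the smoothed IDF formula and all function names are assumptions chosen for illustration, and LDA and LSA are left out because they need a fitted model.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop-word removal.
    return text.lower().split()


def jaccard(a, b):
    # Jaccard similarity: shared vocabulary divided by combined vocabulary.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (a dict) per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus, not taken from the chapter.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cat and dog",
]

vecs = tfidf_vectors(docs)
jac_01 = jaccard(docs[0], docs[1])
cos_01 = cosine(vecs[0], vecs[1])
print(f"Jaccard(doc0, doc1)        = {jac_01:.3f}")
print(f"TF-IDF cosine(doc0, doc1)  = {cos_01:.3f}")
```

Jaccard looks only at which words are present, while TF–IDF also weights words by how often they occur in a text and how rare they are across the corpus, so the two scores for the same pair of texts need not agree.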

Keywords: Business and Management; Innovations and Technology; Sociology and Social Policy
Date: 2024

Downloads:
https://www.elgaronline.com/doi/10.4337/9781803921259.00009

Persistent link: https://EconPapers.repec.org/RePEc:elg:eechap:21469_3
