Stop Splitting Hairs: The Problems with Dichotomizing Continuous Data in Language Research

Shawn Hemelstrand and Tomohiro Inoue

No 8usxn, OSF Preprints from Center for Open Science

Abstract: It is common in the language sciences to dichotomize continuous data in order to fit them to statistical tests. However, statisticians and methodologists have warned against this practice for years, and many in the language sciences seem unaware of the problem. Because the language science literature lacks modern, robust, and open data simulations on this issue, our paper provides an empirical investigation of the practice. Across six different simulations, our analysis shows that dichotomization almost universally increases standard errors and consequently leads to inaccurate tests of statistical significance. Furthermore, effect sizes are often diminished by the reduction of available information in the data. Our paper concludes by providing suggestions and considerations for future empirical studies.
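
Illustration: the loss of effect size described in the abstract can be sketched with a small simulation. This is not the authors' simulation code. It is a minimal sketch under assumed settings: a standard normal predictor, a true correlation of 0.5, a median split, and a fixed seed. The names `pearson_r` and `simulate` are hypothetical helpers introduced only for this example.

```python
import math
import random
import statistics


def pearson_r(a, b):
    """Pearson correlation computed from first principles."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=5000, true_r=0.5, seed=1):
    """Return (r with continuous x, r with median-split x).

    All parameter values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    # Continuous predictor drawn from a standard normal distribution.
    x = [rng.gauss(0, 1) for _ in range(n)]
    # Outcome built so that its population correlation with x is true_r.
    noise_sd = math.sqrt(1 - true_r ** 2)
    y = [true_r * xi + rng.gauss(0, noise_sd) for xi in x]
    # Median split: recode the predictor as 0 (low) or 1 (high).
    cut = statistics.median(x)
    x_split = [1.0 if xi > cut else 0.0 for xi in x]
    return pearson_r(x, y), pearson_r(x_split, y)


r_continuous, r_dichotomized = simulate()
print(f"continuous predictor:   r = {r_continuous:.3f}")
print(f"median-split predictor: r = {r_dichotomized:.3f}")
```

Under these assumptions the correlation with the median-split predictor should come out noticeably smaller than the correlation with the continuous one. For a normally distributed predictor split at the median, the expected shrinkage factor is roughly 0.80, so about a fifth of the observed correlation is lost before any test is run.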

Date: 2025-01-14
Downloads: (external link)
https://osf.io/download/678660fca4385e6cd3e241ae/

Persistent link: https://EconPapers.repec.org/RePEc:osf:osfxxx:8usxn

DOI: 10.31219/osf.io/8usxn

 
Page updated 2025-03-19
Handle: RePEc:osf:osfxxx:8usxn