Readability and the Web

Martin, Ludger; Gottron, Thomas

Ludger Martin and Thomas Gottron
Additional contact information
Ludger Martin: Institute of Computer Science, Johannes Gutenberg University Mainz, Mainz 55128, Germany
Thomas Gottron: Institute for Web Science and Technologies, Universität Koblenz-Landau, Koblenz 56070, Germany

Future Internet, 2012, vol. 4, issue 1, 1-15

Abstract: Readability indices measure how easy or difficult it is to read and comprehend a text. In this paper we look at the relation between readability indices and web documents from two different perspectives. On the one hand, we analyse how to reliably measure the readability of web documents by applying content extraction techniques and incorporating a bias correction. On the other hand, we investigate how web-based corpus statistics can be used to measure readability in a novel and language-independent way.

Keywords: web document readability; content extraction; corpus statistics
JEL-codes: O3
Date: 2012

Downloads:
https://www.mdpi.com/1999-5903/4/1/238/pdf (application/pdf)
https://www.mdpi.com/1999-5903/4/1/238/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:4:y:2012:i:1:p:238-252:d:16606

Future Internet is currently edited by Ms. Afra Wang