Taalstatistiek*
H. Brandt Corstius
Statistica Neerlandica, 1964, vol. 18, issue 4, 353-367
Abstract:
Discussion of the possibilities and limitations of Language Statistics. Sampling of “normal” prose is difficult as the “universe of written Dutch” is undefined. What predictions can be made about the contents of tomorrow's newspaper? The letter frequencies are very stable. Frequencies of bigrams and trigrams of letters bring us towards the word level about which something may be predicted. The sentence level can only be dealt with after a mechanical sentence analysis method is available. The highest level is still outside the scientific domain and belongs to the literary critic. Literary Statistics should be founded on a solid knowledge of Language Statistics. It is considered inappropriate to begin this field with difficult historical problems. The mutual distrust between linguist and statistician requires much tact on both sides. Eventually Language Statistics may become one of the bridges between the “two cultures”. The modern computer is an indispensable tool both for practical reasons (the giant mass of material) and for theoretical ones (the need to give unambiguous definitions of concepts like “sentence”, “word” and “syllable”). The use of Language Statistics in Mechanical Translation research is discussed. Review of the activities in the Netherlands. A one million word count is on its way. The speaker pleads for a National Center for Lexicology and Language Statistics.
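The letter, bigram and trigram frequencies mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the author's procedure: the function names `letter_ngrams` and `relative_frequencies`, the toy sample sentence, and the choice to treat a "word" as a whitespace-separated token stripped of non-letters are all assumptions made for illustration. That last choice is one example of the unambiguous definition the abstract says a computer forces one to give.

```python
from collections import Counter


def letter_ngrams(text: str, n: int) -> Counter:
    """Count letter n-grams within words; n-grams never span a word boundary.

    Assumption: a "word" is a whitespace-separated token with every
    non-alphabetic character removed.
    """
    counts = Counter()
    for token in text.lower().split():
        letters = "".join(ch for ch in token if ch.isalpha())
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    return counts


def relative_frequencies(counts: Counter) -> dict:
    """Convert raw counts to relative frequencies (empty dict if no counts)."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


# Toy sample, not data from the article.
sample = "de kat zit op de mat"
unigrams = letter_ngrams(sample, 1)
bigrams = letter_ngrams(sample, 2)
trigrams = letter_ngrams(sample, 3)

for gram, freq in sorted(relative_frequencies(bigrams).items()):
    print(f"{gram}: {freq:.3f}")
```

On a sample this small the frequencies are meaningless; the stability the abstract refers to would only show up on a large corpus.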
Date: 1964
Downloads: (external link)
https://doi.org/10.1111/j.1467-9574.1964.tb00523.x
Persistent link: https://EconPapers.repec.org/RePEc:bla:stanee:v:18:y:1964:i:4:p:353-367
Statistica Neerlandica is currently edited by Miroslav Ristic, Marijtje van Duijn and Nan van Geloven
More articles in Statistica Neerlandica from Netherlands Society for Statistics and Operations Research