The frequency spectrum of finite samples from the intermittent silence process

Ramon Ferrer‐i‐Cancho and Ricard Gavaldà

Journal of the American Society for Information Science and Technology, 2009, vol. 60, issue 4, 837-843

Abstract: It has been argued that the actual distribution of word frequencies could be reproduced or explained by generating a random sequence of letters and spaces according to the so‐called intermittent silence process. The same kind of process could reproduce or explain the counts of other kinds of units from a wide range of disciplines. Taking the linguistic metaphor, we focus on the frequency spectrum, i.e., the number of words with a certain frequency, and the vocabulary size, i.e., the number of different words of text generated by an intermittent silence process. We derive and explain how to calculate accurately and efficiently the expected frequency spectrum and the expected vocabulary size as a function of the text size.
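The quantities named in the abstract can be illustrated with a small simulation. The following is a minimal Monte Carlo sketch, not the authors' analytical method: it assumes equiprobable letters, a fixed probability `p_space` of ending a word after each letter, words with at least one letter, and text size measured in words. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def generate_words(num_words, alphabet="ab", p_space=0.3, rng=None):
    """Generate num_words words from a simple intermittent silence process.

    Assumptions (not taken from the paper): letters are equiprobable,
    every word has at least one letter, and after each letter a space
    ends the word with probability p_space.
    """
    rng = rng or random.Random()
    words = []
    for _ in range(num_words):
        letters = [rng.choice(alphabet)]
        # Keep adding letters until a space is drawn.
        while rng.random() >= p_space:
            letters.append(rng.choice(alphabet))
        words.append("".join(letters))
    return words


def frequency_spectrum(words):
    """Map each frequency f to the number of distinct words occurring f times."""
    word_counts = Counter(words)
    return dict(Counter(word_counts.values()))


def vocabulary_size(words):
    """Number of different words in the text."""
    return len(set(words))


# One sample text of 1000 words with a fixed seed for repeatability.
words = generate_words(1000, rng=random.Random(0))
spectrum = frequency_spectrum(words)
vocab = vocabulary_size(words)
```

Averaging `spectrum` and `vocab` over many independent samples would give empirical estimates of the expected frequency spectrum and expected vocabulary size for a given text size; the article itself concerns deriving and computing these expectations directly rather than by simulation.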

Date: 2009

Downloads: (external link)
https://doi.org/10.1002/asi.21033

Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:60:y:2009:i:4:p:837-843

More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery.

Page updated 2025-03-19
Handle: RePEc:bla:jamist:v:60:y:2009:i:4:p:837-843