Toepassing van de bezettingstheorie op een coderingsvraagstuk
Pieter de Wolff
Statistica Neerlandica, 1951, vol. 5, issue 5‐6, 161-170
Abstract:
An application of occupancy theory to a coding problem. For large-scale registration of certain medical phenomena it is very useful to have a coding system that allows a person's code number to be reconstructed at any time from the smallest possible number of personal particulars. Place and date of birth and sex may be used for this purpose. These items alone are not sufficient, however, because they cannot distinguish between persons of the same sex born on the same day in the same place. For Amsterdam this number is about 20. The Inspector of Public Health therefore considered including the mother's birthday in the data and asked the Municipal Bureau of Statistics of Amsterdam to estimate the percentage of duplications that would occur if this enlarged set of particulars were used. A special investigation for September 1951 yielded a rate of 10%, which seemed rather high even when twins, for whom duplications occur by their very nature, were taken into account. It therefore became of interest to know what rate might be expected theoretically. If the birthdays of the mothers of persons born on the same day and in the same place are assumed to be evenly distributed over the days of the year, the expected number of duplications is easy to calculate. This leads to the following occupancy problem: n objects (the births on a given day) are distributed over N places (the possible birthdays of the mothers), and the probability of occupying a given place is the same for all objects and does not depend on the place. What is the probability that n0 places will be empty, n1 occupied by exactly one object, n2 by two objects, etc.? It is shown that this probability is equal to [formula missing in the source]. From this formula the expectation and the variance of the number of persons with equal code numbers can easily be derived.
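For orientation, the occupancy probability asked for above can be written out under the stated uniformity assumption. The expression below is the standard combinatorial result for n distinguishable objects placed independently and uniformly in N places; the notation (the index i and the counts n_i) is an assumption made here and need not match the form printed in the article.

```latex
% n_i = number of places holding exactly i objects (i = 0, 1, 2, ...),
% subject to  \sum_i n_i = N  and  \sum_i i\, n_i = n.
P(n_0, n_1, n_2, \dots)
  = \frac{N!\; n!}{N^{n} \, \prod_{i \ge 0} n_i! \, (i!)^{n_i}}
```

The factor N!/∏ n_i! counts the ways of choosing which places receive which occupancy, n!/∏ (i!)^{n_i} counts the ways of dividing the objects among them, and N^n is the total number of equally likely assignments.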
The empirical results for four different months of 1951 (including the month of September already mentioned) proved to be in good agreement with these theoretical calculations.
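The expectation mentioned in the abstract can be illustrated with a minimal sketch, assuming mothers' birthdays are uniform over N = 365 days and independent. The function names, the default N, and the Monte Carlo check are illustrative choices made here, not taken from the article. A given person shares a code with somebody else unless all n - 1 others fall on different days, so the expected number of persons with a non-unique code is n(1 - (1 - 1/N)^(n-1)).

```python
import random


def expected_duplicated_persons(n: int, N: int = 365) -> float:
    """Expected number of persons whose place is shared with at least one other.

    Assumes n objects are placed independently and uniformly in N places.
    """
    # A fixed person is alone with probability (1 - 1/N)^(n - 1).
    return n * (1.0 - (1.0 - 1.0 / N) ** (n - 1))


def simulate_duplicated_persons(n: int, N: int = 365,
                                trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counts = {}
        for _ in range(n):
            day = rng.randrange(N)  # uniform birthday index in 0..N-1
            counts[day] = counts.get(day, 0) + 1
        # Count every person sitting in a place occupied more than once.
        total += sum(c for c in counts.values() if c > 1)
    return total / trials


if __name__ == "__main__":
    n = 20  # group size of the order quoted in the abstract for Amsterdam
    print(expected_duplicated_persons(n))
    print(simulate_duplicated_persons(n))
```

Dividing either result by n gives the expected share of persons with a duplicated code under this idealized model; twins and any non-uniformity in birthdays are deliberately left out.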
Date: 1951
DOI: https://doi.org/10.1111/j.1467-9574.1951.tb00584.x
Persistent link: https://EconPapers.repec.org/RePEc:bla:stanee:v:5:y:1951:i:5-6:p:161-170