A family of block-wise one-factor distributions for modeling high-dimensional binary data
Matthieu Marbac and Mohammed Sedki
Computational Statistics & Data Analysis, 2017, vol. 114, issue C, 130-145
Abstract:
A new family of one-factor distributions for modeling high-dimensional binary data is introduced. The model provides an explicit probability for each event, thus avoiding the numerical approximations often made by existing methods. Model interpretation is easy, because each variable is described by two continuous parameters (corresponding to its marginal probability and to the strength of its dependency on the other variables) and by one binary parameter (indicating whether the dependencies are positive or negative). This model is extended by splitting the variables into independent blocks, where each block follows the new one-factor distribution. Finally, a parsimonious version of the model, which imposes equality constraints between the dependency parameters, is proposed. Parameter estimation is carried out by an inference function for margins (IFM) procedure, where the second step is achieved by an expectation–maximization algorithm. Model selection is performed by a deterministic approach, which strongly reduces the number of competing models. This consistent approach uses a hierarchical ascendant classification of the variables, based on the empirical version of Cramér’s V, to select a narrow subset of models. The new model is evaluated on numerical experiments and on a real data set. The procedure is implemented in the R package MvBinary.
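The block-selection idea in the abstract (cluster the variables hierarchically using the empirical Cramér's V, and keep only the partitions met along the way as candidate models) can be sketched in a few lines. This is an illustrative sketch only, not the MvBinary implementation: the function names `cramers_v` and `candidate_partitions` are hypothetical, the dissimilarity 1 - V and the complete-linkage rule are assumptions made here, and the toy data are invented for the example.

```python
from math import sqrt

def cramers_v(x, y):
    """Empirical Cramér's V for two equal-length 0/1 sequences.
    For a 2x2 table it equals the absolute value of the phi coefficient."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:  # a constant variable carries no dependency information
        return 0.0
    return abs(n11 * n00 - n10 * n01) / sqrt(denom)

def candidate_partitions(columns):
    """Agglomerative clustering of variables with 1 - V as dissimilarity.
    Returns every partition visited, from all singletons down to one block.
    Complete linkage is an assumption of this sketch."""
    d = len(columns)
    dist = [[1.0 - cramers_v(columns[i], columns[j]) for j in range(d)]
            for i in range(d)]
    blocks = [[j] for j in range(d)]
    path = [[list(b) for b in blocks]]
    while len(blocks) > 1:
        best = None
        for a in range(len(blocks)):
            for b in range(a + 1, len(blocks)):
                # complete linkage: largest dissimilarity between the two blocks
                link = max(dist[i][j] for i in blocks[a] for j in blocks[b])
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        merged = sorted(blocks[a] + blocks[b])
        blocks = [blk for k, blk in enumerate(blocks) if k not in (a, b)]
        blocks.append(merged)
        path.append([list(blk) for blk in blocks])
    return path

# Toy data (invented): x1 copies x0, x3 is the complement of x2,
# and x0 is empirically unrelated to x2.
x0 = [1, 1, 0, 0, 1, 0, 1, 0]
x1 = [1, 1, 0, 0, 1, 0, 1, 0]
x2 = [1, 0, 1, 0, 0, 1, 1, 0]
x3 = [0, 1, 0, 1, 1, 0, 0, 1]
path = candidate_partitions([x0, x1, x2, x3])
for partition in path:
    print(partition)
```

With d variables the path contains only d partitions, rather than every possible way of splitting the variables into blocks, which is the sense in which a deterministic search narrows the set of competing models. Because Cramér's V ignores the sign of the association, a variable and its complement (x2 and x3 above) land in the same block; in the model described in the abstract, the sign is carried by the separate binary parameter.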
Keywords: Binary data; EM algorithm; High-dimensional data; IFM procedure; Model selection; One-factor copulas
Date: 2017
Downloads:
http://www.sciencedirect.com/science/article/pii/S0167947317300932
Full text for ScienceDirect subscribers only.
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:114:y:2017:i:c:p:130-145
DOI: 10.1016/j.csda.2017.04.010
Computational Statistics & Data Analysis is currently edited by S.P. Azen
Bibliographic data for series maintained by Catherine Liu.