A note on large-scale logistic prediction: Using an approximate graphical model to deal with collinearity and missing data

Open Access
Authors
Publication date 07-2017
Journal Behaviormetrika
Volume | Issue number 44 | 2
Pages (from-to) 513-534
Organisations
  • Faculty of Social and Behavioural Sciences (FMG)
  • Faculty of Social and Behavioural Sciences (FMG) - Psychology Research Institute (PsyRes)
Abstract
Large-scale prediction problems are often plagued by correlated predictor variables and missing observations. We consider prediction settings in which logistic regression models are used, and propose a novel approach to make accurate predictions even when predictor variables are highly correlated and only partly observed. Our approach comprises three steps. Firstly, to overcome the collinearity issue, we propose to model the joint distribution of the outcome variable and the predictor variables using the Ising network model. Secondly, to render the application of Ising networks feasible, we use a latent variable representation to apply a low-rank approximation to the network’s connectivity matrix. Finally, we propose an approximation to the latent variable distribution that is used in the representation to handle missing observations. We demonstrate our approach with numerical illustrations.
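The link between the Ising network model and logistic regression that the abstract relies on can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes 0/1 coding of the variables, a symmetric connectivity matrix with zero diagonal, and treats variable 0 as the outcome. All names (`joint_table`, `conditional_logistic`, `low_rank`, `mu`, `sigma`) are hypothetical. The low-rank step shown is a plain eigenvalue truncation, which may differ from the latent variable construction used in the article.

```python
import itertools
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def joint_table(mu, sigma):
    """Enumerate all 2^p states of an Ising model with 0/1 coding.

    Unnormalised log-probability: mu'x + sum_{i<j} sigma_ij x_i x_j,
    written as mu'x + 0.5 x'sigma x for a symmetric, zero-diagonal sigma.
    """
    states = np.array(list(itertools.product([0.0, 1.0], repeat=len(mu))))
    logw = states @ mu + 0.5 * np.einsum("si,ij,sj->s", states, sigma, states)
    w = np.exp(logw - logw.max())  # subtract the max for numerical stability
    return states, w / w.sum()


def conditional_from_joint(states, probs, x_rest):
    """P(x_0 = 1 | x_1..x_{p-1} = x_rest), computed from the full joint."""
    match = np.all(states[:, 1:] == x_rest, axis=1)
    on = match & (states[:, 0] == 1.0)
    return probs[on].sum() / probs[match].sum()


def conditional_logistic(mu, sigma, x_rest):
    """The same conditional in closed form: a logistic regression on x_rest."""
    return sigmoid(mu[0] + sigma[0, 1:] @ x_rest)


def low_rank(sigma, r):
    """Rank-r approximation keeping the r largest-magnitude eigenvalues.

    Assumption: a generic truncation for illustration. Note that it
    generally does not preserve the zero diagonal of sigma.
    """
    vals, vecs = np.linalg.eigh(sigma)
    keep = np.argsort(np.abs(vals))[-r:]
    return (vecs[:, keep] * vals[keep]) @ vecs[:, keep].T


# Toy example with made-up parameters (not data from the article).
rng = np.random.default_rng(0)
p = 5
a = rng.normal(scale=0.5, size=(p, p))
sigma = (a + a.T) / 2
np.fill_diagonal(sigma, 0.0)
mu = rng.normal(size=p)

x_rest = np.array([1.0, 0.0, 1.0, 1.0])
states, probs = joint_table(mu, sigma)
p_joint = conditional_from_joint(states, probs, x_rest)
p_logit = conditional_logistic(mu, sigma, x_rest)

sigma_r = low_rank(sigma, 2)
```

In this sketch `p_joint` and `p_logit` agree up to floating-point error, which is the sense in which a joint Ising model implies a logistic regression for the outcome given the predictors. Full enumeration is only feasible for a handful of variables, which is one reason an approximation to the connectivity matrix becomes attractive at scale.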
Document type Article
Language English
Published at https://doi.org/10.1007/s41237-017-0024-x
Downloads
10.1007_s41237-017-0024-x (Final published version)