A latent variable model for multivariate discretization

Authors:

Stefano Monti
Intelligent Systems Program
University of Pittsburgh
901M CL
Pittsburgh, PA 15217
E-mail: smonti@isp.pitt.edu
Phone: 412-624-8563
Fax: 412-624-6089

Gregory F. Cooper
Center for Biomedical Informatics
University of Pittsburgh
Suite 8084 Forbes Tower
200 Lothrop Street
Pittsburgh, PA 15213-2582
E-mail: gfc@cbmi.upmc.edu
Phone: 412-647-7113
Fax: 412-647-7190

Abstract:

We describe a new method for multivariate discretization based on a latent variable model. The method is intended to extend the applicability of machine learning algorithms that handle only discrete variables. Building on existing class-based discretization methods, we use a latent variable as a proxy class variable, which then drives the partition of the value range of each continuous variable. We present experimental results on simulated data that assess the merits of the proposed method.
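
The idea of a latent proxy class can be illustrated with a minimal sketch. It rests on assumptions not stated in this abstract: the latent variable model is taken to be a diagonal-covariance Gaussian finite mixture fitted by EM, and the class-based step is reduced to a single entropy-minimizing cut per variable. The function names (`fit_mixture_labels`, `best_cut`) and the toy data are hypothetical and are not the authors' implementation.

```python
import math

def fit_mixture_labels(rows, k, iters=30):
    """Fit a k-component diagonal Gaussian mixture by EM (assumed model)
    and return the most probable component for each row."""
    n, d = len(rows), len(rows[0])
    # Deterministic initialization: means taken from evenly spaced sorted rows.
    order = sorted(rows)
    means = [list(order[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    col_mean = [sum(x[i] for x in rows) / n for i in range(d)]
    col_var = [max(1e-6, sum((x[i] - col_mean[i]) ** 2 for x in rows) / n)
               for i in range(d)]
    var = [col_var[:] for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each latent state given each row.
        resp = []
        for x in rows:
            logp = []
            for j in range(k):
                lp = math.log(weights[j])
                for i in range(d):
                    lp -= 0.5 * math.log(2 * math.pi * var[j][i])
                    lp -= (x[i] - means[j][i]) ** 2 / (2 * var[j][i])
                logp.append(lp)
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = max(nj / n, 1e-12)
            for i in range(d):
                mu = sum(r[j] * x[i] for r, x in zip(resp, rows)) / nj
                means[j][i] = mu
                var[j][i] = max(1e-6, sum(r[j] * (x[i] - mu) ** 2
                                          for r, x in zip(resp, rows)) / nj)
    return [max(range(k), key=lambda j: r[j]) for r in resp]

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return -sum(v / n * math.log2(v / n) for v in counts.values())

def best_cut(values, labels):
    """Cut point that minimizes the weighted class entropy of the two sides
    (a simplified class-based criterion, assumed for illustration)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_score, best_point = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        score = (i * entropy(left) + (n - i) * entropy(right)) / n
        if score < best_score:
            best_score = score
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_point

# Toy data: two well-separated groups in two continuous variables.
rows = [(0.0 + 0.1 * i, 5.0 + 0.1 * i) for i in range(10)] + \
       [(10.0 + 0.1 * i, 20.0 + 0.1 * i) for i in range(10)]

labels = fit_mixture_labels(rows, k=2)          # latent proxy class per row
cuts = [best_cut([x[i] for x in rows], labels)  # one cut per variable
        for i in range(len(rows[0]))]
print(labels, cuts)
```

The inferred mixture component plays the role that an observed class label would play in a supervised discretizer, so every variable is partitioned against the same latent class; this is what makes the procedure multivariate rather than a per-variable binning.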

Keywords:

Class-based discretization, clustering, finite-mixture model, Bayesian network learning

Availability:

PostScript or PDF