Yuhong Guo, Russ Greiner
Bayesian belief nets (BNs) are often used for classification tasks, typically to return the most likely class label for a specified instance. Many BN-learners, however, attempt to find the BN that maximizes a different objective function --- viz., likelihood, rather than classification accuracy --- typically by first using some model selection criterion to identify an appropriate graphical structure, then finding good parameters for that structure. This paper considers a number of possible criteria for selecting the best structure, both generative (i.e., based on likelihood: BIC, BDe) and discriminative (i.e., Conditional BIC (CBIC), resubstitution Classification Error (CE), and Bias^2+Variance (BV)). We empirically compare these criteria against a variety of ``correct BN structures'', both real-world and synthetic, over a range of complexities. We also explore different ways to set the parameters, addressing two issues: (1) Should we seek the parameters that maximize likelihood, or those that maximize conditional likelihood? (2) Should we (i) use the entire training sample both to learn the parameters and to evaluate the models, or (ii) use one partition for parameter estimation and another for evaluation (cross-validation)? Our results show that the discriminative BV model selection criterion is one of the best measures for identifying the optimal structure, while the discriminative CBIC performs poorly; that we should use the parameters that maximize likelihood; and that it is typically better to use cross-validation here.
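To make the generative-scoring idea concrete, the following is a minimal sketch (not the paper's implementation) of BIC-based structure selection for a two-node BN over a binary class C and a binary feature X. It compares two candidate structures --- X dependent on C versus X independent of C --- using smoothed maximum-likelihood parameters. The function names and the toy dataset are illustrative assumptions, not taken from the paper.

```python
import math

def bic_score(loglik, n_params, n_samples):
    """Generative BIC: log-likelihood penalized for complexity (higher is better)."""
    return loglik - 0.5 * n_params * math.log(n_samples)

def structure_bic(data, dependent):
    """BIC of a two-node BN over (C, X) under one of two candidate structures.

    dependent=True  : C -> X  (X's CPT conditions on C; 3 free parameters)
    dependent=False : C    X  (X independent of C; 2 free parameters)
    Parameters are maximum-likelihood estimates with Laplace smoothing.
    """
    n = len(data)
    p_c1 = sum(c for c, _ in data) / n                    # MLE for P(C=1)
    loglik = 0.0
    if dependent:
        p_x1 = {}
        for cv in (0, 1):
            sub = [x for c, x in data if c == cv]
            p_x1[cv] = (sum(sub) + 1) / (len(sub) + 2)    # smoothed P(X=1|C=cv)
        for c, x in data:
            p_c = p_c1 if c == 1 else 1 - p_c1
            p_x = p_x1[c] if x == 1 else 1 - p_x1[c]
            loglik += math.log(p_c * p_x)
        n_params = 3
    else:
        p_x1m = (sum(x for _, x in data) + 1) / (n + 2)   # smoothed marginal P(X=1)
        for c, x in data:
            p_c = p_c1 if c == 1 else 1 - p_c1
            p_x = p_x1m if x == 1 else 1 - p_x1m
            loglik += math.log(p_c * p_x)
        n_params = 2
    return bic_score(loglik, n_params, n)

# Synthetic data in which X strongly tracks C; BIC should prefer the C -> X structure
# despite its extra parameter, since the likelihood gain outweighs the penalty.
data = [(1, 1)] * 40 + [(1, 0)] * 10 + [(0, 0)] * 40 + [(0, 1)] * 10
dep_bic = structure_bic(data, dependent=True)
ind_bic = structure_bic(data, dependent=False)
```

Note that this scores structures by (penalized) likelihood only; the paper's point is precisely that such generative scores need not pick the structure with the best classification accuracy, motivating the discriminative criteria (CE, BV, CBIC) it evaluates.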
Content Area: 12. Machine Learning
Subjects: 12. Machine Learning and Discovery; 3.4 Probabilistic Reasoning
Submitted: May 10, 2005