Harry Zhang, Liangxiao Jiang, Jiang Su
The conditional independence assumption of naive Bayes essentially ignores attribute dependencies and is often violated. On the other hand, although a Bayesian network can represent arbitrary attribute dependencies, learning an optimal Bayesian network from data is intractable. The main reason is that learning the optimal structure of a Bayesian network is extremely time consuming. Thus, a Bayesian model without structure learning is desirable. In this paper, we propose a novel model, called hidden naive Bayes (HNB). In an HNB, a hidden parent is created for each attribute which combines the influences from all other attributes. We present an approach to creating hidden parents using the average of weighted one-dependence estimators. HNB inherits the structural simplicity of naive Bayes and can be easily learned without structure learning. We propose an algorithm for learning HNB based on conditional mutual information. We experimentally test HNB in terms of classification accuracy, using the 36 UCI data sets recommended by Weka, and compare it to naive Bayes, C4.5, SBC, NBTree, CL-TAN, and AODE. The experimental results show that HNB outperforms naive Bayes, C4.5, SBC, NBTree, and CL-TAN, and is competitive with AODE.
Subjects: 12. Machine Learning and Discovery; 3.4 Probabilistic Reasoning
Submitted: May 2, 2005