Michael J. Pazzani
The Bayesian classifier is a simple approach to classification that produces results that are easy for people to interpret. In many cases, it is at least as accurate as much more sophisticated learning algorithms whose results are harder to interpret. Using numeric attributes with the Bayesian classifier often requires discretizing the attribute values into a number of intervals. We show that the discretization of numeric attributes is critical to the successful application of the Bayesian classifier, and we propose a new discretization method based on iterative improvement search. We compare this method to previous approaches and show that it yields significant reductions in misclassification error and cost on an industrial problem: troubleshooting the local loop in a telephone network. The approach can take prior knowledge into account by improving upon a user-provided set of boundary points, or it can operate autonomously.
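To make the idea of improving boundary points by iterative search concrete, the following is a minimal sketch and not the procedure evaluated in the paper. It assumes a single numeric attribute, a fixed step size for shifting each boundary, and leave-one-out error of a Laplace-smoothed Bayesian classifier as the objective. None of these choices is specified in the abstract, and the names `discretize`, `loo_errors`, and `improve_boundaries` are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
import math


def discretize(value, boundaries):
    """Return the index of the interval that `value` falls into."""
    return bisect_right(boundaries, value)


def loo_errors(values, labels, boundaries):
    """Count leave-one-out errors of a one-attribute Bayesian classifier."""
    bins = [discretize(v, boundaries) for v in values]
    classes = sorted(set(labels))
    n_bins = len(boundaries) + 1
    class_counts = Counter(labels)
    joint = Counter(zip(bins, labels))
    errors = 0
    for b, y in zip(bins, labels):
        best, best_score = None, -math.inf
        for c in classes:
            held = 1 if c == y else 0  # remove the held-out example from the counts
            nc = class_counts[c] - held
            nbc = joint[(b, c)] - held
            # log P(c) + log P(interval | c), both with Laplace smoothing
            score = math.log((nc + 1) / (len(labels) - 1 + len(classes)))
            score += math.log((nbc + 1) / (nc + n_bins))
            if score > best_score:
                best, best_score = c, score
        errors += best != y
    return errors


def improve_boundaries(values, labels, boundaries, step=1.0, max_iter=100):
    """Hill-climb: shift one boundary at a time, keep shifts that reduce error."""
    current = sorted(boundaries)
    current_err = loo_errors(values, labels, current)
    for _ in range(max_iter):
        improved = False
        for i in range(len(current)):
            for delta in (-step, step):
                candidate = sorted(current[:i] + [current[i] + delta] + current[i + 1:])
                err = loo_errors(values, labels, candidate)
                if err < current_err:
                    current, current_err, improved = candidate, err, True
        if not improved:
            break  # local optimum: no single shift lowers the error
    return current, current_err


# Toy data (assumed for illustration): class changes between 4 and 5.
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["a"] * 4 + ["b"] * 4
result = improve_boundaries(values, labels, [2.5])  # start from a user-provided boundary
print(result)
```

Starting from a user-supplied boundary mirrors the prior-knowledge mode described above. To run without a starting point, the same function could be seeded with default boundaries such as equal-width cut points, which is again an assumption rather than the paper's stated procedure.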