Peter Cheeseman, Matthew Self, Jim Kelly, Will Taylor, Don Freeman
This paper describes a Bayesian technique for unsupervised classification of data and its computer implementation, AutoClass. Given real-valued or discrete data, AutoClass determines the most probable number of classes present in the data, the most probable descriptions of those classes, and each object’s probability of membership in each class. The program performs as well as or better than other automatic classification systems when run on the same data, and it contains no ad hoc similarity measures or stopping criteria. AutoClass has been applied to several databases in which it has discovered classes representing previously unsuspected phenomena.
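
To make "each object's probability of membership in each class" concrete, the following is a minimal sketch of that one step only, and not the AutoClass implementation. It assumes the class descriptions are already known and that each class is a one-dimensional Gaussian with a prior weight; the function names and the numeric parameters are hypothetical and chosen purely for illustration. Under those assumptions, Bayes' rule gives the membership probability as the class weight times the class likelihood, normalized over all classes.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def membership_probabilities(x, classes):
    """Return P(class | x) for every class via Bayes' rule.

    `classes` is a list of (weight, mean, std) tuples. The weights play
    the role of prior class probabilities and are assumed to sum to 1.
    This is an illustrative helper, not part of AutoClass.
    """
    # Joint term for each class: prior weight times likelihood of x.
    joint = [w * gaussian_pdf(x, m, s) for (w, m, s) in classes]
    # Normalizing by the sum turns the joint terms into posteriors.
    total = sum(joint)
    return [j / total for j in joint]


# Two made-up classes (parameters are assumptions, not from the paper).
classes = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]

# A point midway between the two class means is shared equally.
probs_mid = membership_probabilities(2.0, classes)

# A point at the first class mean belongs almost entirely to that class.
probs_near = membership_probabilities(0.0, classes)
```

Note that every object receives a probability for every class rather than a single hard assignment. Choosing the number of classes and estimating the class descriptions themselves are separate problems that this sketch does not address.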