Kristen L. Mello and Steven D. Brown
A hybrid data exploration and modeling method that combines multi-way recursive partitioning with the probabilistic reasoning of Bayesian networks is presented. The method uses the feature extraction capabilities of recursive partitioning to explore the data and to construct the network. This form of feature extraction can handle real, raw data sets, which typically have many more features (not all of them informative) than samples. The resulting network reasons probabilistically under uncertainty and is both semantically and statistically justified, which gives the user strong predictive ability and an understanding of the domain. The method accommodates continuous and discrete variables, missing data, and non-independent features, and it makes no assumptions about the underlying structure(s) within the data. Given its predictive ability, its data handling and information extraction capabilities, and its statistical and semantic justification, the method could benefit applications such as QSAR, risk assessment, and toxicological evaluation.
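To make the two-stage idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes discrete features only, uses information gain as the splitting criterion, and wires the selected features into a naive-Bayes-shaped network (class node as the sole parent of each feature) with Laplace-smoothed conditional probability tables. All function names, the toy data, and these design choices are hypothetical illustrations; the abstract does not specify the splitting criterion or how the network structure is derived from the partition.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the feature whose multi-way split gives the largest positive
    information gain, or None if no feature improves on the parent node."""
    base, best, best_gain = entropy(labels), None, 0.0
    for f in features:
        groups = defaultdict(list)
        for row, y in zip(rows, labels):
            groups[row[f]].append(y)  # one branch per observed value
        remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        if base - remainder > best_gain + 1e-12:
            best, best_gain = f, base - remainder
    return best

def select_features(rows, labels, features):
    """Recursively partition the data and collect every feature used in a split."""
    if len(set(labels)) <= 1 or not features:
        return set()
    f = best_split(rows, labels, features)
    if f is None:
        return set()
    used = {f}
    remaining = [g for g in features if g != f]
    parts = defaultdict(lambda: ([], []))
    for row, y in zip(rows, labels):
        parts[row[f]][0].append(row)
        parts[row[f]][1].append(y)
    for sub_rows, sub_labels in parts.values():
        used |= select_features(sub_rows, sub_labels, remaining)
    return used

def fit_network(rows, labels, selected):
    """Count-based tables for an ASSUMED structure: class -> each selected feature."""
    counts = {f: defaultdict(Counter) for f in selected}
    for row, y in zip(rows, labels):
        for f in selected:
            counts[f][y][row[f]] += 1
    return {
        "classes": sorted(set(labels)),
        "prior": Counter(labels),
        "n": len(labels),
        "values": {f: sorted({r[f] for r in rows}) for f in selected},
        "counts": counts,
    }

def posterior(model, row):
    """P(class | observed features). A feature whose value is None is treated
    as missing and marginalized out, which here means skipping its factor."""
    scores = {}
    for c in model["classes"]:
        logp = math.log(model["prior"][c] / model["n"])
        for f, vals in model["values"].items():
            if row.get(f) is None:
                continue
            num = model["counts"][f][c][row[f]] + 1   # Laplace smoothing
            den = model["prior"][c] + len(vals)
            logp += math.log(num / den)
        scores[c] = logp
    top = max(scores.values())
    z = sum(math.exp(s - top) for s in scores.values())
    return {c: math.exp(s - top) / z for c, s in scores.items()}

# Toy data (hypothetical): "a" determines the label, "noise" carries no information.
rows = [
    {"a": "hi", "noise": "p"},
    {"a": "hi", "noise": "q"},
    {"a": "lo", "noise": "p"},
    {"a": "lo", "noise": "q"},
]
labels = ["toxic", "toxic", "safe", "safe"]

selected = select_features(rows, labels, ["a", "noise"])
model = fit_network(rows, labels, selected)
print(selected)
print(posterior(model, {"a": "hi"}))
print(posterior(model, {"a": None}))  # missing value falls back to the class prior
```

In this sketch the partitioning stage discards the uninformative feature before any probabilities are estimated, which is the sense in which recursive partitioning can cope with data sets that have more features than samples. Continuous variables and dependencies among features, both of which the abstract says the method handles, are outside the scope of this simplified structure.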