AAAI Publications, Twenty-Seventh AAAI Conference on Artificial Intelligence

An Effective Approach for Imbalanced Classification: Unevenly Balanced Bagging
Guohua Liang, Anthony G. Cohn

Last modified: 2013-06-29

Abstract


Learning from imbalanced data is an important problem in data mining research. Much research has addressed it by using sampling methods to generate an equally balanced training set and thereby improve the performance of prediction models, but it remains unclear what class distribution ratio is best for training a prediction model. Bagging is one of the most popular and effective ensemble learning methods for improving the performance of prediction models; however, it has a major drawback on extremely imbalanced data sets. It is also unclear under which conditions bagging is outperformed by other sampling schemes in imbalanced classification. These issues motivate us to propose a novel approach, unevenly balanced bagging (UBagging), to boost the performance of prediction models for imbalanced binary classification. Our experimental results demonstrate that UBagging is effective and statistically significantly superior to a single J48 decision tree learner (SingleJ48), bagging, and equally balanced bagging (BBagging) on 32 imbalanced data sets.
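
The abstract does not specify how the bags are constructed, so the following is only an illustrative sketch of the general idea of controlling the class ratio inside each bootstrap sample, not the authors' algorithm. It assumes binary labels, sampling with replacement, and a hypothetical per-bag parameter `ratio` (majority draws per minority draw). A ratio of 1.0 in every bag corresponds to equally balanced sampling; varying the ratio across bags is one possible reading of "unevenly balanced". The names `sample_bag` and `make_bags` and the ratio schedule are assumptions made for illustration.

```python
import random
from collections import Counter


def sample_bag(labels, minority, ratio=1.0, rng=None):
    """Draw bootstrap indices: n_min minority draws, about ratio * n_min majority draws."""
    if rng is None:
        rng = random.Random()
    min_idx = [i for i, y in enumerate(labels) if y == minority]
    maj_idx = [i for i, y in enumerate(labels) if y != minority]
    n_min = len(min_idx)
    n_maj = max(1, round(ratio * n_min))
    # Sample each class separately, with replacement, as in ordinary bagging.
    bag = [rng.choice(min_idx) for _ in range(n_min)]
    bag += [rng.choice(maj_idx) for _ in range(n_maj)]
    return bag


def make_bags(labels, minority, ratios, seed=0):
    """Build one bag per entry in ratios; each bag would train one base learner."""
    rng = random.Random(seed)
    return [sample_bag(labels, minority, r, rng) for r in ratios]


# Toy data: 10 minority examples (label 1) and 90 majority examples (label 0).
labels = [1] * 10 + [0] * 90

# Equally balanced bags: every bag uses a 1:1 class ratio.
equal_bags = make_bags(labels, minority=1, ratios=[1.0] * 4)

# Unevenly balanced bags (hypothetical schedule): the ratio differs per bag.
uneven_bags = make_bags(labels, minority=1, ratios=[0.5, 1.0, 1.5, 2.0])

for bag in uneven_bags:
    print(Counter(labels[i] for i in bag))
```

In a full ensemble, each bag would be used to train one base classifier (a decision tree, for instance) and the predictions would be combined by majority vote, as in standard bagging.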
