AAAI Publications, Twenty-Sixth AAAI Conference on Artificial Intelligence

Margin-Based Feature Selection in Incomplete Data
Qiang Lou, Zoran Obradovic

Last modified: 2012-07-14


This study considers the problem of feature selection in incomplete data. The intuitive approach is to first impute the missing values and then apply a standard feature selection method to select relevant features. In this study, we show how to perform feature selection directly, without imputing missing values. We define the objective function of the uncertainty margin-based feature selection method so as to maximize each instance’s uncertainty margin in its own relevant subspace. During optimization, we take into account the uncertainty of each instance due to its missing values. Experimental results on synthetic data and six benchmark data sets with few missing values (less than 25%) provide evidence that our method selects features as accurately as alternative methods that apply imputation first. However, when a large fraction of values is missing (more than 25%), our feature selection method outperforms the alternatives that impute missing values first.
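
To make the idea of a margin computed without imputation concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a Relief-style hypothesis margin (distance to the nearest instance of another class minus distance to the nearest instance of the same class) and a weighted distance restricted to features observed in both instances. The names `masked_distance` and `hypothesis_margin`, the toy data, and the convention that missing values are encoded as NaN are all illustrative assumptions; the uncertainty modeling described in the abstract is not reproduced here.

```python
import numpy as np

def masked_distance(x, z, w):
    """Weighted L1 distance using only features observed in both x and z.

    Missing values are assumed to be encoded as NaN (illustrative convention).
    Returns inf when the two instances share no feature with nonzero weight.
    """
    both = ~np.isnan(x) & ~np.isnan(z)
    total_weight = w[both].sum()
    if total_weight == 0:
        return np.inf
    return float((w[both] * np.abs(x[both] - z[both])).sum() / total_weight)

def hypothesis_margin(X, y, i, w):
    """Nearest-miss distance minus nearest-hit distance for instance i."""
    d = np.array([masked_distance(X[i], X[j], w) for j in range(len(X))])
    d[i] = np.inf                      # never match an instance with itself
    same = y == y[i]
    return d[~same].min() - d[same].min()

# Toy data: feature 0 separates the classes, feature 1 does not.
X = np.array([
    [0.0, 5.0],
    [0.1, np.nan],
    [np.nan, 4.0],
    [1.0, 5.1],
    [0.9, np.nan],
    [np.nan, 4.1],
])
y = np.array([0, 0, 0, 1, 1, 1])

# Margin of instance 0 when all weight is on feature 0 versus feature 1.
margin_f0 = hypothesis_margin(X, y, 0, np.array([1.0, 0.0]))
margin_f1 = hypothesis_margin(X, y, 0, np.array([0.0, 1.0]))
print(margin_f0, margin_f1)
```

On this toy data, putting all weight on feature 0 gives instance 0 a positive margin, while putting all weight on feature 1 gives it a negative one, so a weight search that maximizes the margin would prefer feature 0 without any value having been imputed. The sketch does not handle instances whose nearest hit and nearest miss are both undefined.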


Keywords: feature selection; incomplete data
