Marc Sebban, West Indies and Guiana University
We focus on the filter approach to feature selection. We exploit geometrical characteristics of the learning set to build an estimation criterion based on a quadratic entropy. The distribution of this criterion is approximately normal, which allows the construction of a nonparametric statistical test to assess the relevance of feature subsets. We use the critical threshold of this test, called the test of Relative Certainty Gain, in a forward selection algorithm. We present experimental results on both synthetic and natural domains from the UCI database repository, which show significant improvements in the accuracy estimates.
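
To make the overall idea concrete, the following is a minimal sketch of a filter-style forward selection driven by a quadratic-entropy criterion. It is not the procedure of this paper: the k-nearest-neighbour neighbourhood, the fixed `min_gain` cutoff (standing in for the critical threshold of the statistical test), and the function names `quadratic_entropy`, `certainty_gain` and `forward_select` are all assumptions made for illustration.

```python
from collections import Counter


def quadratic_entropy(labels):
    """Quadratic (Gini-style) entropy: sum over classes of p * (1 - p)."""
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def sq_dist(a, b, feats):
    """Squared Euclidean distance restricted to the feature indices in feats."""
    return sum((a[f] - b[f]) ** 2 for f in feats)


def certainty_gain(X, y, feats, k=3):
    """Relative drop from the prior entropy to the mean local entropy.

    ASSUMPTION: the local neighbourhood of a point is the point itself plus
    its k nearest neighbours in the candidate subspace (ties broken by index).
    """
    prior = quadratic_entropy(y)
    if prior == 0:
        return 0.0
    local = 0.0
    for i, xi in enumerate(X):
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: (sq_dist(xi, X[j], feats), j),
        )[:k]
        local += quadratic_entropy([y[i]] + [y[j] for j in others])
    return (prior - local / len(X)) / prior


def forward_select(X, y, k=3, min_gain=0.01):
    """Greedily add the feature that most increases the gain.

    ASSUMPTION: min_gain is a hypothetical fixed cutoff used here in place of
    the critical threshold of a statistical test.
    """
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scores = {f: certainty_gain(X, y, selected + [f], k) for f in remaining}
        f = max(remaining, key=lambda g: scores[g])
        if scores[f] - best < min_gain:
            break  # no candidate improves the criterion enough
        selected.append(f)
        remaining.remove(f)
        best = scores[f]
    return selected


# Toy data: feature 0 separates the two classes, feature 1 is noise.
X = [[0.0, 5], [0.1, 1], [0.2, 4], [0.3, 2],
     [1.0, 3], [1.1, 5], [1.2, 1], [1.3, 4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
print(forward_select(X, y))  # → [0]
```

In this toy run only the discriminating feature is retained, because adding the noisy feature does not raise the gain by at least `min_gain`.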