A Probabilistic Derivation of LASSO and L12-Norm Feature Selections
LASSO and ℓ2,1-norm based feature selection have achieved success in many application areas. In this paper, we first derive LASSO and ℓ1,2-norm feature selection from a probabilistic framework, which provides a point of view independent of the usual sparse coding one. Building on this, we propose a feature selection approach based on the probability-derived ℓ1,2-norm. We point out an inflexibility in standard feature selection: with the widely used ℓ2,1-norm, which enforces joint sparsity across all data instances, the features selected for the different classes are forced to be exactly the same. The probability-derived ℓ1,2-norm feature selection allows some flexibility, in that the selected features do not have to be exactly the same for all classes, and the resulting features lead to better classification on six benchmark datasets.