Haifeng Li, Keshu Zhang, Tao Jiang
The EM algorithm relies heavily on interpreting observations as incomplete data, but it has no control over the uncertainty of the missing data. To reduce this uncertainty, we present a regularized EM algorithm that penalizes the likelihood with the mutual information between the missing data and the incomplete data (or the conditional entropy of the missing data given the observations). The proposed method retains the advantages of the conventional EM algorithm, such as reliable global convergence, low cost per iteration, economy of storage, and ease of programming. We also apply the regularized EM algorithm to fit finite mixture models. Our theoretical analysis and experiments show that the new method can fit the models efficiently and can effectively simplify over-complicated models.
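To make the penalized objective concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a one-dimensional Gaussian mixture, treats the component label of each observation as the missing data, and subtracts a weight times the conditional entropy of the labels given the observations from the log-likelihood. The function names, the parameter `gamma`, and the use of the summed per-observation posterior entropy are illustrative assumptions not taken from the abstract.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def penalized_log_likelihood(data, weights, means, stds, gamma):
    """Return (log_likelihood, conditional_entropy, penalized_objective).

    Hypothetical helper: `gamma` is an assumed regularization weight, and the
    conditional entropy is the sum over observations of the entropy of the
    posterior over component labels. Assumes every observation has nonzero
    density under the mixture.
    """
    log_lik = 0.0
    cond_entropy = 0.0
    for x in data:
        # Joint density of the observation and each component label.
        joint = [w * gaussian_pdf(x, m, s)
                 for w, m, s in zip(weights, means, stds)]
        total = sum(joint)
        log_lik += math.log(total)
        # Posterior over labels (the usual E-step responsibilities).
        for j in joint:
            r = j / total
            if r > 0.0:
                cond_entropy -= r * math.log(r)
    # Penalize uncertainty about the missing labels.
    objective = log_lik - gamma * cond_entropy
    return log_lik, cond_entropy, objective


data = [-2.1, -1.9, 2.0, 2.2]
separated = penalized_log_likelihood(data, [0.5, 0.5], [-2.0, 2.0], [0.5, 0.5], 1.0)
overlapping = penalized_log_likelihood(data, [0.5, 0.5], [-0.5, 0.5], [2.0, 2.0], 1.0)
```

In this sketch, well-separated components give posteriors that are nearly one-hot, so the entropy term is close to zero, while heavily overlapping components incur a larger penalty. Setting `gamma` to zero recovers the ordinary log-likelihood.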
Content Area: 12. Machine Learning
Subjects: 12. Machine Learning and Discovery
Submitted: May 6, 2005