AAAI Publications, Twenty-Eighth AAAI Conference on Artificial Intelligence

Semi-Supervised Matrix Completion for Cross-Lingual Text Classification
Min Xiao, Yuhong Guo

Last modified: 2014-06-21

Abstract


Cross-lingual text classification is the task of assigning labels to documents in a label-scarce target language domain by using a prediction model trained on labeled documents from a label-rich source language domain. It is widely studied in natural language processing as a way to reduce the expensive manual annotation effort required in the target language domain. In this work, we propose a novel semi-supervised representation learning approach that addresses this challenging task by inducing interlingual features via semi-supervised matrix completion. To evaluate the proposed learning technique, we conduct extensive experiments on eighteen cross-language sentiment classification tasks with four different languages. The empirical results demonstrate the efficacy of the proposed approach and show that it outperforms a number of related cross-lingual learning methods.

Keywords


cross-lingual text classification
