AAAI Publications, Thirtieth AAAI Conference on Artificial Intelligence

Cross-Lingual Taxonomy Alignment with Bilingual Biterm Topic Model
Tianxing Wu, Guilin Qi, Haofen Wang, Kang Xu, Xuan Cui

Last modified: 2016-02-21


As more multilingual knowledge becomes available on the Web, sharing knowledge across languages has become an important task that benefits many applications. One of the most crucial kinds of knowledge on the Web is the taxonomy, which is used to organize and classify Web data. To facilitate knowledge sharing across languages, we need to address the problem of cross-lingual taxonomy alignment: for each category in the source taxonomy of one language, discover the most relevant category in the target taxonomy of another language. Current approaches to aligning cross-lingual taxonomies rely heavily on domain-specific information and on features based on string similarities. In this paper, we present a new approach to cross-lingual taxonomy alignment that does not use any domain-specific information. We first identify, for each category in the source taxonomy, the candidate matched categories in the target taxonomy using cross-lingual string similarity. We then propose a novel bilingual topic model, called Bilingual Biterm Topic Model (BiBTM), to perform exact matching. BiBTM is trained on textual contexts extracted from the Web. We conduct experiments on two kinds of real-world datasets. The experimental results show that our approach significantly outperforms the state-of-the-art methods designed for comparison.
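
The candidate identification step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `translate` callable, the toy dictionary, the use of Python's `difflib` ratio as the string similarity, and the `top_k` and `threshold` parameters. The abstract does not specify the similarity measure, and the BiBTM matching step is not shown.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1].

    difflib's ratio is a stand-in; the paper's actual measure is not
    given in the abstract.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_categories(source_label, target_labels, translate,
                         top_k=3, threshold=0.5):
    """Return up to top_k (target_label, score) pairs for one source category.

    `translate` is a hypothetical callable that maps a source-language
    label into the target language before strings are compared.
    """
    translated = translate(source_label)
    scored = [(t, string_similarity(translated, t)) for t in target_labels]
    kept = [(t, s) for t, s in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Hypothetical toy dictionary standing in for a real translation resource.
TOY_DICTIONARY = {"手机": "mobile phone", "笔记本电脑": "laptop computer"}


def toy_translate(label: str) -> str:
    # Fall back to the untranslated label when the dictionary has no entry.
    return TOY_DICTIONARY.get(label, label)


targets = ["Mobile Phones", "Laptop Computers", "Kitchen Appliances"]
candidates = candidate_categories("手机", targets, toy_translate)
```

In this sketch the candidate list only narrows the search space; a second-stage model such as BiBTM would then choose the final match among the candidates.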


Cross-lingual Taxonomy Alignment; Bilingual Biterm Topic Model; Vector Similarities
