AAAI Publications, Workshops at the Thirty-Second AAAI Conference on Artificial Intelligence

Font Size: 
Topical Phrase Extraction from Clinical Reports by Incorporating both Local and Global Context
Gabriele Pergola, Yulan He, David Lowe

Last modified: 2018-06-20


Making sense of words often requires to simultaneously examine the surrounding context of a term as well as the global themes characterizing the overall corpus. Several topic models have already exploited word embeddings to recognize local context, however, it has been weakly combined with the global context during the topic inference. This paper proposes to extract topical phrases corroborating the word embedding information with the global context detected by Latent Semantic Analysis, and then combine them by means of the Polya urn model. To highlight the effectiveness of this combined approach the model was assessed analyzing clinical reports, a challenging scenario characterized by technical jargon and a limited word statistics available. Results show it outperforms the state-of-the-art approaches in terms of both topic coherence and computational cost.


Topic modeling; phrase extraction; polya urn model; medical documents

Full Text: PDF