AAAI Publications, 2013 AAAI Spring Symposium Series

Modeling Microtext with Higher Order Learning
Christie L. Nelson, Hannah Keiler, William M. Pottenger

Last modified: 2013-03-15

Abstract


Processing data manually is especially problematic during a natural disaster, when aid and response are urgently needed. In real-time scenarios, a difficult yet important problem is obtaining an accurate picture of needs from streaming data in a short time. When the streaming data includes microtext, this problem becomes even more challenging, and in emergency response, modeling microtext in real time is especially important. Once messages have been classified and/or topics learned, emergency responders can use the predicted categories and/or topics to respond rapidly to needs. In this effort, microtext from social media and text messages sent during the 2010 Haitian earthquake was modeled using novel machine learning algorithms: Higher-Order Naïve Bayes (HONB) and Higher-Order Latent Dirichlet Allocation (HO-LDA). Both illustrate that Higher-Order Learning can be valuable in classifying text data. Higher-Order Learning improves model generalization in online or real-time scenarios when smaller amounts of data are available for learning. Results from this research are promising: when using samples of training data, the HONB classifier outperformed Naïve Bayes by a statistically significant margin in all trials, based on the accuracy metric. Promising results were also obtained in the comparison of HO-LDA with traditional Latent Dirichlet Allocation.

Keywords


Higher Order Learning, microtext, LDA, Naive Bayes, Higher Order Latent Dirichlet Allocation, Higher Order Naive Bayes, Haitian earthquake
