Tackling the Poor Assumptions of Naive Bayes Text Classifiers

Authors

Jason D. Rennie

Lawrence Shih

Jaime Teevan

David Karger

Proceedings:

Proceedings of the Twentieth International Conference on Machine Learning

Abstract:

Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely affect the quality of its results. In this paper we propose simple, heuristic solutions to some of the problems with Naive Bayes classifiers, addressing both systemic issues and problems that arise because text is not actually generated according to a multinomial model. We find that our simple corrections result in a fast algorithm that is competitive with state-of-the-art text classification algorithms such as the Support Vector Machine.
