Using Chinese Glyphs for Named Entity Recognition (Student Abstract)

Authors

  • Chan Hee Song, University of Notre Dame
  • Arijit Sehanobish, Yale University

DOI:

https://doi.org/10.1609/aaai.v34i10.7233

Abstract

Most Named Entity Recognition (NER) systems use additional features such as part-of-speech (POS) tags, shallow parsing, and gazetteers. Adding these external features to NER systems has been shown to have a positive impact. However, creating gazetteers or taggers can take a lot of time and may require extensive data cleaning. In this work, instead of using these traditional features, we use lexicographic features of Chinese characters. Chinese characters are composed of graphical components called radicals, and these components often carry semantic indicators. We propose CNN-based models that incorporate this semantic information and use it for NER. Our models show an improvement over the baseline BERT-BiLSTM-CRF model. We present one of the first studies on Chinese OntoNotes v5.0 and show an improvement of +0.64 F1 score over the baseline. We present a state-of-the-art (SOTA) F1 score of 71.81 on the Weibo dataset, show a competitive improvement of +0.72 over the baseline on the ResumeNER dataset, and report a SOTA F1 score of 96.49 on the MSRA dataset.

Published

2020-04-03

How to Cite

Song, C. H., & Sehanobish, A. (2020). Using Chinese Glyphs for Named Entity Recognition (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 34(10), 13921-13922. https://doi.org/10.1609/aaai.v34i10.7233

Section

Student Abstract Track