Machine Learning for Information Extraction
Papers from the AAAI Workshop
Mary Elaine Califf, Chair
The dramatic growth in the number and size of on-line textual information sources has fueled increasing research interest in the information extraction (IE) problem. Given a set of text documents from some domain, an IE system automatically populates a predefined database by extracting relevant fragments from the documents. Manually constructed IE systems cannot adapt to domain changes on their own and must be re-engineered by hand for each new problem domain. Consequently, various machine learning (ML) techniques--symbolic learning, inductive logic programming, wrapper induction, statistical methods, and grammar induction--have recently been applied to the IE problem. This research has led to systems that automatically learn to perform IE for several genres--newswire articles, medical texts, Web pages, and Usenet posts. The purpose of this workshop is to provide a forum for exploring the commonality underlying this diversity of problem domains and approaches.