Oren Glickman and Rosie Jones
All components of a typical information extraction (IE) system have been the subject of machine learning research, motivated by the need to reduce the time taken to port a system to a new domain. In this paper we survey such methods and assess to what extent they can help create a complete IE system that is easily adapted to new domains. We also lay out a general prescription for building an IE system in a new domain, employing existing components and technologies where possible. The goal is a system that can be adapted to a new domain with minimal human intervention, for example by someone who is a domain expert but need not be a computational linguist. We propose research directions for automating the process further: reducing the need for hand-tagged training data by relying on biases intrinsic to the information extraction task, and employing bootstrapping and active learning.