Paul Jacobs, Uri Zernik
Language acquisition addresses two important text-processing issues. The immediate problem is understanding a text despite lexical gaps. The long-term issue is that the understander must incorporate new words into its lexicon for future use. This paper describes an approach that constructs new lexical entries gradually by analyzing a sequence of example texts. The approach tolerates new words gracefully while enabling automated extension of the lexicon. Each newly acquired lexeme starts as a set of assumptions derived from analyzing the word in its textual context. A variety of knowledge sources, including morphological, syntactic, semantic, and contextual knowledge, determine these assumptions. The assumptions, along with their justifications and dependencies, are interpreted and refined by a learning program that ultimately updates the system’s lexicon. The approach uses existing linguistic knowledge, together with generalization over multiple occurrences, to create new operational lexical entries.
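The idea of a lexeme that begins as a set of justified assumptions and is narrowed over several example texts can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the system described here: the names `Assumption`, `CandidateEntry`, `observe`, and `refine`, the feature labels, the invented word "blicked", and the simple rule that a hypothesized value survives only if every occurrence constraining that feature supports it.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Assumption:
    """One hypothesis about an unknown word (all field names are illustrative)."""
    feature: str        # e.g. "category" or "sense"
    value: str          # hypothesized value for that feature
    source: str         # knowledge source: morphological, syntactic, ...
    justification: str  # the evidence in the text that the hypothesis rests on


@dataclass
class CandidateEntry:
    """A lexical entry under construction, stored as assumptions per occurrence."""
    word: str
    occurrences: list = field(default_factory=list)

    def observe(self, assumptions):
        """Record the assumptions derived from one occurrence of the word."""
        self.occurrences.append(list(assumptions))

    def refine(self):
        """Keep, per feature, only the values supported by every occurrence
        that says anything about that feature (a deliberately simple rule)."""
        surviving = {}
        for assumptions in self.occurrences:
            by_feature = defaultdict(set)
            for a in assumptions:
                by_feature[a.feature].add(a.value)
            for feature, values in by_feature.items():
                if feature in surviving:
                    surviving[feature] &= values
                else:
                    surviving[feature] = set(values)
        return surviving


# Hypothetical example: "blicked" is an invented word, not data from the paper.
entry = CandidateEntry("blicked")
entry.observe([
    Assumption("category", "verb", "morphological", "suffix -ed"),
    Assumption("category", "adjective", "morphological", "suffix -ed"),
])
entry.observe([
    Assumption("category", "verb", "syntactic", "follows a subject noun phrase"),
])
print(entry.refine())  # {'category': {'verb'}}
```

Each assumption keeps its source and justification, so a fuller learner could retract a hypothesis when its supporting evidence is withdrawn. The sketch only intersects values across occurrences and does not model dependencies between assumptions.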