Marco Ernandes, Giovanni Angelini, Marco Gori, Leonardo Rigutini, Franco Scarselli
Term weighting systems are of crucial importance in Information Extraction and Information Retrieval applications. Common approaches to term weighting are based on either statistical or natural language analysis. In this paper, we present a new algorithm that capitalizes on the advantages of both strategies by adopting a machine learning approach. In the proposed method, the weights are computed by a parametric function, called the Context Function, that models the semantic influence that terms in the same context exert on one another. The Context Function is learned from examples, allowing the use of statistical and linguistic information at the same time. The novel algorithm was successfully tested on crossword clues, which represent a case of Single-Word Question Answering.
Subjects: 12. Machine Learning and Discovery; 13. Natural Language Processing
Submitted: Oct 11, 2006