Paula Buttery and Ted Briscoe
The aim of this research is to investigate the process of grammatical acquisition from real data. In this paper we address the issue of errors. We discuss the definition of an error as well as potential causes of error with reference to real language learning situations. We argue that purely deterministic learning methods cannot be robust to errors, and we introduce a learning system that employs statistical error handling. The implemented learner is composed of three modules: a semantics learning module, a syntax learning module, and a Universal Grammar module based on Chomsky’s Principles and Parameters theory. The learner receives input from a corpus of child-directed sentences annotated with logical forms. This corpus, the annotated Sachs corpus, replicates to some extent the language environment to which a child is exposed. We investigate errors introduced by indeterminacy in the meaning of an utterance (i.e. misclassification of word meaning) and errors introduced by indeterminacy in parameter setting. We demonstrate by simulation that a learning system can be robust to these types of error, as they occur in real child-directed speech, if statistical error handling methods are employed.