Julia Hirschberg and Christine H. Nakatani
The intonational phrase is hypothesized to be a meaningful unit of analysis in spoken language interpretation. We present results on identifying intonational phrase boundaries from acoustic features using classification and regression trees (CART). Our training and test data are drawn from the Boston Directions Corpus (task-oriented monologue) and the HUB-IV Broadcast News database (monologue and multi-party speech). Our goal is twofold: (1) to provide intonational phrase segmentation as a front end for an ASR engine, and (2) to infer topic structure from acoustic-prosodic features. Both efforts aim to make retrieving and browsing speech documents in a large audio database easier and more flexible.
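
To make the classification step concrete, the following is a minimal sketch of a CART-style binary classifier that labels candidate word boundaries as phrase-final or not. It is an illustration under stated assumptions, not the system evaluated here: the two features (pause duration in seconds and F0 drop in Hz), the toy values, and the function names `gini`, `best_split`, `build_tree`, and `predict` are all hypothetical, and the data are synthetic rather than drawn from either corpus.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature_index, threshold) that most reduces weighted
    Gini impurity, or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a tree recursively. Internal nodes are tuples
    (feature, threshold, left_subtree, right_subtree); leaves are labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f,
        t,
        build_tree([rows[i] for i in left], [labels[i] for i in left],
                   depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right],
                   depth + 1, max_depth),
    )


def predict(tree, row):
    """Follow splits from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Synthetic toy data (assumption): [pause_sec, f0_drop_hz] per candidate
# boundary; label 1 = intonational phrase boundary, 0 = no boundary.
rows = [
    [0.00, 5.0], [0.02, 12.0], [0.05, 8.0], [0.03, 30.0],
    [0.30, 25.0], [0.45, 40.0], [0.25, 10.0], [0.60, 35.0],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

tree = build_tree(rows, labels)
```

On this toy set a single split on pause duration separates the two classes, so `predict(tree, [0.40, 20.0])` falls in the boundary leaf. Real acoustic-prosodic data would call for many more features, held-out evaluation, and pruning, none of which this sketch attempts.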