Using Machine Learning to Identify Intonational Segments

Julia Hirschberg and Christine H. Nakatani

The intonational phrase is hypothesized to represent a meaningful unit of analysis in spoken language interpretation. We present results on the identification of intonational phrase boundaries from acoustic features using classification and regression trees (CART). Our training and test data are taken from the Boston Directions Corpus (task-oriented monologue) and the HUB-IV Broadcast News database (monologue and multi-party). Our goal is two-fold: (1) to provide intonational phrase segmentation as a front end for an ASR engine, and (2) to infer topic structure from acoustic-prosodic features. These efforts are aimed at improving the ease and flexibility of retrieving and browsing speech documents from a large audio database.
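
As a rough illustration of the CART approach named above, the following is a minimal sketch of a binary classification tree that labels each word-final position as a phrase boundary or not. It is not the authors' implementation. The abstract does not list the acoustic features used, so the two features here (following pause duration in seconds and f0 drop in Hz) are assumptions, and the eight training rows are invented toy values, not data from the Boston Directions Corpus or HUB-IV.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a tree as nested (feature, threshold, left, right) tuples; leaves are labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical toy data: (pause_sec, f0_drop_hz) per word-final position.
rows = [(0.02, 5), (0.05, -3), (0.01, 10), (0.40, 30),
        (0.35, 25), (0.60, 45), (0.03, 28), (0.30, 4)]
labels = ["no_boundary", "no_boundary", "no_boundary", "boundary",
          "boundary", "boundary", "no_boundary", "boundary"]

tree = build_tree(rows, labels)
print(predict(tree, (0.50, 10)))
print(predict(tree, (0.02, 40)))
```

On this toy set a single split on pause duration separates the two classes, so the tree stops after one level. Real acoustic-prosodic data would need more features, deeper trees, and pruning or cross-validation to avoid overfitting.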

