Mark G. Core and Lenhart K. Schubert
Most dialog systems ignore the problem of speech repairs and editing terms (um, uh, etc.) or use preprocessing techniques to eliminate them from the input. These systems also typically enforce a strict turn-taking protocol that does not allow speakers to interrupt each other. This paper describes a parser that can process input containing editing terms, speech repairs, and second speaker interruptions, and include these phenomena in its output. Such a parser allows a dialog system to reason about why editing terms were uttered: the speaker may have been uncertain, embarrassed, reluctant to commit, etc. The reparandum (the corrected material in a speech repair) also plays an important role because it may be referenced later, as in "take the oranges to Elmira uh I mean take them to Corning", where "them" refers back to the oranges mentioned in the reparandum. Reparanda may also give insight into the speaker’s intentions, as in "pick up tankers in uh how many cars can an engine pull?". Second speaker interruptions can provide evidence that the interrupter is listening (if they utter a backchannel such as uh-huh) or that neither speaker is hearing the other (if both speakers are talking at the same time). This type of evidence is crucial for applications such as business meeting summarization.