Wlodek Zadrozny, Catherine Wolf, Nanda Kambhatla and Yiming Ye, IBM T. J. Watson Research Center
We have built a set of integrated AI systems, called conversation machines, that enable transaction processing over the telephone in limited domains such as stock trading and banking. The conversation machines integrate state-of-the-art technologies from computer telephony, continuous speech recognition, natural language processing, and human-computer interaction. Users interact with these systems in natural language to process simple transactions. We are currently installing a prototype conversation machine at a customer site (a large bank) while continuing research on each of the modules mentioned above and on their integration. In this paper, we describe the architecture of conversation machines and explain the design choices related to natural language dialog design, speech recognition errors, and human-computer interaction. We also discuss our experience with the new "market-driven research" methodology currently being tested at our company, of which the conversation machines project is an example. Our experience suggests that, with this methodology, we can build integrated natural language dialog systems even when working with error-prone recognition engines and imperfect grammars, by designing the dialog flow to reduce the likelihood of errors and to enable quick error recovery. In this process, having a customer allows us to make more realistic design choices.
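
As an illustration of a dialog flow designed around recognition errors, the following minimal sketch shows one common pattern: restrict each dialog step to a small set of expected answers, and choose between accepting, confirming, or reprompting based on the recognizer's confidence. It is not the conversation machines' actual implementation; the names `Hypothesis` and `next_action` and both threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One recognition result (hypothetical structure, for illustration only)."""
    text: str
    confidence: float  # assumed to lie in [0, 1]


# Assumed thresholds; a real system would tune these on field data.
ACCEPT_THRESHOLD = 0.85
CONFIRM_THRESHOLD = 0.50


def next_action(hyp: Hypothesis, allowed: set) -> str:
    """Decide how the dialog should proceed for one user utterance.

    Returns "accept", "confirm", or "reprompt".
    """
    value = hyp.text.strip().lower()
    # Limiting each step to a few expected values reduces the chance of errors.
    if value not in allowed or hyp.confidence < CONFIRM_THRESHOLD:
        return "reprompt"
    # Medium confidence: ask a yes/no question so a mistake is cheap to fix.
    if hyp.confidence < ACCEPT_THRESHOLD:
        return "confirm"
    return "accept"


if __name__ == "__main__":
    actions = {"buy", "sell"}
    for hyp in (Hypothesis("Buy", 0.95), Hypothesis("sell", 0.60),
                Hypothesis("sell", 0.30), Hypothesis("hold", 0.99)):
        print(hyp.text, hyp.confidence, next_action(hyp, actions))
```

In this sketch, the explicit confirmation step trades a slightly longer dialog for quick error recovery: a misrecognized transaction is caught by a yes/no question before it is committed.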