Aurally Informed Performance: Integrating Machine Listening and Auditory Presentation in Robotic Systems: Papers from the AAAI Fall Symposium
Derek Brock, Ramani Duraiswami, and Alexander I. Rudnicky, Cochairs
Robots designed to function as appliances and human surrogates in public and private settings are already moving from research projects to fully deployed systems. In keeping with the goals of intuitive human-robot interaction, many of these platforms incorporate rudimentary speech communication interfaces, and others are engineered for specific types of listening tasks. Even so, aurally informed behaviors in robots, and their integration with other perceptual and reasoning systems, remain far behind the broad and mostly transparent skills of human beings. Part of the problem is that while much is known about the human physiology of listening, much less is understood about how conceptually bounded information is extracted from the mixtures of sounds that are typically present in interactive settings. This is the problem of auditory scene analysis—how people make sense of what they hear. Just as people do, robots must be able to determine the location and type of sound sources, and they must associate sounds with their causes and with events. When interacting with people, robots must be able to converse on the basis of what they hear and see, and may even have additional, nonspeech auditory display functions ranging from alerting to the playback of captured sounds. Social settings also raise practical performance issues for robots, such as being interrupted while speaking, excessive ambient noise or quiet, the user's physical listening distance, the acceptability of being overheard or of disturbing others, and so on.
This symposium gathered together researchers in machine listening, speech systems, and general robotics, as well as those in other disciplines, including AI, neuroscience, and the cognitive and social sciences, to share results, positions, and insights across disciplinary boundaries. Topics of concern included challenges in robotic audition, auditory presentation, and the integration of these functions with other sensory and processing systems in the context of human-robot interaction and the auditory needs and preferences of users.