- SPEECH -

October 25, 2007: The E-Learning Adventure. By Nicole Girard. TechNewsWorld. "Improvements in the processing power of personal computers combined with Internet delivery applications provide a tremendous opportunity for novel approaches to preparedness training. The power of virtual learning environments lies in creating 3-D spaces that give users a sense of learning by doing. ... A simulation-based training game designed to equip players with the ability to deal with crises in a military situation was developed by research and development firm Stottler Henke. The system -- developed by the Navy to train tactical action officers (TAO) -- allows players to train within a battlefield simulation. As officers second in command to the captain, they are the individuals who run the ship in a crisis situation. 'In real life, the captain commands a cadre of about 15 people,' Jim Ong, group manager for Stottler Henke, told TechNewsWorld. Ong leads the development of artificial intelligence-based systems for training, performance support and decision support. ... Instead of pressing buttons on a dashboard, the player is talking to a person through the use of automated speech recognition and speech synthesis provided by a tool called 'Symbionic.' It's an intelligence agent toolkit used to monitor the student's actions. It consists primarily of voice commands and questions and assesses whether or not the student is doing the right thing or not.... Stottler Henke specializes in turnkey applications and developing tools. Their main area of expertise is developing advanced training systems. Their core competition is artificial intelligence. 'We use [artificial] intelligence to make training more effective,' Ong said. 'A lot of simulations used by corporations tend to be pretty simple. Artificial entertainment games tend to have the intelligence of the characters that are in the game, the non-live characters.' The way to make the characters smarter is to enable the sim to automatically assess the students' performance, Ong said."
>>> Education, Military, Video Games, Speech, Applications

October 19, 2007: Newsmaker - Gates still finding his voice. By Ina Fried. CNET News.com. "Bill Gates has been saying for years that one day soon we will use handwriting, voice and touch to control our computers. He's still saying that. In an interview with CNET News.com, Gates talks about some of the ways that speech recognition has already made inroads and discusses some of the places it will eventually go. ... Q: When did you really first see the possibilities of voice? Was there a real early demo you saw years ago that sort of--you saw it and could really see the possibilities? Gates: Well, certainly the idea that computers should deal with voice has been around a long time. It's kind of a natural way to communicate. In the 1970s, DARPA was funding people, including people at Harvard, to do speech recognition. And so people kind of thought, hey, this should be easy to do. The dream of computers understanding voice goes way back. And the dream that the data network and the voice network would be one in the same goes way back as well. ... [Q:] What are some of the areas where you see voice going that people aren't necessarily thinking about today? Gates: To me, voice is in the broad realm of natural interface. ..."
>>> Speech, Natural Language Processing, Interfaces, Applications, Interviews

October 11, 2007: Researchers fine-tune F-35 pilot-aircraft speech system. By John Schutte. Air Force Materiel Command News. "When the first F-35 Lightning II rolls out in 2008, communications between pilot and aircraft will enter a new era thanks in part to testing and analysis conducted at the Air Force Research Laboratory's Human Effectiveness Directorate. The F-35 will be the first U. S. fighter aircraft with a speech recognition system able to 'hear' a pilot's spoken commands to manage various aircraft subsystems, such as communications and navigation. ... Currently pilots must press buttons, flip switches or glance at instruments for status information. The new system not only simplifies a pilot's workload but increases safety and efficiency, since pilots can remain focused on flying the aircraft and scrutinizing the combat environment. ... SRI International developed the DynaSpeak® speech recognition software as a highly accurate system for noisy environments, specifically for embedded devices like personal digital assistants, in-car navigation systems and avionics systems, Mr. Williamson said. It is speaker-independent, meaning a pilot can use it without first 'training' the system to his or her voice, which took up to an hour on previous experimental systems. SRI International is working with integrating contractor Adacel Systems, Inc., to tailor the system for the F-35 Joint Strike Fighter's airborne environment."
>>> Speech, Military, Applications

October 8, 2007: Aston algorithm. The Engineer Online. "A new study on the way humans perceive and organise speech could lead to better algorithms that will improve the accuracy of speech recognition systems. Researchers from Aston University hope the results of their EPSRC-funded project will indirectly lead to speech recognition systems that better target a person's voice in the midst of loud background noise, such as that found on the factory floor. The research could also offer significant improvements to hearing aids."
>>> Speech, Applications

October 2, 2007: Natural Language Understanding and Conversational Dialogue - A Different Kind of Self-Service Speech Recognition. By Stefania Viscusi. TMCnet. "For more insight into natural language understanding in speech technologies, I took some time to ask Luis Valles, Chief Scientist at GyrusLogic, some questions on the topic. [Q] What is Natural Language Understanding? [A] Natural Language Understanding (NLU) or Conversational Dialogue is the capability for a user to say and/or ask anything, and the system understanding what the user meant, together with the system finding an appropriate response -- as with any other conversation between humans. [Q] How is this deployed with Speech Recognition? ... [Q] Can you tell me a little about the solution you have developed to provide Natural Language Understanding capabilities? [A] GyrusLogic’s Platica product is patented artificial intelligence (AI) technology built with computational linguistic models for customers, employees or any other stakeholder to enter into a fully-automated conversational dialog. ..."
>>> Natural Language Understanding, Speech, Natural Language Processing, Customer Service, Business, Applications

September 13, 2007: IBM Research Demonstrates Innovative 'Speech to Sign Language' Translation System. IBM press release via Market Wire. "IBM (NYSE: IBM) has developed an ingenious system called SiSi (Say It Sign It) that automatically converts the spoken word into British Sign Language (BSL) which is then signed by an animated digital character or avatar. SiSi brings together a number of computer technologies. A speech recognition module converts the spoken word into text, which SiSi then interprets into gestures, that are used to animate an avatar which signs in BSL. ... This project is an example of IBM's collaboration with non-commercial organisations on worthy social and business projects. The signing avatars and the award-winning technology for animating sign language from a special gesture notation were developed by the University of East Anglia and the database of signs was developed by RNID (Royal National Institute for Deaf People). ... SiSi has been developed in the UK by a research team at IBM Hursley, as part of IBM's premier global student intern programme, Extreme Blue. In the European part of the programme, 80 of the most talented students from across Europe were selected to work on 20 projects and given whatever equipment, support and assistance they required. Working for an intense 12 week period alongside IBM technical and industry leaders, they focused on innovative technology projects, such as SiSi, all of which had real business value. ... For a video demonstration of the SiSi technology, visit the following url: http://youtube.com/watch?v=RarMKnjqzZU"
>>> Machine Translation, Assistive Technologies, Speech, Natural Language Processing, Applications, Internships (@ Resources for Students)

September 4, 2007: A high-tech helping hand for soldiers - A Lockheed Martin project could give them the tools to more easily provide reports directly from the battlefield. By Henry J. Holcomb. The Philadelphia Inquirer (philly.com). "For several years, Celeste L. Corrado has been thinking about, as she put it, 'soldiers coming back to base, tired and hungry after a long day on patrol,' to face the unpleasant but important task of filling out reports. Her team of scientists and engineers at the Lockheed Martin Advanced Technology Laboratories in Cherry Hill has come up with a way to change that scenario. Last week, they turned over a working prototype of their electronic solution to the U.S. Army Training and Doctrine Command, the architects of future warfare. ... Their working prototype is called WIRE, for Wearable Intelligent Reporting Environment. It takes mature speech-recognition technology - software that turns spoken words into documents - to the battlefield. Here's how it works. ... Instead of working with hours-old information, commanders will have fresh data for sophisticated computers and artificial intelligence - another technology being refined in the Cherry Hill labs."
>>> Military, Law Enforcement, Speech, Interfaces, Natural Language Processing, Applications

August 23, 2007: Aussies prefer robots to call centres. By Munir Kotadia. ZDNet Australia. "Australians would rather deal with a decent speech recognition system than an offshore call center agent, typically based in India or another part of Asia. Speech recognition technology has matured to a stage where it can be used to increase the efficiency of a call center and provide a better customer experience, according to research from Callcentres.net. ... [Nick] Buckle said using speech recognition provides companies with a better understanding of what their customers actually want--because it does not limit them to choosing from a set number of options."
>>> Customer Service, Speech, Applications

August 22, 2007: The Next Disruptors - Here are the 10 game-changing startups most likely to upend existing industries - and spawn new entrepreneurial opportunities. By Erick Schonfeld and Chris Morrison, Business 2.0 Magazine via CNNMoney.com. "Disruption is easy to spot - in hindsight. The railroads were always going to be better than canals and wagon trains. The telephone was bound to edge out the telegraph. ... In the following pages, we identify 10 businesses with the potential to rewrite the rules of existing industries or open up entirely new markets. ... THE DISRUPTOR: Blinkx | THE DISRUPTION: Web video search and ad insertion | THE DISRUPTED: Search engines and the TV ad business. ... Blinkx's special sauce - something even Google doesn't have - is software that can turn speech into text and count how many times a word pops up in a video. This is very useful to anyone selling targeted ads for, say, Junior Mints. Blinkx can also cluster videos together by topic."
>>> Information Retrieval, Speech, Machine Learning, Applications, Ethical & Social Implications

August 22, 2007: Firm listens to shrieks of structures. By Roberto Rocha. CanWest News Service | The Leader-Post. "The deadly collapse of an overpass in Laval, Que., and of a bridge in Minneapolis perked awareness of deteriorating structures on the continent and shone a light on new techniques for detecting flaws before they become tragedies. ... Montreal-based Tisec Inc. has harnessed artificial intelligence to 'listen' to a bridge's whispers of distress when tiny, hidden cracks expand under pressure. The technique is called acoustic emission testing, and it's been around since the 1960s. ... To make the process quicker, the company 'taught' computers to ignore all the background rumbling and honking on bridges. This lets a trained inspector do the detection without the presence of a highly specialized engineer. ... Tisec has been supported by Precarn Inc., an Ottawa non-profit company that helps small firms develop and commercialize intelligent technologies."
>>> Speech, Machine Learning, Applications

August 21, 2007: Vlingo rolls out speech recognition beta for mobile phones - Startup touts its Find speech recognition software as a 'breakthrough' for mobile phone users. By Grant Gross, IDG News Service | InfoWorld. "Other vendors offer speech recognition for mobile phones, but Vlingo's software doesn't limit users to a predefined list of words, like other vendors, said Vlingo CEO Dave Grannan, a former general manager at Nokia. Vlingo uses what it calls adaptive hierarchical language models to learn words, user speech patterns and accents, Grannan said. Vlingo compiles the information from users to improve the speech recognition accuracy, he said."
>>> Telecommunications, Natural Language Processing, Speech, Interfaces, Applications

August 2007 [issue date]: Playing It by Ear - A machine-listening system that understands three speakers at once. By Tim Hornyak. Scientific American (subscription req'd). "Japanese researchers [led by Hiroshi G. Okuno of Kyoto University] have spent five years developing a humanoid robot system that can understand and respond to simultaneous speakers. ... Such auditory powers mark a fundamental challenge in artificial intelligence - how to teach machines to pick out significant sounds amid the hubbub. This is known as the cocktail party effect...."
>>> Speech, Natural Language Processing, Robots

July 23, 2007: Bare-Bones Program Learns English and Japanese Vowels - Computer model learns vowel sounds infant-style: on the fly. By JR Minkel. Scientific American Science News. "A new computer model has learned to recognize vowel categories from multiple English and Japanese speakers without 'knowing' the number of vowels it is looking for or having a complete list of sounds to analyze, according to a new report. Instead, it gradually lumps vowels into distinct groups by considering them one at a time, reminiscent of how an infant might attend to sounds. The designers of the model say it is an early step toward improved voice recognition software and a better understanding of how the infant mind comes to recognize that the voices it detects are speaking one language and not another. 'We see this work as representing a movement towards thinking about language learning as an experience-dependent process,' says James McClelland, professor of psychology at Stanford University and co-author of the report appearing online in Proceedings of the National Academy of Sciences USA."
>>> Speech, Cognitive Science

July 16, 2007: The Future of Search - The head of Google Research talks about his group's projects. By Kate Greene. Technology Review. "Peter Norvig, Google's director of research, is an expert ace at building machines that answer tough questions. An authority in programming languages and artificial intelligence, he has written an oft-cited book on AI (Artificial Intelligence: A Modern Approach), has taught at the University of California, Berkeley, and the University of Southern California, and was the head of computational sciences at NASA. In 2001, Norvig came to Google to be the director of search quality. Four years later, he became Google's director of research, overseeing about 100 researchers who investigate topics that range from networking to machine translation. Technology Review spoke with Norvig to get a hint of what we can expect from search technology in the years to come. Technology Review: What does Google Research do? ... TR: What are the outstanding problems in search? ... TR: Your expertise is in artificial intelligence. Isn't Google, at its core, an artificial-intelligence company using machine-learning algorithms to search the Web, recognize speech, and match advertising with keywords? ..."
>>> Information Retrieval, Machine Translation, Speech, Natural Language Processing, Telecommunications, Machine Learning, Interviews, Applications

June 12, 2007: More-Accurate Video Search -- Speech-recognition software could improve video search. By Kate Greene. Technology Review. "Boston-based startup EveryZing has launched a search engine that it hopes will change the way that people search for audio and video online. Formerly known as PodZinger, a podcast search engine, EveryZing is leveraging speech systems developed by technology company BBN that can convert spoken words into searchable text with about 80 percent accuracy. ... The idea of using audio transcripts to search for multimedia has been around in research labs for decades, and basic speech-recognition research dates back even earlier. Much of the seminal work occurred at BBN, MIT, Carnegie Mellon University, IBM, and SRI International. ... Using probabilistic machine learning algorithms, the system takes one minute to convert each minute of audio content into text. The second part of the technology, says [CEO Tom] Wilde, is the algorithms that process the content of the text."
>>> Speech, Natural Language Processing, Machine Learning, Information Retrieval, Applications

June 7, 2007: Are you talking to me? Speech recognition - Technology that understands human speech could be about to enter the mainstream. The Economist Technology Quarterly. "Speech recognition has taken a long time to move from the laboratory to the marketplace. Researchers at Bell Labs first developed a system that recognised numbers spoken over a telephone in 1952, but in the ensuing decades the technology has generally offered more promise than product, more science fiction than function. ... Optimistic forecasts from market-research firms also suggest that the technology is on the rise. ... An area of great interest at the moment is in that of voice-driven 'mobile search' technology, in which search terms are spoken into a mobile device rather than typed in using a tiny keyboard. ... The resulting lower cost and greater reliability mean that speech-based systems can even save companies money. Last August, for example, Lloyds TSB, a British bank, switched all of its 70m annual incoming calls over to a speech-recognition system based on technology from Nuance and Nortel, a Canadian telecoms-equipment firm. ... Another promising area is in-car use. ... There are military uses, too. ..."
>>> Speech, Telecommunications, Customer Service, Machine Translation, Natural Language Processing, Military, Transportation, Applications, The AI Effect, History, Industry Statistics

April 21, 2007: The Machine's Got Rhythm - Computers are learning to understand music and join the band. By Julie J. Rehmeyer. Science News Online (from Science News, Vol. 171, No. 16, April 21, 2007, p. 248). "Until recently, computers have had little insight into music. They've merely recorded it, stored it, and offered tools that people can use to produce or manipulate it. But now, researchers are teaching computers to recognize the basic musical elements: beat, rhythm, melody, harmony, tempo, and more. Computers with those skills are becoming musical collaborators. 'Technology is changing our sense of what music can be,' [Christopher] Raphael says. 'The effect is profound.' ... Raphael, an informatics researcher at the University of Indiana in Bloomington, compares the problem to speech recognition. 'There's been a veritable army of people who've worked on speech recognition for several decades, and [the problem] still remains open,' he says. 'Any time you deal with real data, there is a huge amount of variation that you have to understand.' ... Every year, various transcription programs go head-to-head in a competition called MIREX (Music Information Retrieval Exchange). The researchers set their programs loose on the same pieces of music and then compare results. This September, when the competition takes place in Vienna, it will for the first time include full transcriptions of polyphonic music, in which multiple notes are playing at the same time. ... Even as researchers continue to refine transcription methods, the work is spinning off remarkably useful tools. ... Some of the simplest are programs that display supertitles at the opera at just the right moment or that automatically turn the page for musicians. ... Score-alignment technology opened the door for Raphael to develop his computerized-accompaniment program. ... Raphael presented the system in Boston last July at a conference of the Association for the Advancement of Artificial Intelligence."
>>> Music, Speech, Uncertainty/Probability, Applications, AAAI-06, Proceedings of the Twenty-First National Conference on Artificial Intelligence (Intelligent Systems Demonstrations: Demonstration of Music Plus One --- A Real-Time System for Automatic Orchestral Accompaniment)

April 18, 2007: Surfing TV on the Internet - Video-search company Blinkx is offering a new, easy tool for finding full-length TV shows online. By Kate Greene. Technology Review. "[Suranga] Chandratillake believes that his company has figured out a better way to search for videos across the Web and for TV shows in particular. The Blinkx search engine uses speech-recognition technology in addition to standard metadata and surrounding text searches. For each video that the Blinkx engine encounters, it extracts audio information--strings of phonemes--that it uses to create a searchable index of words. The recognition system assembles these phonemes into words by taking into account which words typically appear in which contexts; 'sail' might appear with 'boat,' for instance. Also, the system uses all other information, from metadata to surrounding text, that provides clues as to how the phonemes fit together."
>>> Information Retrieval, Speech, Natural Language Processing, Applications

April 6, 2007: New technology lets you read your voice mail - Several companies are betting on voice-recognition applications that transcribe those rambling messages into e-mail or text messages. By Marguerite Reardon. CNET News.com. "Why listen to your voice mail messages when you can read them? That's what a new crop of companies is asking--they're developing software that turns voice mail messages into transcribed e-mail or text messages. ... One indication that voice-recognition technology is getting hot is the recent Microsoft/Tellme deal. In March, Microsoft said it would buy privately held speech-recognition maker Tellme Networks in a deal believed to be in the range of $800 million. Tellme recently started testing a cell phone application that allows people to say out loud the information they are looking for and have data sent to their phone. 'Voice is still the killer application for any phone,' said Charles Golvin, an analyst with Forrester Research. 'And it is underappreciated as an opportunity and underutilized for development of new services. Carriers can use voice applications to drive data-oriented experiences.' ... [Jill Aldort] agrees that voice recognition services are going to be hot, especially services like the one offered by Tellme, which can help people find information on the fly."
>>> Speech, Telecommunications, Natural Language Processing, Applications

March 12, 2007: Gained in translation - New trainer will aid Army intelligence analysts. Federal Times. "The Army’s Program Executive Office for Simulation, Training and Instrumentation and General Dynamics C4 Systems are developing a simulator that allows intelligence personnel to work with virtual translators and talk with virtual people programmed to act like the real ones. The new trainer, called a human intelligence control cell (HCC), is intended to incorporate the latest advancements in speech recognition, speech synthesis and artificial intelligence. ... The computer’s speech recognition component translates the soldier’s questions to digital text, which is then processed by an artificial intelligence engine, which comes up with a response. That response is translated into the appropriate language, 'spoken' by the virtual local and 'translated' by the virtual translator."
>>> Speech, Machine Translation, Natural Language Processing, Military, Applications

March 9, 2007: Introducing the blogbot. The Engineer Online. "NEC Corporation has developed a system for automatically creating multimedia blogs by talking to a speech recognition enabled robot which helps illustrate the day’s events. ... It integrates large-vocabulary, continuous speech recognition technology, which converts the speech content into text and extracts important keywords, and natural language text retrieval technology, which enables searching of contents on the internet."
>>> Natural Language Processing, Speech, Information Retrieval, Applications

February 25, 2007: Millions of Videos, and Now a Way to Search Inside Them. By Jason Pontin. The New York Times. "[S]earch engines -- like Google -- that were developed during the first, text-based era of the Web do a poor job of searching through this rising sea of video. That’s because they don’t search the videos themselves, but rather things associated with them, including the text of a Web page, the 'metadata' that computers use to display or understand pages (like keywords or the semantic tags that describe different content), video-file suffixes (like .mpeg or .avi), or captions or subtitles. None of these methods are very satisfactory. Many Internet videos have little or obscure text, and clips often have no or misleading metadata. Modern video players do not reveal video-file suffixes, and captions and subtitles imperfectly capture the spoken words in a video. The difficulties of knowing which videos are where challenge the growth of Internet video. 'If there are going to be hundreds of millions of hours of video content online,' [Suranga Chandratillake, a co-founder of Blinkx] said, 'we need to have an efficient, scalable way to search through it.' ... Mr. Chandratillake’s solution does not reject any existing video search methods, but supplements them by transcribing the words uttered in a video, and searching them. This is an achievement: effective speech recognition is a 'nontrivial problem,' in the language of computer scientists. Blinkx’s speech-recognition technology employs neural networks and machine learning using 'hidden Markov models,' a method of statistical analysis in which the hidden characteristics of a thing are guessed from what is known. Mr. Chandratillake calls this method 'contextual search'...."
>>> Information Retrieval, Speech, Natural Language Processing, Uncertainty/Probability, Markov (@ Namesakes), Applications

February 20, 2007: Talking bathrooms; System helps Alzheimer's patients look after own hygiene. By Sheryl Ubelacker. Canadian Press / available from the Welland Tribune / also available from globeandmail.com (Artificial intelligence to help dementia sufferers | February 23, 2006). "[R]esearchers at the Toronto Rehab Institute are working on artificial intelligence systems - including a 'smart bathroom' - that they hope will one day help people with Alzheimer's and other forms of dementia live more independent lives in their own homes. 'Often when a person gets moderate to severe levels of impairment, they are taken out of their home and put into a care facility,' said lead researcher Dr. Alex Mihailidis, a mechanical and biomedical engineer at Toronto Rehab. 'We are using artificial intelligence to support aging-in-place so that people can remain in their homes for as long as possible.' ... Mihailidis and his team have also designed prototype technology to detect when a person has fallen."
>>> Assistive Technologies, Vision, Speech, Applications

February 11, 2007: 'Intelligent' homes designed to help the elderly. CTV.ca. "Scientists in Toronto say they are developing artificially intelligent computer systems to help elderly people suffering from memory loss stay safely in their own homes. 'Often when a person gets moderate to severe levels of impairment, they are taken out of their home and put into a care facility,' says lead scientist Dr. Alex Mihailidis, a mechanical and biomedical engineer and researcher at Toronto Rehabilitation Institute. 'We are using artificial intelligence to support aging-in-place so that people can remain in their homes for as long as possible.' ... 'Our systems are not intended to replace professional or family caregivers. However, the results from our studies are encouraging and show that the use of artificial intelligence in a home setting can provide safety and security and enhance the quality of life for older adults who would like to remain in their homes as they age,' Mihailidis said. This could also assist those giving informal care to loved ones."
>>> Assistive Technologies, Smart Houses & Rooms, Vision, Speech, Applications

February 6, 2007: Q&A: Suranga Chandratillake - A cofounder of Blinkx explains why Internet video matters and how his company can contribute to its growth. By Jason Pontin. Technology Review. "Last week, at Demo07, an annual conference that showcases new technologies and startups, Suranga Chandratillake, a cofounder and co-CTO of Blinkx (pronounced 'blinks'), was voted 'Demo God' by the show's attendees. The crowd was impressed not only by Chandratillake's intelligence, but also by Blinkx's technology, which allows users to search more than seven million hours of Internet video to find exactly the clip they want. ... Chandratillake's technique employs speech recognition, neural networks, and machine learning to create transcripts of the world's videos; then, the words spoken in the videos can be searched. The method creates much more relevant video-search results. ... TR: If video now constitutes 60 percent of Internet traffic (with some estimates saying that figure will rise to 90 percent within the decade), how much of that content is now searchable using Blinkx? Could you compare that with your competitors in video search, please? SC: Blinkx is content and source agnostic, which means that we're working to index all video content, wherever it exists on the Web, which makes us the biggest video-search engine. ..."
>>> Information Retrieval, Speech, Neural Networks, Machine Learning, Applications, Interviews

January 25, 2007: Artificially intelligent homes for Alzheimer's patients coming: scientists. CBC News. "Scientists in Toronto are developing an artificial intelligence system that would help people with Alzheimer's disease or other cognitive impairments live safely at home. The Toronto Rehabilitation Institute is working with University of Toronto researchers to make home-based computer systems that would assist elderly people with memory loss in living independently. ... [L]ead scientist Alex Mihailidis said in a written statement. 'We are using artificial intelligence to support aging-in-place so that people can remain in their homes for as long as possible.' ... The researchers have also created a home emergency alert system that uses ceiling-mounted cameras linked to computers running image analysis software to determine whether a person has fallen down. It would then ask whether he or she needs help and use a voice-recognition system to process a response. ... The researchers say they are the first in the world to test home-based artificial intelligence systems in clinical trials."
>>> Assistive Technologies, Vision, Speech, Applications

January 23, 2007: Sentimental Journey. New computer software applications -- in the labs and in the market -- are using emotion as data input and responding to it. 'How does that make you feel?' asked the computer. By Esther Schindler. CIO. "Many science-fiction stories begin with a premise of computers gaining sentience, self-awareness, or the ability to feel -- or fake -- emotion. In these utopian (or sometimes, dystopian) stories, humanity demonstrates its underlying assumption that 'being human' means 'feeling emotion.' Yet, for business purposes, it isn't necessary for a computer to emote -- as long as it can respond to our emotions. We want companies (and the systems they build, whether silicon- or carbon-powered) to acknowledge and respect our feelings, particularly when those feelings are strongly felt. Enterprises are starting to see good dollars-and-cents reasons to take action on emotion. ... The intent isn't to create an empathic artificial intelligence that experiences emotion. In these applications, the software analyzes human behavior and helps humans to make better business decisions. ... NICE software detects emotion from both the content and audio behavior. ... Corpora's Sentiment doesn't deal with spoken words; it examines print. The software employs natural language processing to determine the 'document level author sentiment' of a text document. ... The research community HUMAINE (Human-Machine-Interaction Network on Emotion) started in 2004, intending to lay the foundation for software development to register, model and/or influence human emotional and emotion-related states and processes, which they call 'emotion-oriented systems.' Among its goals is working toward a standard markup language. In 2006, the W3C created an Emotion Incubator Group for discussing standardization. ... One example of emotion-based computing is eMoto, a joint project between the Swedish Institute of Computer Science and Stockholm University/KTH. eMoto is a mobile messaging service for sending and receiving 'affective messages'...."
>>> Customer Service, Interfaces, Speech, Natural Language Processing, Emotion, Representation, Telecommunications, Neuroscience, Cognitive Science, Systems, Applications

January 10, 2007: After Years of Effort, Voice Recognition Is Starting to Work. By Lee Gomes. The Wall Street Journal (page B1). "So maybe you won't be talking to your car anytime soon, the way Microsoft and Ford would like you to be. Odds are, though, that you are already on speaking terms with silicon, probably more than you realize. And you can expect to be chatting it up more and more. Almost since computers were invented, computer scientists have been working to get the machines to understand what people are saying to them. Until the past few years, they hadn't been successful enough to offer anything but lab demos. Now, though, computer speech recognition is sufficiently advanced that it is showing up in a surprising variety of places. Like automobiles. ... While voice-controlled computers are sci-fi staples, in practice most people find a keyboard and a mouse are fine for telling a PC what to do. Bill Meisel, a veteran observer of the speech-recognition market, says the main use of speech recognition at the moment is in specialized applications like law and medicine. Radiologists, for example, are increasingly dictating their diagnoses and observations into a speech-recognition program rather than into a tape recorder that must later be transcribed. At its core, speech recognition takes advantage of extraordinarily complex statistical methods to match the sounds you say with the right words. ... One of the biggest applications of the technology is in call centers. ... David Nahamoo, who oversees IBM's speech research, says that some other new applications are already at hand. One is a system that produces automatic translations of foreign-language broadcasts, such as those in Arabic, first by performing speech recognition of the spoken words and then by using translation software to render things in English."
>>> Speech, Natural Language Processing, Transportation, Marketing & Customer Service, Machine Translation, Assistive Technologies, Applications

January 8, 2007: Ford-Microsoft software unveiled. BBC News. "Microsoft and Ford have unveiled a system to enable voice-activated music and telephone calls for car drivers. ... Drivers will be able to say contacts' names in English, French or Spanish, or tell the car which song they want to hear from their MP3 player."
>>> Transportation, Speech, Natural Language Processing, Applications

January 3, 2007: Hearing Machines - While hearing in machines lags far behind vision in machines, the potential is great, and researchers are beginning to make impressive progress. By Paris Smaragdis. Technology Review. "Understanding how we perceive the world, and using that knowledge to make machines that can mimic us, has been an ongoing and exciting scientific quest. Vision has had the lion's share of attention in the field. Our understanding of image structure and form is well developed. The development of machine learning and artificial intelligence (AI) has immensely benefited from--and has been immensely influenced by--vision problems. And we all understand why computers and ATM machines come equipped with cameras nowadays. The rest of the senses have not been investigated as much as vision has. Having a machine exhibit hearing is not something that people think about. Sure, computers can (sort of) recognize speech, but is that all hearing is good for? Surely we do more with our ears than just hear other people talk. ... Machines can do their own set of valuable hearing tasks. They can listen for survivors in a collapsed building's rubble; they can help soldiers locate who shot at them; they can listen for breathing problems in patients in intensive care; and they can try to filter out that annoying neighbor who loves to sing really loudly in the shower."
>>> Speech, Music, Natural Language Processing, Applications

December 19, 2006: Pluggd: A Google for Podcasts. By Eliot Van Buskirk. Wired News. "Pluggd has found a way to index podcasts, talk shows and other spoken-word content. The company's service then allows users to search the audio files for specific words. ... Acoustic speech recognition technology has been around for years -- companies such as Podzinger, blinkx and Podscope have used it to build similar products -- but it's only part of the Pluggd picture. To understand Pluggd's potential, you need a quick peek under the hood. ..."
>>> Information Retrieval, Speech, Ontologies, Applications, Natural Language Processing

December 5, 2006: Nuance enhances mobile speech recognition. By Liz Tay. PC World. "To enable 'natural language processing', Nuance's speech recognition technology analyses a collection of utterances from local call centre data in Australia, the US, and the UK. Using statistical and semantic language modeling, the system compares the data against what is said, to decide on the most probable function the user is trying to perform. 'It's semi-artificial intelligence,' Chidiac said. 'You can say exactly what you want, and the system will route you to the right area, or the right person and so on. You don't have to go through the various prompts.'"
>>> Natural Language Processing, Speech, Telecommunications, Applications

November 27, 2006: Smart Spaces - If These Walls Could Talk. They may do that and more if the promise of smart spaces is ever realized. The technology is available, but cost and other factors remain obstacles. By Gary Anthes. Computerworld. "It’s fun to think about these scenarios, but we rarely encounter them in the real world. Who besides Bill Gates lives in an environment in which IT senses and responds to the behavior of the people in it? Your PC knows you haven’t touched it for 30 minutes, so it turns on the screensaver. That’s about it. Yet the technology to make our environments smarter and more responsive to our needs largely exists. Sensors of all types, actuators, radio frequency identification (RFID) tags, large touch-screen displays, digital cameras, personal software agents, machine-learning algorithms, voice- and image-recognition software, even robots -- these aren’t just the dreams of science fiction writers anymore. The impediments to the widespread deployment of smart spaces lie elsewhere -- in the form of problems related to cost, interoperability, accuracy and reliability. And there are the social and cultural challenges. ... Still, progress is being made. For example, researchers at Stanford University have invented a collaboration space called the iRoom, or 'interactive room.'"
>>> Smart Rooms, Machine Learning, Vision, Speech, Ethical & Social Implications, Systems, Interfaces, Applications

November 19, 2006: Quest for last word in search. By Paul Durman. Sunday Times Online. "Paul Durman visited Google’s HQ in California to hear how we will one day be able to find the answers we need just by asking our mobile phone. ... Peter Norvig, former head of computational sciences at Nasa with a liking for colourful shirts, is one of the world’s leading authorities on artificial intelligence. He used to think he was working in a dry, technical area -- but then it turned out he was in the 'search' business. 'It’s amazing,' said Norvig, now director of research at Google, where he has been for the past five-and-a-half years. 'I came here because I thought it was an interesting place from a technical point of view. I was working in this very narrow, specific domain. Now it’s this thing that everybody in the world knows about.' ... 'We are at the very beginning of search,' said Norvig. He said his colleagues were 'disappointed' that most searches still start by typing a couple of words into a box on a web page. ... Google is also working on speech-recognition technology so that, within a few years, you will simply be able to 'tell' your mobile phone what you are looking for, and Google will go off and find it."
>>> Information Retrieval, Natural Language Processing, Speech, Applications

November 18, 2006: Big brother is listening to you. New Scientist (Issue 2578). "[S]urveillance cameras in the city of Groningen have been adapted to listen out for voices raised in anger. Microphones attached to the cameras feed the sound signals to software that can detect voices that are aggressive in tone."
>>> Law Enforcement, Speech, Vision, Applications

November 14, 2006: 'Audio telescope' could save planes from birds. By Tom Simonite. NewScientist.com news. " ... Stanford's research usually focuses on so-called 'smart spaces', in which people are identified and tracked using cameras and microphones. Software used to identify individuals by voice has been modified by Stanford, NIST colleagues and researchers from US firm Intelligent Automation, also in Maryland, US, so that it can distinguish between bird species instead."
>>> Speech, Applications, Smart Houses & Rooms

November 13, 2006: Tech Solutions to Iraqi-U.S. Language Barrier. Xeni Jardin's Xeni Tech report for NPR's Day to Day. [Audio available.] "Part of the daily struggle for soldiers and Marines in Iraq is communicating with civilians and suspected insurgents. Few military personnel have enough fluency with Iraqi Arabic to be easily understood, and field translators are in short supply. But technology may help close that communications gap. A hand-held voice translator device developed by Integrated Wave Technologies, already in use in other parts of the world, converts simple English commands into Iraqi Arabic or 15 other languages."
>>> Machine Translation, Speech, Military, Natural Language Processing, Applications

November 5, 2006: Top minds taxed by translation challenge - Creating a real-time translating machine is harder than it seems. By Brian Bergstein. The Associated Press / available from MSNBC.com. "The past few years have shown that U.S. government intelligence goes only so far. One of the biggest challenges is recognizing vital information in foreign languages -- and acting quickly on it. That's why the military would love software that can listen to TV broadcasts or phone conversations and read Web sites in Arabic and Chinese, translate them into English and summarize the key elements for humans. ... Last year DARPA launched a project that aims to create that real-time translation software. It’s called GALE, for Global Autonomous Language Exploitation. And on top of GALE’s technical challenges, DARPA added some twists. It hired three teams of researchers to chase the problem for up to five years. Each year, their progress would be evaluated, and the worst-performing team could be eliminated. Or the program could be shut down entirely. ... DARPA wants translations not only from such controlled, well-articulated sources. GALE incorporates man-on-the-street interviews and raucous colloquial chats on the Web. That’s where things get tricky. Background noise, dialects, accents, slang, short words like 'on' or 'of' that most speakers don’t bother to clearly enunciate -- these are the stuff of nightmares for speech-recognition and machine-translation engineers."
>>> Machine Translation, Speech, Military, Natural Language Processing, Applications

October 26, 2006: It's the next best thing to a Babel fish. By Celeste Biever. New Scientist (Issue 2575: page 32). "Imagine mouthing a phrase in English, only for the words to come out in Spanish. That is the promise of a device that will make anyone appear bilingual, by translating unvoiced words into synthetic speech in another language. The device uses electrodes attached to the face and neck to detect and interpret the unique patterns of electrical signals sent to facial muscles and the tongue as the person mouths words. The effect is like the real-life equivalent of watching a television show that has been dubbed into a foreign language, says speech researcher Tanja Schultz of Carnegie Mellon University in Pittsburgh, Pennsylvania. ... Their secret is to detect not just words but also the phonemes that form the building blocks of words. ... The researchers use software that has been taught to recognise which phonemes are most likely to appear next to each other and in what order."
>>> Machine Translation, Speech, Natural Language Processing, Applications

October 24, 2006: Computer Beats Fastest Text Messenger. By Travis Reed. The Associated Press / available from Examiner.com. "Ben Cook's fingers flurried so fast you couldn't see what he was doing until he had done it. But when the cell-phone screens cleared, the world's fastest text messenger was handed his first head-to-head defeat Tuesday: a voice-recognition computer had bested his record time on a complicated 27-word message. ... [Michael] Thompson couldn't say how much the service would cost consumers because it will likely vary by carrier. He said it'll be available in some new phones, but existing phones can download software for use as well. Nuance envisions it as a tool for drivers and others who want to send text messages, instead of calling or leaving a voice mail, but don't have time to sit and type."
>>> Speech, Natural Language Processing, Applications

October 11, 2006: Interview with W. Lewis Johnson, Founder of Alelo. By Ben Kuo. socalTECH.com. "We recently ran across Alelo (www.alelo.com), a startup and University of Southern California spinout developing interactive games used by the military. The company's very engaging, interactive 3D role playing games teach languages like Arabic and Pashto to troops being deployed to the Middle East. Using speech recognition and other technology, the titles teach foreign languages to players as they go through the game in simulated environments like Iraq. We spoke with Dr. W. Lewis Johnson, CEO of Alelo, about the firm's technology and plans. Ben Kuo: Tell us a little bit about Alelo, and what the company and product does? W. Lewis Johnson: Our overall company is called Alelo, and we also have a government subsidiary called Tactical Language Training LLC which develops projects for the government. We develop interactive products for teaching foreign languages and cultures using video game and artificial intelligence technology. We're a spinoff of the University of Southern California, where I am a research professor in Computer Science. ... Ben Kuo: Where's the idea come from for using games to teach a language? W. Lewis Johnson: I have been doing research for several years in what we call pedagogical agents - animated characters that can act as guides and tutors.  ... Ben Kuo: Can you talk about the user experience with the language recognition? I was impressed by the fact that the software recognizes what you say in Arabic. ..."
>>> Intelligent Tutoring Systems, Video Games, Agents, Speech, Natural Language Processing, Interfaces, Education, Military, Applications, Interviews

September 29, 2006: Two tickets? Certainly sir. By Anna Salleh. ABC Science Online. "Talking, thinking cartoon faces that understand our needs may one day improve our chances of getting what we want from computers, say experts. The animated faces may replace the mouse, keyboard and the touch screen as the main way we interact with computers, say when booking tickets or withdrawing money from the bank. Professor Denis Burnham, director of the MARCS Auditory Laboratories at the University of Western Sydney, says a talking, thinking head could be available in the next 10 years. ... Burnham is heading up a new A$3.4 million (US$2.5 million) project, funded by the Australian Research Council and the National Health and Medical Research Council, to develop a computer-generated head that emulates face-to-face conversation. Burnham's team will use technologies such as computer animation, speech recognition and computer-generated dialogue to construct the talking, thinking head. And the researchers will use cognitive science to evaluate and improve how well it communicates. ... Burnham hopes his team's talking, thinking head will make our conversations with artificially intelligent machines more effective for two reasons. ..."
>>> Speech, Natural Language Processing, Interfaces, Cognitive Science, Applications

September 27, 2006: HAL, the Computer: Episode 1859 in The Engines of Our Ingenuity radio series. Written & hosted by John Lienhard. Produced by KUHF-FM Houston. "Today, our guest, scientist Andrew Boyd discusses a legendary movie figure. The University of Houston presents this series about the machines that make our civilization run, and the people whose ingenuity created them."
>>> Science Fiction, Speech, Natural Language Processing

September 14, 2006: Surfing the Internet for Spoken Words - New Technology Allows Searchers to Scour Online Audio, Video to Target Advertising. By William M. Bulkeley. The Wall Street Journal. "One of the charms of Internet video and audio is that Web sites featuring such offerings are largely free of the advertising cluttering television and radio. That may be about to change. Several small companies are starting to pitch advertising links using their software that will search every word spoken in Web-borne video soundtracks or Internet audio programs known as podcasts. The new technology, from companies including Podzinger Inc., TVEyes Inc. and Blinkx Inc., uses voice-recognition software to translate spoken words into text or audio-wave forms that can then be searched. ... Podzinger, of Cambridge, Mass., which provides audio search on its site and for some partners, says the ability to find words in videos fills a huge gap. 'Audio and video have been a black space that cannot be discovered by traditional search engines,' says Alex Laats, Podzinger's chief executive."
>>> Speech, Information Retrieval, Applications

September 13, 2006: New Call Center App Connects Data, Agents. By Erika Morphy. CRM Buyer. "A new product from eTalk, Qfiniti Assist, provides contact center agents with knowledge management support while they are on the line with a customer. The tool combines speech recognition and enterprise search technology to identify the information the agent needs in order to answer a client's question. It then automatically routes that information to the agent's desktop during the call."
>>> Speech, Customer Service, Knowledge Management, Information Retrieval, Applications

August 24, 2006: Googling Your TV - Prototype software from Google Research could listen to your TV and send back useful information -- and ads of course. By Wade Roush. Technology Review. "A system recently outlined by researchers at Google amounts to personalized TV without the fancy set-top equipment required by previous (and failed) attempts at interactive television. Their prototype software, detailed in a conference presentation in Europe last June, uses a computer's built-in microphone to listen to the sounds in a room. It then filters each five-second snippet of sound to pick out audio from a TV, reduces the snippet to a digital 'fingerprint,' searches an Internet server for a matching fingerprint from a pre-recorded show, and, if it finds a match, displays ads, chat rooms, or other information related to that snippet on the user's computer. ... When word of the research first appeared in the media, some bloggers and other technology watchers reacted with horror; many assumed that the background conversation picked up by the microphone in Google's system would be uploaded to Google. But the technology makes it impractical; at four bytes, the fingerprints don't contain enough information to reconstruct the original sounds in a room. 'Some people did get the impression that we had an open microphone that was going to listen in on them,' says [Peter] Norvig. 'Clearly, that was not what we were doing. We are transmitting a key that can be matched but not reversed. That said, users are giving up some information -- and that's something they have to decide about.'"
>>> Marketing, Speech, Vision, Applications
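
Editor's illustration for the item above: the excerpt describes reducing an audio snippet to a small fingerprint that "can be matched but not reversed." The sketch below is not Google's method, which the excerpt does not specify. It swaps in a simple, well-known idea (one bit per comparison of adjacent frequency-band energies) to show why such a key can be looked up yet cannot reconstruct the sound. The 33-band input, the function names, and the `max_distance` threshold are all assumptions made for illustration.

```python
def fingerprint(band_energies):
    """Pack 32 adjacent-band comparisons into one 32-bit (four-byte) integer."""
    if len(band_energies) != 33:
        raise ValueError("expected 33 band energies for a 32-bit key")
    bits = 0
    for left, right in zip(band_energies, band_energies[1:]):
        # Keep only whether the energy falls from one band to the next.
        bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two keys differ."""
    return bin(a ^ b).count("1")


def best_match(query, index, max_distance=4):
    """Return the name of the closest stored key, or None if nothing is close."""
    if not index:
        return None
    name, stored = min(index.items(), key=lambda item: hamming(query, item[1]))
    return name if hamming(query, stored) <= max_distance else None


# Hypothetical band energies standing in for a pre-recorded show.
show_a = [float((i * 7) % 11) for i in range(33)]
index = {"show_a": fingerprint(show_a)}

# The same snippet heard twice as loud yields the same key, so the key
# alone cannot reveal which of the two sounds produced it.
louder = [2.0 * e for e in show_a]
```

Looking up `fingerprint(louder)` in `index` with `best_match` finds `"show_a"`, because scaling every band leaves the comparisons unchanged. Many different sounds collapse to one key, which is the sense in which such a key is matchable but not reversible.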

August 18, 2006: Spying an intelligent search engine - Innovation in Web search using artificial intelligence may lead to the day when you could expect the Web to do the tedious tasks for you. By Stefanie Olsen. CNET News.com. "Search is like oxygen for many people now, and considering Google's breakthroughs in Web document analysis, supercomputing and Internet advertising, it can be easy to think this is as good as it gets. But some entrepreneurs in artificial intelligence (AI) say that Google is not the end of history. Rather, its techniques are a baseline of where we're headed next. For example, one day people will be able to search for the plot of a novel, or list all the politicians who said something negative about the environment in the last five years, or find out where to buy an umbrella just spotted on the street. Techniques in AI such as natural language, object recognition and statistical machine learning will begin to stoke the imagination of Web searchers once again. 'This is the beginning for the Web being at work for you in a smart way, and taking on the tedious tasks for you,' said Alain Rappaport, CEO and founder of Medstory, a search engine for medical information that went into public beta in July. 'The Web and the amount of information is growing at such a pace that it's an imperative to build an intelligent system that leverages knowledge and exploits it efficiently for people,' he added. ... Rappaport said one of the more recent progressions in AI has been in moving from relying on humans to catalog connections between various data to programming computers to do the work, or what he calls the automation of knowledge structure. Tom Mitchell, chair of machine learning at Carnegie Mellon University calls it machine learning for statistical language processing, or learning algorithms that allow computers to read text. ... Technologies like speech recognition will fuel advances. ... 
The field of AI called computer vision, which encompasses facial detection and recognition, is coming of age for several reasons."
>>> Information Retrieval, Natural Language Processing, Machine Learning, Vision, Speech, Applications

August 14, 2006: A Sentinel to Screen Phone Calls - New software could block voice spam. By Duncan Graham-Rowe. Technology Review. "A system for automatically screening phone calls has been developed by researchers at Microsoft. It works by analyzing characteristics of a caller's voice and word usage to figure out how urgent a call is and whether the caller is a friend, family member, colleague, or stranger. Then the call can be either put through or sent to voice mail. Called V-Priorities, the system was originally part of a larger effort to ensure that urgent calls get through when an individual is busy or in a meeting. But, according to Eric Horvitz, a senior researcher at Microsoft Research in Redmond, WA, who created the system, it could also prove to be useful for filtering the growing number of spam phone calls. In preliminary tests, the prototype system was 90 percent accurate at judging whether or not calls were unsolicited, says Horvitz. Similarly, its ability to judge the personal 'closeness' of the caller was 84 percent accurate, while it could distinguish business calls from personal calls 75 percent of the time."
>>> Speech, Natural Language Processing, Machine Learning, Telecommunications, Filtering, Interfaces, Applications

August 8, 2006: Smarter call centre automation for public administrations. IST Results. "Residents of Barcelona, Turin and the Camden district in London will soon be talking to a computer when they call their local council for information or carry out a phone transaction. But instead of receiving scripted responses they will be conversing with a state-of-the-art platform that enables natural dialogue. Developed by the IST-funded HOPS project, the platform uses a variety of technologies to enable people to talk to a computer over the phone as if they are talking with a human call centre worker. ... The HOPS project has managed to make human-machine dialogue more natural and fluid by merging voice technologies such as Automatic Speech Recognition (ASR) and Text to Speech (TTS) with natural language processing technologies to understand, interpret and respond to callers. These components are then tied into a data management system incorporating Semantic Web technology for finding and extracting the information sought by users."
>>> Natural Language Processing, Speech, Customer Service, Knowledge Management, Applications

July 20, 2006: Like Having a Secretary in Your PC. By David Pogue. The New York Times. "The software I’m using is Dragon NaturallySpeaking 9.0 (www.nuance.com), the latest version of the best-selling speech-recognition software for Windows. This software, which made its debut Tuesday, is remarkable for two reasons. Reason 1: You don’t have to train this software. ... The software instantly corrects the word, learns from its mistake and deposits your blinking insertion point back at the point where you stopped dictating, ready for more."
>>> Speech, Natural Language Processing, Applications

June 12, 2006: Talking PCs? Talk to the hand. By Nick Hampshire. ZDNet UK. "Voice recognition and speech synthesis technologies may not have developed to the degree some science fiction writers hoped, but have nevertheless seen some startling successes. ... Voice synthesis has been around for a long time. Bell Labs demonstrated a computer-based speech synthesis system running on an IBM704 in 1961, a demonstration seen by the author Arthur C. Clarke, giving him the inspiration for the talking computer HAL9000 in his book and film '2001: A Space Odyssey'. Forty-five years later, voice synthesis technology can be found in products as diverse as talking dolls, car information systems and various text-to-speech conversion services such as the one recently launched by BT. Many of these modern systems can convert text into a computer synthesised voice of quite respectable quality. ... Voice recognition has turned out to be a much harder task than researchers realised when work began on the problem over forty years ago. However, limited voice recognition applications are starting to creep into everyday use, voice input telephone menu systems are now commonplace, speech-to-text dictaphones are increasingly used for note-taking by doctors and lawyers, and voice input has started to appear in computer games systems. The success of some of these limited-application voice recognition systems has recently prompted the big software heavyweights, Microsoft and IBM, to make further investments. ... However, there are still a lot of technological hurdles to overcome; to understand what these are, we need to delve further into the technology. ... Speech recognition - Speech recognition, on the other hand, is a much harder task, and commercial off-the-shelf systems have only been available since the 1990s. 
Because every person's voice is different, and words can be spoken in a range of different nuances, tones and emotions, the computational task of successfully recognising spoken words is considerable, and has been the subject of many years of continuing research work around the world. A variety of different approaches are used, dynamic algorithms, neural networks, and knowledge bases, with the most widely used underlying technology being the Hidden Markov Model. These techniques all attempt to search for the most likely word sequence given the fact that the acoustic signal will also contain a lot of background noise."
>>> Speech, Natural Language Processing, Uncertainty & Probability, Machine Learning, Applications, Telecommunications, Science Fiction
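
Editor's illustration for the item above: the excerpt names the Hidden Markov Model and the "search for the most likely word sequence." One standard way to run that search is the Viterbi algorithm, sketched here on a toy model. The two hidden words, the two coarse sound labels, and every probability are invented for illustration. Real recognizers work with far larger models over phonemes and acoustic features.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely hidden-state path, probability of that path)."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s][first], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor that maximises the path probability into s.
            p, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (p * emit_p[s][obs], back)
        best.append(column)

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Toy model (all values hypothetical): hidden words emit noisy sound labels.
states = ("yes", "no")
start_p = {"yes": 0.5, "no": 0.5}
trans_p = {"yes": {"yes": 0.7, "no": 0.3}, "no": {"yes": 0.4, "no": 0.6}}
emit_p = {"yes": {"s": 0.8, "o": 0.2}, "no": {"s": 0.1, "o": 0.9}}

path, prob = viterbi(["s", "s", "o"], states, start_p, trans_p, emit_p)
print(path, prob)
```

For the three sound labels given, the decoder settles on the word sequence yes, yes, no. The noise mentioned in the excerpt enters through the emission probabilities: a word can emit the "wrong" sound label with small probability, and the search weighs that against how plausible each word-to-word transition is.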

June 2, 2006: Baby Talk and Monkey Talk. By Victor D. Chase. ScienceCareers.org. "How does an infant learn language? Once the basics of a language are learned, there are rules that someone familiar with language instinctively applies, but a baby coming to language with no previous knowledge cannot apply such rules. So how does it learn? 'It’s a big question,' says Jessica Maye, an assistant professor of linguistics at Northwestern University in Evanston, Illinois, 'because it takes money to make money. If you know something about language, you can use that to learn something else about language, but how do you learn the very first thing?'... Early in her graduate school career at the University of Arizona, Tucson, Maye decided she wanted to focus on psycholinguistics, a relatively new branch of linguistics that draws on cognitive sciences, including psychology, computer science, artificial intelligence, speech and hearing, and neural imaging to explain how humans learn language. After receiving her Ph.D. in 2000, Maye spent 3 years as a postdoctoral fellow in brain and cognitive sciences at the University of Rochester in New York, where she began conducting experiments on how babies learn. ... The goal of the NSF-funded project is to determine why infants are so much more proficient than adults and monkeys. 'If we know more about how infants use the statistical information they have access to, we can incorporate that information into machine processing of language to better enable speech recognition to correctly process language as humans do,' says Matt Goldrick, an assistant professor in Northwestern’s Linguistics Department (which is separate from the Department of Communication Sciences and Disorders where Maye works). ... Apart from the implications about language therapy, there's another practical side to Maye's research. 
Computer scientists aiming to improve computer speech recognition need to enhance the ability of machines to achieve 'source separation,' the ability to distinguish and make sense of several voices talking at the same time even in the presence of background noise (although a phenomenon Maye has not researched herself). ... Another thing humans can do very well but machines cannot is find the edges of words."
>>> Cognitive Science, Natural Language Processing, Speech, Careers in AI (@ Resources for Students)

May 29, 2006: Military getting high-tech help from SRI lab - New system can recognize words, understand simple foreign phrases. By Tom Abate. San Francisco Chronicle & SFGate.com. "During a recent product demonstration at SRI headquarters in Menlo Park, computer scientist Harry Bratt spoke into the microphone of his lab's new translation computer: 'Did you hear the explosion this morning?' Several seconds later, software written by SRI International scientists piped the question through the computer's speaker -- this time in the Iraqi dialect of Arabic. Saad Alabbodi, an Iraqi immigrant posing as a civilian being questioned by a U.S. soldier, answered in his native tongue. There was another pause as the computer translated Alabbodi's reply into English in a mock interrogation that provided another example of how technology is slowly mimicking complex human capabilities such as speech. [Go to the related podcast to hear the actual conversation.] ... 'One of the crying needs in Iraq is overcoming the language barrier,' said Kristin Precoda, director of the SRI lab that developed the two-way translation system called IraqComm."
>>> Machine Translation, Military, Natural Language Processing, Speech, Applications

May 22, 2006: Hearing aids, your latest accessory. By Mark Baard. The Boston Globe & boston.com. "The Delta [hearing aid] uses artificial intelligence to help clarify high-pitched sounds in noisy situations. Active boomers are going to need that kind of help to keep up with the pack in work and social situations, said Don Shum, vice president of Audiology & Professional Relations at Oticon."
>>> Speech, Assistive Technologies, Applications

May 10, 2006: Website helps users develop reading skills. USAToday.com. "IBM will give schools, libraries and community centers free access to a new website that allows young children and adults with limited English to practice reading aloud, the company announced Wednesday. The website uses newly developed speech-recognition software that 'listens' to readers and helps correct errors. The Reading Companion program...."
>>> Speech, Intelligent Tutoring Systems, Education, Applications

May 4, 2006: Open the pod bay doors, HAL. New Scientist (Issue 2550; page 27). "HAL 9000, the chatty computer from 2001: A Space Odyssey, has come a step closer to reality. A team crewing NASA's Mars Desert Research Station, a simulated planetary environment in the Utah desert, has been experimenting this week with software that can talk to the crew about the status of their spacecraft's systems. Using wireless headsets, crew members ask the computer questions ...."
>>> Natural Language Processing, Speech, Interfaces, Science Fiction, Space Exploration

April 26, 2006: Software lets programmers code hands-free. By Duncan Graham-Rowe. NewScientist.com News. "Standard speech recognition software can be used to control a computer but is usually of little help to programmers, says Alain Désilets of the National Research Council of Canada in Ottawa, who created VoiceCode. This is because each symbol and function and every syntactic peculiarity must be carefully spelled out. VoiceCode lets a programmer dictate code in a more natural way, Désilets says, rapidly translating their utterances into awkward programming syntax."
>>> Speech, Interfaces, Automatic Programming

April 24, 2006: Model hearing. The Engineer Online. "Robots may one day be equipped with the advanced listening skills of human beings if a team of UK researchers succeeds in its attempt to develop a complex computer model of the part of the brain that processes sound. Dr Adrian Rees, who is leading the project at Newcastle University's department of Neurology, Neurobiology and Psychiatry, told The Engineer that his group is working on the development of a computer model of the auditory midbrain - or inferior colliculus. ... Rees said the group plans to develop a biologically realistic computerised model of this auditory pathway, then adapt the system to control an experimental robot that will be able to respond to different sound stimuli in a noisy environment. ... It could, for instance, be used to enable voice control of machines in noisy conditions, or be used at the heart of a new generation of sophisticated hearing aids that more accurately replicate the way in which humans process sound."
>>> Speech, Robots, Assistive Technologies

April 24, 2006: Soon that cell will be all ears - Voice-recognition technology will cut out the wait time. By Ryan Kim. San Francisco Chronicle & SFGate.com. "A handful of companies is creating solutions that allow users to find what they're looking for by utilizing an already familiar cell phone behavior: speaking into the phone. Companies are creating voice recognition applications that allow users to say a search term into the phone and be rewarded in seconds with results. ... The advances mean voice-search services can offer results in a couple of seconds with more than 90 percent accuracy. It saves people the time of scrolling through screen after screen of results. For many carriers, this technological advance could unlock revenue that is currently bottled up by the byzantine search format of cell phones. ... A study by ChangingWorlds, an Irish provider of artificial intelligence products, and Mobile Metrix, a Swedish research firm, found last year that almost two-thirds of mobile content was more than 12 clicks away. ... ... [R]ecently, voice-recognition companies have made strides in expanding the available search terms, speeding up the process and adding intuitive and predictive functions that allow computer servers to better understand what consumers are asking for. 'Voice isn't just for speaking. It's now an interface, and it's driving commands,' [Craig Hagopian said.]"
>>> Telecommunications, Speech, Interfaces, Applications

April 13, 2006: 'Hello computer, I wish to check my account' - Call centres of the future may no longer be based in other countries, but staffed by an artificial intelligence instead. By Alan Bunce. icBerkshire. "You may be used to your conversations with call centres being recorded for training purposes but soon a computer may be monitoring what you say as well. Reading businessman Stan Bembenek's firm, Electronics 2000 (E2000), is at the forefront of developing speech recognition technology that can hold conversations with you, monitor speech for key words and even recognise who is speaking. ... David Connolly, who oversees E2000's marketing, said: 'If you are a financial institution dealing with 2,000 calls a day and you want to listen for anybody who is angry or irrational and they get abusive down the telephone, you can redirect the call if that is the issue. The other side of it is you might be sat in a centre and a caller suddenly says, "I am going on holiday next week". For financial institutions which want to sell insurance, a box could pop up on the worker's screen which says "sell him travel insurance". A person may forget to ask.' A crucial advance is that for the first time monitoring is done in real time, rather than afterwards. ... But E2000 must encourage a cultural shift among users. Just as people took time to accept voice-mail, many will be reluctant to talk to machines."
>>> Marketing, Customer Relations/Service & E-Commerce, Speech, Natural Language Processing, Applications

April 12, 2006: Google patent points to voice search. By Candace Lombardi. CNET News.com. "A recently published patent provides further evidence that Google is developing a voice-activated search engine. Patent No. 7027987, of which Google is the assignee, concerns 'a voice interface for search engines. Through the use of a language model, phonetic dictionary and acoustic models, a server generates an n-best hypothesis list or word graph.' ... The obvious eventual application is for cell phones. ... Google isn't the only one to do work in this area. Another voice interface was released in May of 2005 by a team led by Dr. Meirav Taieb-Maimon of the Department of Information Systems at Ben-Gurion University of the Negev in Israel."
>>> Information Retrieval, Telecommunications, Speech, Natural Language Processing, Applications

April 11, 2006: Troops Learn to Not Offend. By Gretchen Cuda. Wired News. "A seemingly harmless gesture could get a soldier in hot water, especially in a war-torn country. Body language that's meaningless in the United States -- such as showing the soles of one's feet -- is offensive in Iraq. So the American military is adopting a new video game created to help soldiers navigate the mysterious world of international nonverbal language. Developed by the University of Southern California's Information Sciences Institute, the Tactical Language Training Program is different from interactive language programs of the past, which focus solely on spoken language. In Tactical Iraqi, players navigate a set of real-life scenarios by learning a set of Arabic phrases, culturally relevant gestures and taboos. Other titles include Tactical Levantine and Tactical Pashto. Following each lesson, the player is asked to interact with other characters using speech and gestures, while a speech-recognition system records and evaluates the responses. Accurate responses allow the soldier to build a rapport with other characters and advance to the next level."
>>> Intelligent Tutoring Systems, Military, Video Games, Education, Speech, Applications

March 14, 2006: Speech Recognition Comes of Age. By Elizabeth Millard. NewsFactor Network's NewsFactor Magazine Online. "Although speech recognition by computers has been around for decades, to most people it remains the stuff of science fiction. Think of HAL, the all-controlling yet conversant machine from 2001: A Space Odyssey. In the real world, speech-recognition technology has not become so ubiquitous as to replace the keyboard completely. But that day might come sooner than you think. The past five years have seen great strides in the field. Software developers are looking more and more at the potential of speech recognition in applications related to entertainment, health, business, and security. Already, speech recognition is making significant inroads in places like the court system, where better technology has led to more-accurate transcriptions. One up-and-coming trend is speech analysis that can relate to emotion and articulation. Programs are being designed to tweeze out job interviewees and romantic contenders on the basis of how they talk rather than on what they type. ... The combination of improved accuracy and natural-language processing makes it possible to design applications that can dig deep into other databases and retrieve information faster than a human might. 'With this type of processing, the system can understand that when you string words together, there's meaning beyond just the individual words,' said Tim Kraskey, vice president of Spanlink Communications, a company that develops customer-interaction tools for call centers. 'But beyond that, it can use that meaning and route the caller or the agent to relevant information.'"
>>> Speech, Natural Language Processing, Customer Service, Business, Information Retrieval, Applications

March 12, 2006: Q&A: Nuance’s Paul Ricci - Head of speech technology leader discusses improvements in the voice business and his acquisition strategy. Red Herring. "Q: What kinds of progress have you seen in speech technology? A: Speech technology has improved in several ways over the last few years. The general accuracy rate has improved. That’s the single most important variable. ... The capacity of the systems to understand more free-form language and queries has also improved, which we refer to as natural language processing. In call centers, people can’t be expected to ask questions in a structured way. Enterprises deploying call centers for speech find it more advantageous. The third area is dialogue management.... Q: Is the industry moving more toward natural language processing? A: Our leading customers in enterprise solutions are deploying these natural language tools today. There is going to be a steady trajectory in the enterprise. ... Q: Do you see uses for defense, such as the stories we hear about using speech recognition during wiretaps to mine for data? ... Q: Has the use of speech technology in automobiles advanced much? It seemed like there was a lot of hype a few years ago about telematics, but many companies in that area seem to have disappeared. ... "
>>> Speech, Natural Language Processing, Customer Service, Business, Law Enforcement, Transportation, Applications, Interviews

March 6, 2006: Advancements in hearing devices highlighted. By Jennifer Walker. Maple Creek News. "Karla Rissling of the Medicine Hat Hearing Centre was the guest speaker at the Canadian Hard of Hearing Association Meeting, Maple Creek Branch on February 25th. She is Board Certified in Hearing Instrument Sciences and is a Registered Hearing Aid Practitioner. Rissling spoke about the new features of advanced digital hearing instruments. The new artificial intelligence in hearing devices provides the wearer with increased speech enhancement abilities. The speech enhancement features help to suppress background noise. The artificial intelligence also optimizes speech for specific preferences according to personality of the user."
>>> Speech, Assistive Technologies, Applications

February 17, 2006: Learn Arab Language, Culture. By Stefan Lovgren. National Geographic News. "Researchers have developed an interactive computer system that uses artificial intelligence and gaming techniques to teach Arabic to U.S. soldiers. Soldier-students equipped with microphones navigate through an Arabic-speaking environment on a computer screen. If they successfully phrase questions and understand the answers, they can move on to the next level of the game. ... The characters that users face in the game, meanwhile, are animated by artificial intelligence. ... 'Language without any context is hard to learn,' said [Högni] Vilhjálmsson, a research scientist with the Information Sciences Institute at the University of Southern California in Los Angeles. 'But if you put it into the context of face-to-face communication, allowing for gestures and other non-verbal behavior, it becomes easier [to learn].' ... In each mission, students must complete an overall task focused on civil affairs and reconstruction efforts. In the Afghan version, the task is to rebuild a clinic in a remote village. ... A speech recognition system allows the speaker to communicate with characters on the screen. ... Users can gauge how well they are doing by the reactions of the other characters."

  • Also see:
    • February 19, 2006: US troops taught Iraqi gestures. By Paul Rincon. BBC News. "The US military has funded a computer game to teach its troops how to use and decipher Iraqi body language. The purpose is to teach soldiers that using the wrong gestures can potentially cause offence and escalate already tense situations. ... Tactical Iraqi is built on top of the game engine for Unreal Tournament, a first-person computer 'shoot-em-up'. In the training tool, though, subjects use communication to resolve situations."
    • February 18, 2006: Gestures can help soldiers, students alike. By Andrew Bridges. Associated Press / available from MercuryNews.com / also available from BostonHerald.com. "The study of gestures is relatively old, said Justine Cassell, a professor of media technology and society at Northwestern University. What is new is the acceptance that gestures themselves can tell something about how humans think. In Iraq, Afghanistan and elsewhere in the Middle East, that knowledge has found its way into a video game and training program that the Pentagon is using to give soldiers a crash course in how to speak and gesture like locals. ... The simulators are equal parts tutorial and video game. They place soldiers in simulated, three-dimensional Middle Eastern environments and expose them to a variety of social situations. The soldiers can interact with local residents, including curious boys, shy women and suspicious men, after learning basic language and gesturing skills. The 'natives' are all independently endowed with artificial intelligence and react according to how well or poorly a soldier handles a situation."

>>> Intelligent Tutoring Systems, Video Games, Military, Education, Speech, Applications

February 5, 2006: Surveillance Net Yields Few Suspects - NSA's Hunt for Terrorists Scrutinizes Thousands of Americans, but Most Are Later Cleared. By Barton Gellman, Dafna Linzer and Carol D. Leonnig. The Washington Post (page A01) & washingtonpost.com. "Surveillance takes place in several stages, officials said, the earliest by machine. Computer-controlled systems collect and sift basic information about hundreds of thousands of faxes, e-mails and telephone calls into and out of the United States before selecting the ones for scrutiny by human eyes and ears. Successive stages of filtering grow more intrusive as artificial intelligence systems rank voice and data traffic in order of likeliest interest to human analysts. ... Supporters speaking unofficially said the program is designed to warn of unexpected threats, and they argued that success cannot be measured by the number of suspects it confirms. ... Contributors to the technology said it is a triumph for artificial intelligence if a fraction of 1 percent of the computer-flagged conversations guide human analysts to meaningful leads. ... Even with 38,000 employees, the NSA is incapable of translating, transcribing and analyzing more than a fraction of the conversations it intercepts. For years, including in public testimony by Hayden, the agency has acknowledged use of automated equipment to analyze the contents and guide analysts to the most important ones. ... An alternative approach, in which a knowledgeable source said the NSA's work parallels academic and commercial counterparts, relies on 'decomposing an audio signal' to find qualities useful to pattern analysis. Among the fields involved are acoustic engineering, behavioral psychology and computational linguistics. A published report for the Defense Advanced Research Projects Agency said machines can easily determine the sex, approximate age and social class of a speaker. 
They are also learning to look for clues to deceptive intent in the words and 'paralinguistic' features of a conversation, such as pitch, tone, cadence and latency."
>>> Law Enforcement, Pattern Recognition, Data Mining, Machine Learning, Natural Language Processing, Speech, Ethical & Social Implications, Applications

January 23, 2006: Talk to the car with new tech. By Stefanie Olsen, with Michael Kanellos contributing. CNET News.com. "[I]t may be a few years before mass-production vehicles synchronize electronic devices for voice control, but momentum is building for features that let people ask for driving directions or call a friend without using their hands. Last week, for example, Toyota partnered with a relatively unknown voice-search specialist, called VoiceBox, in Bellevue, Wash. In development for roughly three years, VoiceBox's technology differs from established voice tech on the market because it allows people to speak conversationally to operate car electronics, rather than having them memorize and deliberately sound out commands. ... [VoiceBox's founder, Bob Kennewick] recognized a fundamental problem with existing voice recognition, which required programmers to set up specific dictionaries for a given set of data, and then match speech to text. But users had to say the right words to make it work. Background noises could also muddle the translation. His vision was to develop technology that could recognize the context of speech, picking up the right cues in a conversation to answer like a human would. ... 'It looks for clues in what you're saying and what you've said before to infer what you want, just like a human would,' said Mike Kennewick, the founder's brother and current CEO of VoiceBox. ... Other big players in advanced search recognition, or voice recognition, include IBM, Microsoft and ScanSoft. Microsoft, for example, sells voice-recognition technology with its operating system for automobiles, but the system responds to commands rather than to contextual speech."
>>> Discourse Understanding, Natural Language Processing, Speech, Transportation, Applications

December 12, 2005: High-Tech for Seniors Moves Beyond Clapper. By Randolph E. Schmid. The Associated Press / available from the Chicago Tribune. "New technologies for seniors, supplementing conveniences like The Clapper and emergency warnings like Life Alert, are on display this week at the White House Conference on Aging. The goal is to provide technologies that 'help seniors and their families live happy and healthy in their own home,' said Eric Dishman, chairman of the Center for Aging Services Technologies, or CAST, and general manager and global director of Intel Health Research and Innovation Group. ... Health Watch has a medicine cabinet that can be programmed to keep track of what medicine it holds and when it should be taken. A built-on camera scans the face of the person at the cabinet and a voice can remind that it's time to take a pill. If the wrong bottle is chosen, the voice warns of the error."

  • Also see:
    • Tech companies gear up for seniors. By Michael Kanellos. CNET News.com (December 12, 2005).
    • Gadgets for when we get old. Kevin Maney's blog at USAToday.com (December 12, 2005). "So I spent the morning at the White House Conference on Aging, mostly staying away from the policy wonks and instead roaming the technology exhibit. I got to see what my life might be like circa 2045.... Star of the show floor was Pearl the nursebot.... Best of all, I met up with James Allen of the University of Rochester, father of Chester the Talking Pill. As Allen explains, when you’re taking 12 different pills, it’s hard to keep track of interactions, side effects, dietary restrictions and when you can have a martini without making your insides melt. Chester is the front end of a database of your prescriptions. Have a question -- just ask Chester. Incidentally, Chester is built on basic speech-recognition research the university is doing under a Department of Defense contract."
    • Who cares for the elderly? Commentary by Arlie Hochschild. Los Angeles Times (December 10, 2005).

>>> Assistive Technologies, Robots, Vision, Speech, Systems, Applications, Events (@ Resources for Students); also see this related article

November 30, 2005: Interview with Dr. Timothy Tuttle, CEO of Video Search Company, Truveo. By Tracy Swedlow. Interactive TV Today [itvt] Bloggit. "Tuttle recently spoke to [itvt]'s Tracy Swedlow about why Truveo believes its technology's ability to crawl dynamic Web sites gives it a crucial advantage in the video search space, about the company's business model, about how its technology attempts to 'look' at Web sites in the same way that a person would, and more. ... Tuttle: ... You see, the big problem with video search--and everyone in the industry knows this--is that while search works great for Web pages, it's a much harder problem to find and index videos on the Web. The reason it's so hard is that it's very difficult for the typical crawling technologies to even see the video on the Web. If they go to a Web site and try to find the video, it's very hard for them to do that. ... [itvt]: How does your technology attempt to 'see' the visual characteristics of a Web page? Tuttle: What we try to look at is the rendered and instantiated version of a functioning Web application. Think of it as similar to looking at the screen buffer. We're looking at a screen shot, as it were, of the rendered page, in order to see if there's a section of that page that may have video playing. And also to look around it, to see if there's any other information that's displayed that relates to that video. ... [itvt]: Is Truveo interested in artificial intelligence technologies that would allow searches of visual content directly--i.e. searches of images themselves? Tuttle: We have a bunch of Ph.D's here who've spent a lot of time either working in research labs or universities on technologies for things like video metadata extraction. There are lots of techniques that are being researched right now. Frankly, people have been working on things like image analysis, object recognition, and scene detection for the past 15 years. 
I definitely think there is hope that those technologies might be useful in the future for doing automated analysis of images, and then--potentially--video. ... One of the techniques that a lot of the search companies are focused on right now--including us--is using technologies like voice recognition, in order to do a better job of extracting metadata from a video file."
>>> Information Retrieval, Web-Searching Agents, Image Understanding, Speech, Natural Language Processing, Applications, Agents, Vision, Machine Learning, Interviews

November 21, 2005: Computer R&D rocks on - Recomputing the Future (first of three parts). By Rick Merritt. EE Times. "Think computers have become a commodity, like pork bellies, and computer science an old set of solved problems? Think again. The computer research agenda is as big as ever before, if not bigger. Experts see important breakthroughs and whole new fields of investigation just opening up. Advances will come in natural-language searches, machine learning, computer vision and speech-to-text, as well as new computing architectures to handle those hefty tasks. Beyond the decade mark, Edward D. Lazowska, a professor of computer science at the University of Washington, expects computers based on quantum physics."
>>> Systems, Machine Learning, Natural Language Processing, Speech, Vision, Applications

October 25, 2005: Futurists Pick Top Tech Trends. By Joanna Glasner. Wired News. "[L]et's take a look at the positive trends futurists see on the horizon. ... Speech-recognition technology will be instrumental in enabling new mobile services, said Ronald Gruia, author of the blog Technology Futurist and emerging communications program leader at consulting firm Frost & Sullivan. In recent years, speech software developers, in particular Nuance Communications, which until recently went by the name ScanSoft (SSFT), have gotten much better at what they do. Gruia believes it's only a matter of time before speech-enabled mobile apps for tasks like composing e-mail while driving can be commonplace."
>>> Speech, Natural Language Processing, Telecommunications, Applications, The Future

October 5, 2005: Devices help the blind cross tech divide. By Michael Singer. CNET News.com. "Jerry Swerdlick runs a 15-employee company that resells computers and devices that aid people with visual, hearing, learning and other physical disabilities. Business is really booming these days, Swerdlick said, as more and more manufacturers are building so-called assistive technology gadgets to address a wide range of special needs groups. ... Swerdlick's EVAS is part of a $5.4 billion assistive technology industry, according to the Smithsonian Institution. That's nearly double market estimates six years ago. The market itself is broad. Some of the devices that are becoming increasingly common include Braille-based handheld devices with text-to-speech technology, tactile keyboards with oversize characters, and pointing devices that control PCs with a movement of an eyebrow. An aging population in industrialized countries combined with a government effort to satisfy more special needs groups is lighting a fire under this industry, which adds 10 to 20 new companies every year, Assistive Technology Industry Association (ATIA) executive director David Dikter said. ... Microsoft, for one, has been taking a hard look at the issue. ... Apple Computer, Adobe and IBM have been working on speech recognition and screen enlargement software for their various applications. ... Smaller companies such as Freedom Scientific, HumanWare, AgentSheets, WizCom Technologies, Digital Lifestyle Outfitters and DynaVox are also among the hundreds of assistive technology companies that the ATIA endorses. ... Some recent product examples include: ... "
>>> Assistive Technologies, Systems, Natural Language Processing, Interfaces, Image Understanding, Speech, Industry Statistics, Applications

October 3, 2005: Speech-Recognition Technology Advances (radio broadcast). Reported by Lisa Chow for NPR's Morning Edition. "Speech-recognition technology has been around since the 1960s, when computer scientists were trying to mimic the complexities of human speech. Now, the technology has become an everyday feature as people talk to computers in the office, cars and more." [audio available]
>>> Speech, Natural Language Processing, Applications

September 26, 2005: They made the Internet. Now they want to make money. BBN turns its focus to new technologies for use in wartime. By Robert Weisman. The Boston Globe. "With new ownership and a new strategy, BBN [formerly known as Bolt, Beranek & Newman] wants to profit off the new century's next big thing: the war on terror. The company hopes to benefit financially from the speech recognition, network security, and wireless mobile technologies it is pioneering for use in Iraq and elsewhere. ... The speech processing research it began in the mid-1970s is one key to the future of BBN and its 650 employees. Next month the company expects to land a contract from the Pentagon's research arm, the Defense Advanced Research Projects Agency, to lead an effort to develop technology that immediately translates spoken languages, such as Arabic or Mandarin Chinese, into searchable English text. That contract will be for a Darpa program known as GALE, for global autonomous language exploitation. Valued at more than $15 million for its first year, it will be one of BBN's largest contracts ever."
>>> Natural Language Processing, Machine Translation, Speech, Military, Applications

September 9, 2005: Duke phones installed with voice recognition. By Shivam Joshi. The Chronicle Online. "Researchers have developed a voice-recognition system that allows a caller to be forwarded to the person they are trying to reach -- even if they don't have the person’s phone number. ... The system uses error-correction technology developed by Professor of Computer Science Alan Biermann, research associate Ashley McKenzie and computer science graduate student Bryce Inouye. Typical voice recognition systems work accurately two-thirds of the time. But when the software is combined with error-correction technology, the number of misplaced calls and unrecognized names drops to less than 10 percent."
>>> Speech, Natural Language Processing, Telecommunications, Applications

August 25, 2005: A Doll That Can Recognize Voices, Identify Objects and Show Emotion. By Michel Marriott. The New York Times (registration req'd.). "Judy Shackelford, who has been in the toy industry for more than 40 years, has seen a lot of dolls. But none, she says, like her latest creation, a marvel of digital technologies, including speech-recognition and memory chips, radio frequency tags and scanners, and facial robotics. She and her team have christened it Amazing Amanda. ... 'The speech-recognition chip running in Amazing Amanda acts not only as speech recognition, but also allows her to talk,' said Todd Mozer, chief executive of Sensory, a speech-technology company in Santa Clara, Calif., that developed the chip used in the doll. He noted that the technology could interpret a range of languages and dialects."
>>> Toys, Speech, Applications, Robots, Emotion

August 17, 2005: Lending an ear to new technology. By Barry Ellsworth. Belleville Intelligencer. "[Hearing loss] is the third most prevalent chronic disease among older adults, according to statistics from the Canadian Association of Speech-Language Pathologists and Audiologists. The association said that 20 per cent of people aged 65 have hearing loss, 40 per cent of those aged 75 and 80 per cent of those who live in nursing homes. ... A new $3,000 hearing aid has artificial intelligence and a brace of microphones that open and close, zeroing in on where you look and screening out other sounds, said a representative for Tim Davidson Hearing Services of Belleville. 'They have done some advancement to try to decrease that background noise,' [Alana] DeVille said."
>>> Assistive Technologies, Speech, Applications

August 2, 2005: Lucas - Future in Asia, vid games. By Sheigh Crabtree. The HollywoodReporter.com. "Think George Lucas is going to take it easy now that he's completed his final 'Star Wars' trilogy? Think again. ... 'I put all of my resources into pushing the evolution in an industry that is notoriously backwards and I enjoy pushing that envelope,' Lucas told the thousands of Siggraph attendees who squeezed into a hall at the Los Angeles Convention Center for Lucas' hourlong Q&A session. ... Lucas said the next breakthrough in gaming is artificial intelligence and voice recognition. 'I want to get to a point where you can talk to the game and it will talk back,' Lucas said. 'I'm really pushing for advances in artificial intelligence and intelligent voice recognition technology. I think that will change games from first person shooters narratives to intelligent and challenging first-person shooter type dramas.'"
>>> Video Games, Natural Language Processing, Speech, Interfaces, Applications

July 13, 2005: Artificial intelligence has invaded the medical world, serving in roles from scrub nurse to doctor stand-in. By Delthia Ricks. SunSentinel.com. "Meet Penelope, the new scrub nurse at New York Presbyterian Hospital in Manhattan: a robot with a job that in many hospitals is held by humans with college degrees. Penelope is not just any old robot, but one blessed with artificial intelligence, an ability to 'see' and the capacity to 'hear.' Her human colleagues call her a star employee. 'She's not here to replace a nurse. She's here to free up a nurse, to let nurses spend more time with the patients,' said registered nurse Doreen Taliaferro, who herself has worked in the scrub role at New York Presbyterian. Taliaferro said she is not the least bit threatened by Penelope's presence. 'There is far more important work for human nurses to perform,' she said. ... The robot's artificial intelligence enables it to be smarter than the average computer. ... Through highly sophisticated programming, Penelope is capable of reasoning and making choices. ... [Penelope's inventor, Dr. Michael] Treat confesses to a lifelong fascination with robots, dating back to the late 1950s when, as a child, he thought the small, battery-powered Robby the Robot was the coolest - although a tad unsophisticated and a bit clumsy. ... 'There are scads of innovative things going on in this field now. I can see robots that help kids learn in school. Robots as personal companions. More robots in hospitals. We're at the very beginning, and the future looks very bright.'"
>>> Medicine, Robots, Vision, Natural Language Processing, Speech, Reasoning, Neural Networks, Assistive Technologies, Ethical & Social Implications, Science Fiction, Applications

June 27, 2005: Space station gets HAL-like computer. By Maggie McKee. NewScientist.com news. "A voice-operated computer assistant is set to be used in space for the first time on Monday -- its operators hope it proves more reliable than 'HAL', the treacherous speaking computer in the movie 2001. Called Clarissa, the program will initially talk astronauts on the International Space Station through tests of onboard water supplies. But its developers hope it will eventually be used for all computer-related work on the station. ... Clarissa queries astronauts about the details of what they need to accomplish in a particular procedure, then reads through step-by-step instructions. Astronauts control the program using simple commands like 'next' or more complicated phrases, such as 'set challenge verify mode on steps three through fourteen'."
>>> Natural Language Processing, Speech, Space Exploration, Interfaces, Science Fiction, Applications; also see this related article

June 27, 2005: Emotional issue. The Engineer. "An emotion-sensitive computer system that can detect customers’ anger or frustration and react accordingly could be used in BT call centres. The UK telecoms giant is a partner in the EU-backed ERMIS project, which has created a prototype computer character able to understand and replicate a variety of human emotions. These range from straightforward anger through to more subtle states such as sadness or boredom. ... Martin Spott, BT’s principal researcher on the project, said: 'We are looking into using this technology in call centres to help in customer service by detecting the caller’s underlying emotions and reacting appropriately.' ... Spott believes emotion-recognition systems like ERMIS could be used to care for the elderly within an ambient technology framework. If the system detected that an elderly person had been unhappy for a long period, it would automatically alert a staff member."
>>> Customer Service, Assistive Technologies, Interfaces, Speech, Emotion, Applications

June 24, 2005: NASA, Xerox to Demonstrate 'Virtual Crew Assistant'. NASA press release / available from SpaceRef. "Intelligent conversation with robots - long the bread and butter of science fiction authors - soon may take another step closer to reality for astronauts on the International Space Station. Scientists from NASA Ames Research Center in California's Silicon Valley and Xerox Corporation (NYSE:XRX - News) will demonstrate a sophisticated, voice-operated computer system on June 26 at the Association for Computational Linguists' 25th annual meeting at the University of Michigan, Ann Arbor. Called Clarissa, the system was developed in an effort to ease astronaut workload. 'Clarissa is a fully voice-operated 'virtual crew assistant,' enabling astronauts to be more efficient with their hands and eyes and to give full attention to the task while they navigate through the procedure using spoken commands,' said Beth Ann Hockey, project lead on the team that developed Clarissa at NASA Ames. ... Clarissa is 'hands-free' and responds to astronauts' voice commands, reading procedure steps out loud as they work, helping keep track of which steps have been completed, and supporting flexible voice-activated alarms and timers. ... In 2004, Clarissa lead implementer Manny Rayner of NASA Ames contacted Xerox researcher Jean-Michel Renders of Xerox Research Centre Europe in Grenoble, France, about a possible collaboration. They hoped that Xerox's experience in machine learning, linguistics and text categorization would increase the system's accuracy on the 'open microphone' task. ... The technology developed by Renders to address the NASA speech-recognition problem is also being used at Xerox to improve categorization results for printed or digital documents."
>>> Discourse Analysis, Natural Language Processing, Speech, Space Exploration, Machine Learning, Interfaces, Information Retrieval, Conferences (@ Resources for Students), Applications

June 22, 2005: Police officers riding high-tech at expo in A.C. By Julia Glick. PressofAtlanticCity.com. "Police officers test-drove a stealthy chariot that can chase down suspects without a drop of gas and gave voice commands to a patrol car with artificial intelligence at the Police Security Expo on Tuesday at the Convention Center. ... The [Info-Cop] company's Project 54 makes police cars a little bit more like KITT, the talking, thinking car from the '80s TV show 'Knight Rider.' Instead of fumbling with sirens, lights and other controls or combing hundreds of radio frequencies, the officer can keep his eyes on the road and simply tell the car what to do, said William Lenharth, the project director and a professor at the University of New Hampshire. The system prevents accidents and helps police pursue criminals, he said."
>>> Law Enforcement, Speech, Natural Language Processing, Transportation, Applications

June 21, 2005: Speech recognition by humanoid robot in real environment. News-Medical.Net. "Japan's National Institute of Advanced Industrial Science and Technology (AIST), an independent administrative institution, has developed a speech recognition function in real environment using an array of microphones, successfully extending the sensing capability of humanoid robot under the Humanoid Robotic Project HRP-2 'Prométhée'. ... Stable speech recognition is obtained by combining information derived from the microphone array and the camera and by isolating and eliminating noises. ... In the living environment, where practical use of next generation robots is expected, direct human-robot interaction through voice channel is growing to one of key perceptive functions of robot. ... The present study has made it possible to install a voice interface on a humanoid robot operable in the environment involving a lot of sound sources."
>>> Speech, Robots, Vision, Applications

June 16, 2005: Meet Penelope - The Robot Nurse. By Dr. Jay Adlersberg. WABC / 7Online.com. "Seven's On Call with high-tech surgery, specifically a robotic assistant that can recognize surgical instruments and hand them to the surgeons. How does it work? ... Robots in the operating room are not new, some now assist surgeries. But Penelope's specific role will be handing over and keeping track of the surgical instruments. ... Dr. Amory performed the simple surgery this morning with Penelope as part of the surgical team. The surgery removed a benign cyst from the arm of Iris Lopez, a patient at the hospital. Penelope's software recognizes the voice command. She can scan the instruments and retrieve the correct one."
>>> Robots, Natural Language Processing, Speech, Vision, Medicine, Applications

June 14, 2005: Cyber servant. The Engineer. "The notion of a computer with human attributes has long been standard sci-fi fodder. But now engineers from Philips in Germany aim to bring the concept into the home courtesy of the Smart Companion: a nodding, talking and listening robot designed to act as a friendly link between the human and the digital world. Hans Driessen, spokesman for Philips Home Dialogue Systems (HDS), which developed the Smart Companion, said the device combines advances in robotics and image processing, as well as face, gesture and speech recognition, to provide an unthreatening and intuitive interface with the digital devices in a user’s home."
>>> Robots, Interfaces, Speech, Vision, Natural Language Processing, Assistive Technologies, Smart Houses, Applications

June 10, 2005: Rodents' Talk Isn't Just 'Cheep.' By Jeff Rice. Wired News. "Imagine a device that would let you 'talk' with your dog or cat. One that could help you ask a cow a question or converse with a dolphin. Two Arizona scientists say computers may someday bridge the language gap between humans and other animals. 'You could have this little thing hooked to your belt and you could speak and it could be translated into animal language,' says John Placer, a computer scientist at Northern Arizona University. It sounds like pure science fiction, and at this point it is. 'We're a long way off from that,' says Placer. But he and biologist Con Slobodchikoff, also of NAU, are working toward this very goal. The two are applying principles of artificial intelligence and fuzzy logic to animal language systems in the hope of cracking the code. ... Several years ago, Slobodchikoff began collaborating with Placer. The two began using customized speech-recognition software to slice up prairie dog sounds and to search for deeper, hidden linguistic patterns. ... So far, they have trained a computer to recognize three key prairie dog calls more than 90 percent of the time based on this technique."
>>> Speech, Machine Learning, Fuzzy Logic, Machine Translation, Natural Language Processing

June 5, 2005: Play It Again, Vladimir (via Computer). By Anne Midgette. The New York Times (registration req'd). "This is the new world of computer music. In its infancy, way back in the 1960's, the goal was to use digital technology to create new sounds and new musical forms. Today scientists around the world are turning computers on human performance, seeking to quantify an element once thought to be intangible: the expressivity of a human artist. ... The reactions demonstrate a basic difficulty with mechanical reproduction of music: there is a subjective element involved in determining if it works. The final criterion for any such reproduction is the rather imprecise 'Turing test' of artificial intelligence: that is, whether it can make the listener think he or she is hearing a person rather than a machine. At the Austrian Research Institute for Artificial Intelligence, a group of leading researchers known as the Machine Learning, Data Mining and Intelligent Music Processing Group are trying to pinpoint just what it is that fools the ear. Led by Gerhard Widmer, they are looking at everything from improving the way computers 'hear' music to isolating the elements of individual performance style, as well as creating graphs and animations to illustrate different pianists' interpretations of the same passage of music. In a 2003 paper, 'In Search of the Horowitz Factor,' Dr. Widmer and his team described giving the computer 13 recordings of Mozart piano sonatas, played into a Bösendorfer Disklavier by the pianist Roland Batik, to see if they could use the computer to determine rules that described the pianist's interpretive choices. ... [T]here's still the thorny matter of how to get data from an audio recording into the computer. It's a question not just of having the computer play back a CD, but of translating the music into a language the computer can understand. A computer, by itself, can't recognize the difference between a note of music and a cough."
>>> Music, Machine Learning, Representation, Speech, Turing Test, Applications

June 2005: Conversational Computers. By Andy Aaron, Ellen Eide and John F. Pitrelli. Scientific American (subscription req'd). "Call a large company these days, and you will probably start by having a conversation with a computer. Until recently, such automated telephone speech systems could string together only prerecorded phrases. ... Computer-generated speech has improved during the past decade, becoming significantly more intelligible and easier to listen to. But researchers now face a more formidable challenge: making synthesized speech closer to that of real humans--by giving it the ability to modulate tone and expression, for example--so that it can better communicate meaning. This elusive goal requires a deep understanding of the components of speech and of the subtle effects of a person's volume, pitch, timing and emphasis. That is the aim of our research group at IBM and those of other U.S. companies, such as AT&T, Nuance, Cepstral and ScanSoft, as well as investigators at institutions including Carnegie Mellon University, the University of California at Los Angeles, the Massachusetts Institute of Technology and the Oregon Graduate Institute."
>>> Speech, Natural Language Processing, Customer Service, Applications

May 28, 2005: Machines' way with words. By Stephen Evans. BBC News. "It has happened to most of us. The phone call you make to your bank is answered by a talking machine. It asks questions, you answer and then it asks more questions. Voice recognition systems are becoming more prevalent... and scarily efficient. ... Banks, phone companies, railways and all kinds of alleged helplines, are spending a lot of money trying to find out what kinds of voices they should give the machines that speak to us, the public, on their behalf. Much of the research is conducted in a small room - Room 325 in McClatchy Hall at Stanford University in California. It is the site of the dryly entitled but fascinating Laboratory for Communication Between Humans and Interactive Media, which is the domain of a genial, enthusiastic professor called Clifford Nass who studies how people and machines get on, particularly when the machines talk to the people."
>>> Speech, Interfaces, Customer Service, Applications

April 28, 2005: Robots in the OR -- Stat! Penelope the robot may free nurses to do more "human" tasks. By Josh Chamot. National Science Foundation Discoveries. "As the decade unfolds with its shortage of nurses, the sheer volume of patients each nurse must care for is leading to a critical burden for each of these professionals. While nurses will always be crucial to the care of patients, certain jobs may soon be accomplished by sophisticated robots. Surgeon Michael R. Treat and his team at Robotic Surgical Tech, Inc. have developed a robotic surgical assistant, named 'Penelope,' to perform tasks usually assigned to the scrub nurse. All of Penelope's talents are made possible by the innovative application of artificial intelligence to surgical situations." See the related video: Surgeon and "Penelope" in the operating room.
>>> Vision, Speech, Natural Language Processing, Robots, Medicine, Interfaces, Applications

April 9, 2005: Will machines ever understand us? By Justin Mullins. New Scientist (Issue 2494; subscription req'd.). "If you have ever called a directory enquiries or flight information service, the chances are that you have spent a few happy minutes speaking to a computer. And according to some business analysts, talking to a computer in this way will soon become an everyday experience, one that changes the way we live and work. This future relies on one enabling technology: voice-recognition software. Yet experts in the field can't decide quite how well things are progressing. Some say it will be 50 years before a computer can truly understand what we say; others believe you'll soon be chatting to your fridge. According to the New York-based business information company Datamonitor, the North American market for speech-recognition software will grow by more than 25 per cent each year between 2005 and 2008."
>>> Speech, Natural Language Processing, Customer Service, Interfaces, Industry Statistics, Applications; also see this related article from the same issue

April 9, 2005: Non-acoustic sensors detect speech without sound. By David Hambling. New Scientist (Issue 2494; page 21). "DARPA is also pursuing an approach first developed at NASA's Ames lab, which involves placing electrodes called electromyographic sensors on the neck, to detect changes in impedance during speech. A neural network processes the data and identifies the pattern of words. The sensor can even detect subvocal or silent speech. The speech pattern is sent to a computerised voice generator that recreates the speaker's words."
>>> Speech, Systems, Neural Networks, Pattern Recognition, Machine Learning, Applications; also see this related article from the same issue

April 5, 2005: The robot nurse. The Engineer Online. "Surgeon Michael R. Treat and his team at Robotic Surgical Tech have developed a robotic surgical assistant, named Penelope, to organise and manage all the surgical instruments used in an operating theatre -- essentially replacing all the jobs usually assigned to a scrub nurse.... Penelope uses voice recognition to respond to a surgeon’s request for an instrument, handing it to the surgeon with a robotic arm. Using a visual processing capability, Penelope also retrieves the instrument when it is no longer needed. For safety, the robot even assists in keeping track of the number of surgical instruments used, helping to ensure that none are accidentally left inside the patient! Penelope anticipates which instrument the surgeon will need next and selects that item from its tool kit, just as an experienced scrub nurse would. And like a nurse, Penelope can learn the instrument preferences of various surgeons."
>>> Medicine, Robots, Speech, Vision, Machine Learning, Applications

April 4, 2005: Talking Loudly to the TV Set, and Maybe Getting a Response. By Ken Belson. The New York Times (registration req'd.). "At least two companies, the OneVideo Technology Corporation and Agile TV, are developing speech-recognition products that will let viewers change channels with voice prompts like 'search' and 'find.' Though a voice-activated channel changer is not a new idea, the current incarnation has been prompted as much by advertising as by convenience. The device would make it easier for consumers to order a movie, a pizza or a car dealer's brochure, eliminating the need to dial toll-free numbers or to scroll through menus on the television."
>>> Speech, Applications

March 28, 2005: Gene Finding with Hidden Markov Models - The application of phylogeny to HMMs is improving gene annotation. By Karen Heyman. The Scientist (Volume 19, Issue 6). "HMMs are special instances of graphical models, which were originally developed by computer scientists studying machine learning and speech recognition. In technical parlance, says [Sean] Eddy, HMMs 'describe a probability distribution over an infinite number of sequences.' To the uninitiated, they resemble a cross between a flow chart and a doodle. In order to understand conceptually how HMMs work, consider their origin in speech recognition, says [David] Haussler. In that field, a computer is asked, given a speech wave, what are the phonemes (sounds) that it encodes. The wave is the measured signal; the phonemes are the 'hidden' signals that give the HMM its name. 'There is a probabilistic relationship between phonemes,' Haussler explains. 'After a 'th' sound can easily come an 'r' or an 'ah' or several other types of sounds, but not, for example, a 'k' sound. A hidden Markov model for speech incorporates all possible phonemes, and for each phoneme the probability that it's followed by any other phoneme.' Haussler says the HMM also 'models the stochastic relationship between each phoneme and the speech wave one might measure for it. In this way it can be used to infer the sequence of phonemes that best fits a given segment of recorded speech.' Translating that to molecular biology, he explains, the measured signal is the sequence of nucleotides, while the hidden signal is their function. 'Biology is trying to speak a language to us, and the HMM model is helping us to distinguish the phonemes of that language.'"
>>> Speech, Bioinformatics, Probability, Reasoning, Andrei Andreyevich Markov (@ Namesakes)
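Illustration for the entry above: Haussler's description of inferring "the sequence of phonemes that best fits" a recording corresponds to what is usually called Viterbi decoding. The sketch below is a minimal toy version, not code from the article. The three phoneme states, the coarse acoustic symbols ("hiss", "buzz", "click") and every probability are invented for illustration; the unlikely 'th' to 'k' transition is given a small nonzero value so that logarithms stay defined.

```python
import math

# Hypothetical toy model: three phoneme states and three coarse acoustic
# symbols. All names and probabilities are invented for illustration.
STATES = ["th", "r", "k"]
START = {"th": 0.6, "r": 0.2, "k": 0.2}
TRANS = {  # probability that one phoneme is followed by another
    "th": {"th": 0.10, "r": 0.85, "k": 0.05},
    "r": {"th": 0.30, "r": 0.30, "k": 0.40},
    "k": {"th": 0.30, "r": 0.50, "k": 0.20},
}
EMIT = {  # probability of measuring a signal given the hidden phoneme
    "th": {"hiss": 0.7, "buzz": 0.2, "click": 0.1},
    "r": {"hiss": 0.1, "buzz": 0.8, "click": 0.1},
    "k": {"hiss": 0.2, "buzz": 0.1, "click": 0.7},
}


def viterbi(observations):
    """Return the most probable hidden phoneme sequence for the observations."""
    # best[s] = (log-probability of the best path ending in s, that path)
    best = {
        s: (math.log(START[s]) + math.log(EMIT[s][observations[0]]), [s])
        for s in STATES
    }
    for obs in observations[1:]:
        step = {}
        for s in STATES:
            # Pick the predecessor state that gives the best path into s.
            score, path = max(
                (best[p][0] + math.log(TRANS[p][s]), best[p][1]) for p in STATES
            )
            step[s] = (score + math.log(EMIT[s][obs]), path + [s])
        best = step
    return max(best.values())[1]
```

Calling viterbi(["hiss", "buzz", "click"]) walks the measured signal once and keeps, for each phoneme, only the best path that ends there. In the gene-finding setting described in the article, the observations would be nucleotides and the hidden states their functional labels.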

March 25, 2005: New Support-Center Tool Detects Emotion In Voice Of Disgruntled Callers - Software automatically alerts supervisors when customers voice frustration about company's goods and services. By Eric Chabrow. InformationWeek. "Keeping customers happy is crucial for most businesses, and knowing when they're disgruntled is important to the Madison, Wis., health insurer. Last year, WPS [Wisconsin Physician Services Insurance Corp.] began using new software that provides this insight. The software is called Perform and was created by call-center software provider Nice Systems Ltd., an Israeli company, which began widely marketing the product this month. ... What's next for emotion detection software? Artificial intelligence. Instead of users defining keywords and emotions, the software itself will figure things out, such as by analyzing voice pitch levels, a key determinant in emotion detection. By analyzing pitch, as well as tone, tempo, and inflection, the software in the not-too-distant future could be used to detect fraud. It already can differentiate between real anger and someone mimicking anger."
>>> Customer Service, Fraud Detection & Prevention, Emotion, Speech, Applications

March 23 / 30, 2005: Common sense boosts speech software. By Eric Smalley. Technology Research News. "Speech recognition software matches strings of phonemes -- the sounds that make up words -- to words in a vocabulary database. The software finds close matches and presents the best one. The software does not understand word meaning, however. This makes it difficult to distinguish among words that sound the same or similar. The Open Mind Common Sense Project database contains more than 700,000 facts that MIT Media Lab researchers have been collecting from the public since the fall of 2000. These are based on common sense like the knowledge that a dog is a type of pet rather than the knowledge that a dog is a type of mammal. The researchers used the phrase database to reorder the close matches returned by speech recognition software."
>>> Speech, Commonsense, Natural Language Processing, Interfaces, Applications, Reasoning
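Illustration for the entry above: the idea of reordering a recognizer's close matches with common-sense knowledge can be sketched roughly as follows. This is not the MIT researchers' method. The fact base, the scoring rule and the `weight` parameter are all hypothetical stand-ins chosen to keep the example small.

```python
# Hypothetical stand-in for a common-sense fact base: pairs of concepts
# that people commonly associate. This is not the Open Mind data.
RELATED = {
    frozenset(("dog", "pet")),
    frozenset(("dog", "bark")),
    frozenset(("beach", "sand")),
    frozenset(("beach", "wave")),
}


def context_score(word, context):
    """Count the context words that share a fact with the candidate word."""
    return sum(frozenset((word, c)) in RELATED for c in context)


def rerank(candidates, context, weight=0.5):
    """Reorder (word, acoustic_score) pairs using support from the context.

    `weight` is an assumed tuning knob for how much context counts
    relative to the recognizer's own score.
    """
    return sorted(
        candidates,
        key=lambda pair: pair[1] + weight * context_score(pair[0], context),
        reverse=True,
    )
```

For example, given the sound-alike candidates [("peach", 0.50), ("beach", 0.48), ("beech", 0.45)] and the context words ["sand", "wave"], rerank moves "beach" to the front even though the recognizer alone preferred "peach".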

March 11, 2005: Humanoids With Attitude - Japan Embraces New Generation of Robots. By Anthony Faiola, with Akiko Yamamoto. Washington Post (registration req'd.) / also available from The Detroit News (Japan embraces new generation of robots; March 12, 2005) and from The Sydney Morning Herald (We, robot: the future is here; March 14, 2005). "'I almost feel like she's a real person,' said Kobayashi, an associate professor at the Tokyo University of Science and [Saya, the cyber-receptionist's] inventor. Having worked at the university for almost two years now, she's an old hand at her job. 'She has a temper . . . and she sometimes makes mistakes, especially when she has low energy,' the professor said. Saya's wrath is the latest sign of the rise of the robot. Analysts say Japan is leading the world in rolling out a new generation of consumer robots. Some scientists are calling the wave a technological force poised to change human lifestyles more radically than the advent of the computer or the cell phone. ... In the quest for artificial intelligence, the United States is perhaps just as advanced as Japan. But analysts stress that the focus in the United States has been largely on military applications. By contrast, the Japanese government, academic institutions and major corporations are investing billions of dollars on consumer robots aimed at altering everyday life, leading to an earlier dawn of what many here call the 'age of the robot.' But the robotic rush in Japan is also being driven by unique societal needs. ... It is perhaps no surprise that robots would find their first major foothold in Japan. ... 'In Western countries, humanoid robots are still not very accepted, but they are in Japan,' said Norihiro Hagita, director of the ATR Intelligent Robotics and Communication Laboratories in Keihanna Science City near Kyoto. 'One reason is religion. In Japanese [Shinto] religion, we believe that all things have gods within them. But in Western countries, most people believe in only one God. For us, however, a robot can have an energy all its own.'"
>>> Robots, Customer Service, Manufacturing, Assistive Technologies, Speech, Natural Language Processing, Science Fiction, Ethical & Social Implications, Applications

March 9, 2005: Improving the efficiency of electronic patient records. IST Results. "Transcribing dictated notes from clinicians is a hugely expensive and time-consuming process. One possible cure could be a new solution under validation that offers speech recognition and secure wireless communication for electronic-patient-record systems. ... The IST project DICTATe (finishing June 2005) aims to both increase the efficacy of such patient records, and at the same time reduce the costs incurred in preparing them. ... The project partners believe that DICTATe will pave the way for much wider deployment of speech-processing technologies into electronic-patient-record systems. Speech recognition is now poised to overcome the previous obstacles of natural language and categorisation, and become a cost-effective means of clinical reporting and integration into the medical record."
>>> Speech, Natural Language Processing, Medicine, Knowledge Management, Applications

March 9, 2005: Machines Not Lost in Translation. By Ann Harrison. Wired News. "Faced with daunting translation problems in war and disaster zones around the world, the U.S. military is refining a handheld voice-translation device that will soon be used by police and emergency-room doctors back home. The palm sized PDA-like Phraselator lets users speak or select from a screen of English phrases and matches them to equivalent pre-recorded phrases in other languages. The device then broadcasts the foreign-language MP3 file and records reply dialog for later translation. ... According to Phraselator software developer Jack Buchanan, the accuracy of translating voice into text is above 70 percent. But the middle step of translating that text into a foreign language text before outputting the data again as voice is technically difficult. 'Taking into account cultural differences and context issues is an extremely hard problem,' says Buchanan, who believes that developing something close to Star Trek's 'universal translator' will be harder than building the Enterprise. 'When you are coming in and giving food to a village, how you would say 'hello' is totally different than if you are a military person at a checkpoint holding a gun pointed in their direction.' ... In 2003, DARPA estimated that open-domain, multi-task and unconstrained dialog translation was still five to 10 years away. But the research group developing IBM's MASTOR, or multilingual automatic speech-to-speech translator system, says its DARPA-funded bidirectional voice translator is a year or two from deployment."
>>> Machine Translation, Natural Language Processing, Speech, Military, Medicine, Law Enforcement, Applications, The Future

February 23, 2005: Termites feed through good vibrations. Report from Commonwealth Scientific and Industrial Research Organisation (CSIRO). Available from the innovations-report. "Discovery that termites use vibrations to choose the wood they eat may provide opportunities to new methods of reducing infestations in homes and also may provide insights into the 'cocktail party effect' of signal processing - how to ignore most noise but have some signals that trigger attention - that may prove useful in artificial intelligence."
>>> Speech

February 17, 2005: A Virtual Course in Iraqi Arabic. By Ina Jaffe. Radio broadcast of NPR's All Things Considered. "About 100 U.S. soldiers and Marines serving in Iraq will soon have a new tool intended to help keep them safe, and perhaps make their jobs easier -- a computer game designed to teach them how to speak Iraqi-style Arabic. NPR's Ina Jaffe reports on the game's appeal to a new generation of troops already familiar with interactive video games. Dr. Lewis Johnson, director of the Center for Advanced Research in Technology for Education (CARTE) at the USC Information Sciences Institute, created the game Tactical Iraqi using the base gaming 'engine' used by the popular point-and-shoot game Unreal Tournament, plus voice recognition and artificial intelligence software."
>>> Military, Video Games, Education, Speech, Applications

January 26, 2005: Opera, the Forgotten Browser. By Michelle Delio. Wired News. "Voice Interaction, Opera's splashiest new feature, allows users to control the program by talking to their computers. Websites, e-mail and documents can also be read aloud by the browser. 'Voice Interaction is a progressive part of our vision that web browsing will soon move more into mobile phones and other small devices and browsing will need to be a hands-free experience,' said Michelle Valdivia, Opera marketing communications manager. 'This is very early, premature technology, but Opera wants to get ahead, get it out there and into practice to be future-ready.'"
>>> Speech, Information Retrieval, Telecommunications, Applications, Systems

January 26, 2005: Cars that Think. PBS television broadcast of Scientific American Frontiers show. "The fully automatic car may be down the road a ways, but cars that do your thinking for you are just around the corner -- they watch out for hazards, they listen to you, they read your lips, they even know when you're distracted."
>>> Transportation, Speech, Interfaces, Vision, Applications

January 18, 2005: For Surgery, an Automated Helping Hand. By Marc Santora. The New York Times (registration req'd.). "'Meet Penelope,' Dr. [Michael R.] Treat said, motioning toward a robotic arm poised over a set of surgical tools. ... She is meant to replace the scrub nurse, the person in the operating room who hands the surgeon the tools of surgery. Responding to the ever-widening shortage of nurses in the country, and looking to deal with a problem that frustrated him as a working surgeon, Dr. Treat and his team of tech whizzes are working feverishly to get Penelope ready for her public debut. New York-Presbyterian Hospital has agreed to test Penelope in March in the operating room on a simple removal of a benign cyst. ... Some of Penelope's technology is off the shelf, like the voice recognition software. Dr. Treat said that this way, as others develop better software, they can update Penelope with relative ease. The major innovation is in Penelope's visual recognition, the ability to distinguish between surgical tools. Currently, Penelope can recognize 12 tools and will soon be able to recognize twice that many. That is harder than it might sound, because the tools often look very much alike."
>>> Vision, Speech, Natural Language Processing, Robots, Medicine, Interfaces, Applications

January 17, 2005: Car, play me Eminem's latest hit. By John Borland. CNET News.com. "The company says it's developing voice-recognition software that will help drivers maneuver through hard drive-based car music systems that hold thousands or even tens of thousands of songs. ... 'Pushing buttons can be challenging when you're driving down the road at 80 miles an hour,' said Ross Blanchard, Gracenote's vice president of business development. 'The reason we thought we could do this now is that they've worked out the problems with voice recognition in the navigation and telematics market.'"
>>> Speech, Natural Language Processing, Interfaces, Transportation, Applications

January 12/19, 2005: Conversations control computers. By Eric Smalley. Technology Research News. "Because information from spoken conversations is fleeting, people tend to record schedules and assignments as they discuss them. Entering notes into a computer, however, can be tedious -- especially when the act interrupts a conversation. Researchers from the Georgia Institute of Technology are aiming to decrease day-to-day data entry and to augment users' memories with a method that allows handheld computers to harvest keywords from conversations and make use of relevant information without interrupting the personal interactions. ... The researchers' system protects privacy by only using speech from the user's side of the conversation, said [Kent] Lyons."
>>> Interfaces, Speech, Agents, Natural Language Processing, Applications

January 8, 2005: Voicemail software recognises callers' emotions. By Celeste Biever. New Scientist Magazine. "A voicemail system that labels messages according to the caller's tone of voice could soon be helping people identify which messages are the most urgent. The software, called Emotive Alert, is designed by Zeynep Inanoglu and Ron Caneel of the Media Lab at the Massachusetts Institute of Technology. ... Another British company, Edinburgh-based Affective Media, will soon be selling software for cars that detects drowsiness and frustration in a driver's voice as he or she asks the in-car navigation system for directions, and will attempt to wake the driver up or calm them down, as appropriate. It could also be used in computer games to detect boredom levels and spice up the action accordingly."
>>> Speech, Machine Learning, Emotion, Interfaces, Natural Language Processing, Telecommunications, Transportation, Video Games, Cognitive Science, Applications

January 1, 2005: Ernestine, Meet Julie - Natural language speech recognition is markedly improving voice-activated self-service. By Karen Bannan. CFO Magazine. "If only Amtrak's Web designers were as attentive as the makers of the railroad's telephone self-service system. That system, which features the digitized voice of an operator named Julie, is a primer on good customer service. Rather than requiring Amtrak's 20 million or so yearly callers to punch in numbers, the system allows them to voice responses to questions like 'What city are you departing from?' And unlike many Web-based self-service setups, Amtrak's voice-activated operator does most of the legwork for the customer. Expect to bump into more Julies out there. A new technology, called natural language speech recognition, is markedly improving voice-activated self-service. Powered by artificial intelligence, these speech-recognition systems are altering consumer perceptions about phone self-service, as calls for help no longer elicit calls for help. That, in turn, is spurring renewed corporate interest in the concept of phone self-service. In 2004, sales of voice self-service systems topped $1.2 billion. 'We've seen voice systems move from emerging technology to applied technology over the last few years,' says Steve Cramoysan, principal analyst at Stamford, Connecticut-based research firm Gartner. 'It's still fairly immature. But it's proven and moving toward the mainstream.'"
>>> Natural Language Processing, Speech, Customer Service, Industry Statistics, Applications

 
