Machine Translation (a subtopic of Natural Language)
"What is Machine Translation? Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains." A definition from the European Association for Machine Translation (EAMT), "an organization that serves the growing community of people interested in MT and translation tools, including users, developers, and researchers of this increasingly viable technology." Me Translate Pretty One Day - Spanish to English? French to Russian? Computers haven't been up to the task. But a New York firm with an ingenious algorithm and a really big dictionary is finally cracking the code. By Evan Ratliff. Wired (December 2006; Issue 14.12). "Jaime Carbonell, chief science officer of Meaningful Machines, hunches over his laptop in the company's midtown Manhattan offices, waiting for it to decode a message from the perpetrators of a grisly terrorist attack. Running software that took four years and millions of dollars to develop, Carbonell's machine -- or rather, the server farm it's connected to a few miles away -- is attempting a task that has bedeviled computer scientists for half a century. The message isn't encrypted or scrambled or hidden among thousands of documents. It's simply written in Spanish:.... Language translation is a tricky problem, not only for a piece of software but also for the human mind."
Mark my words. The Economist (February 16, 2007). "For those who put their faith in technology, therefore, it was encouraging to hear Shinzo Abe, Japan’s prime minister, demonstrate his linguistic skills a few weeks ago with a palm-sized gizmo that provided instantaneous translations of spoken Japanese into near-flawless English and Chinese. ... [T]he fact that a pocket-sized device could interpret tourist-type phrases accurately and on the fly, from one language to several others, says much about the improvements that have been made lately in machine translation. This device, developed by the Advanced Telecommunications Research Institute International near Kyoto.... Machine translation has been an elusive goal since the earliest days of computer science. ... The main drivers for this more pragmatic approach to machine translation have been the enlargement of the European Union and the spread of the internet. Both have generated a pressing need for cheap and cheerful translations between numerous languages. In turn, this has spawned a wealth of new translation approaches." Translation Tools - New Approaches to an Old Discipline. Automated translation tools have been around for a long time, and new techniques are boosting their performance. But use them with caution. By Gary Anthes. Computerworld (August 13, 2007). "Language translation software isn’t likely to allow you to lay off your bilingual staffers -- at least not right away. But applied with discrimination and lots of preparation, translation tools can be fantastic productivity aids. And researchers say new approaches to this old discipline are greatly improving the performance of the tools. Ford Motor Co. began using 'machine translation' software in 1998 and has so far translated 5 million automobile assembly instructions into Spanish, German, Portuguese and Mexican Spanish. 
Assembly manuals are updated in English every day, and their translations -- some 5,000 pages a day -- are beamed overnight to plants around the world. 'It wouldn’t be feasible to do this all manually,' says Nestor Rychtyckyj, a technical specialist in artificial intelligence (AI) at Ford. ... Systran’s tool uses a tried-and-true translation technique called rules-based translation. ... Statistical machine translation is a newer technique that’s not yet in widespread use. It uses collections of documents and their translations to 'train' software. Over time, these data-driven systems 'learn' what makes a good translation and what doesn’t and then use probability and statistics to decide which of several possible translations of a given word or phrase is most likely correct based on context. ... 'The new direction in the research community is to see how you can combine these purely statistical techniques with some linguistic knowledge,' says Steve Richardson, a senior researcher at Microsoft. 'It’s modeling the rules with the statistical methods.' ... Automated translation in the corporate world succeeds to the extent that users are willing to carefully customize systems to their unique needs and vocabularies, he says. And the technology is most appropriate when translations don’t have to be perfect. 'We have serviced thousands and thousands of customers with articles we have machine-translated,' Richardson says. 'It’s not perfect, but it’s good enough. They get an answer without calling in. What’s that worth to the company' ... [H]ybrid systems, which combine translation memories and machine translation based on rules or statistics or both, are the wave of the future, researchers say, and they are becoming more sophisticated and complex. ... In essence, SRI’s approach is to do machine translations with the best available rules-based and statistical-based systems, and then have another system that 'adjudicates' among them in real time to find the best translation."
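The statistical idea described above, choosing among several possible translations of a word according to context, can be illustrated with a toy sketch. It is not the method of any system named in the article: the training triples, the function names `train` and `translate_word`, and the simple count-based backoff are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy "training" data: (source word, context word, chosen translation).
# A real system would derive such evidence from large collections of translated documents.
TRAINING = [
    ("bank", "river", "orilla"),
    ("bank", "river", "orilla"),
    ("bank", "money", "banco"),
    ("bank", "money", "banco"),
    ("bank", "money", "banco"),
    ("bank", "loan", "banco"),
]


def train(examples):
    """Count how often each translation was chosen, with and without context."""
    counts = defaultdict(Counter)
    for source, context, target in examples:
        counts[(source, context)][target] += 1
        counts[(source, None)][target] += 1  # context-free counts used as a backoff
    return counts


def translate_word(counts, source, context):
    """Return (most likely translation, relative frequency), or None if the word is unseen."""
    table = counts.get((source, context)) or counts.get((source, None))
    if not table:
        return None
    total = sum(table.values())
    target, n = table.most_common(1)[0]
    return target, n / total


counts = train(TRAINING)
print(translate_word(counts, "bank", "river"))
print(translate_word(counts, "bank", "holiday"))  # unseen context, falls back to overall counts
```

The sketch picks the translation seen most often in the given context and falls back to context-free counts when the context was never observed. Production systems use far richer models, but the decision rule has the same shape: pick the candidate with the highest estimated probability.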
Watch Ron Brachman demonstrate the Phraselator® on The Charlie Rose Show episode: A panel discussion about Artificial Intelligence (December 21, 2004), with Rodney Brooks (Director, MIT Artificial Intelligence Laboratory & Fujitsu Professor of Computer Science & Engineering, MIT), Eric Horvitz (Senior Researcher and Group Manager, Adaptive Systems & Interaction Group, Microsoft Research), and Ron Brachman (Director, Information Processing Technology Office, Defense Advanced Research Project Agency, and President, American Association for Artificial Intelligence). The segment begins at 20:34. Top minds taxed by translation challenge - Creating a real-time translating machine is harder than it seems. By Brian Bergstein. The Associated Press /available from MSNBC.com (November 5, 2006). "The past few years have shown that U.S. government intelligence goes only so far. One of the biggest challenges is recognizing vital information in foreign languages -- and acting quickly on it. That's why the military would love software that can listen to TV broadcasts or phone conversations and read Web sites in Arabic and Chinese, translate them into English and summarize the key elements for humans. ... Last year DARPA launched a project that aims to create that real-time translation software. It’s called GALE, for Global Autonomous Language Exploitation." Tech Solutions to Iraqi-U.S. Language Barrier. Xeni Jardin's Xeni Tech report for NPR's Day to Day (November 13, 2006: audio available). "Part of the daily struggle for soldiers and Marines in Iraq is communicating with civilians and suspected insurgents. Few military personnel have enough fluency with Iraqi Arabic to be easily understood, and field translators are in short supply. But technology may help close that communications gap. 
A hand-held voice translator device developed by Integrated Wave Technologies, already in use in other parts of the world, converts simple English commands into Iraqi Arabic or 15 other languages." Military getting high-tech help from SRI lab - New system can recognize words, understand simple foreign phrases. By Tom Abate. San Francisco Chronicle & SFGate.com (May 29, 2006). "During a recent product demonstration at SRI headquarters in Menlo Park, computer scientist Harry Bratt spoke into the microphone of his lab's new translation computer: 'Did you hear the explosion this morning?' Several seconds later, software written by SRI International scientists piped the question through the computer's speaker -- this time in the Iraqi dialect of Arabic. Saad Alabbodi, an Iraqi immigrant posing as a civilian being questioned by a U.S. soldier, answered in his native tongue. There was another pause as the computer translated Alabbodi's reply into English in a mock interrogation that provided another example of how technology is slowly mimicking complex human capabilities such as speech. [Go to the related podcast to hear the actual conversation.] ... 'One of the crying needs in Iraq is overcoming the language barrier,' said Kristin Precoda, director the SRI lab that developed the two-way translation system called IraqComm."
Google's Peter Norvig on managing the data deluge. Video of talk delivered on September 25, 2006 at UC Berkeley as part of the CITRIS Distinguished Speaker Series. "Researchers in computational linguistics and information retrieval now have a million times more data than was available 30 years ago. In this talk, Peter Norvig explores what this data can do for problems in language understanding, [statistical machine] translation, information extraction, and inference, and extrapolates to what more data may bring in the future." Machine Translation - Inching toward Human Quality. "In the News" article by Jan Krikke. IEEE Intelligent Systems (March/April 2006; 21(2): 4-6). "After 50 years of research and tinkering, machine translation might be ready to compete with human translators. Several companies have announced breakthroughs or substantial progress in MT research in recent months. ... MT requires complex cognitive operations to perform a seemingly mundane task: decoding a source text and recoding into the target language. The three common methods are rule-based MT (RBMT), statistical MT (SMT), and example-based MT (EBMT)."
An overview of machine translation, by John Hutchins (University of East Anglia, United Kingdom: updated January 2005), is available from the British Computer Society's Natural Language Translation Specialist Group.
Machine Translation: An Introductory Guide. By Doug Arnold, Lorna Balkan, Siety Meijer, R.Lee Humphreys and Louisa Sadler (1994). "The topic of the book is the art or science of Automatic Translation, or Machine Translation (MT) as it is generally known --- the attempt to automate all, or part of the process of translating from one human language to another. The aim of the book is to introduce this topic to the general reader --- anyone interested in human language, translation, or computers." "The international center for Advanced Communication Technologies, interACT, is a joint center between the Universität (TH), Karlsruhe, Germany and Carnegie Mellon University, Pittsburgh, USA."
Scaling the Language Barrier. By Sebastian Rupley. PC Magazine (July 13, 2004). "In the annals of computer comedy, one of the most famous anecdotes is about asking a speech recognition engine, 'Recognize speech?' The translation comes back: 'Wreck a nice beach.' Getting machines to understand both spoken and written language has been an elusive goal for the tech industry for many years. Now, thanks to a wave of government funding and technical breakthroughs, machine translation (and understanding) of written language is getting unfunnier by the minute. ... The one clue Meaningful Machines has given about its software is that it will use new methods of statistically ranking the likelihood of what entire phrases mean, rather than just translating one word at a time. That allows it to discern whether the word baseball in a given phrase refers to a ball or a game. ... Carnegie Mellon University, the University of Southern California, and Microsoft Research operate some of the largest programs for developing machine translation software. Microsoft is primarily focused on extracting meaning from documents in English." E-translators - the more you say, the better. By Gregory M. Lamb. The Christian Science Monitor (April 22, 2004). "Universal translation is one of 10 emerging technologies that will affect our lives and work 'in revolutionary ways' within a decade, Technology Review says." Speech-to-Speech Translation. IBM Research. "The goal of the Speech-to-Speech Translation (S2S) research is to enable real-time, interpersonal communication via natural spoken language for people who do not share a common language. 
The Multilingual Automatic Speech-to-Speech Translator (MASTOR) system is the first S2S system that allows for bidirectional (English-Mandarin) free-form speech input and output. The research leading to MASTOR was initiated in 2001 as an IBM adventurous research project and was also selected to be funded by the Defense Advanced Research Projects Agency (DARPA) CAST program (formerly called 'Babylon' program). ... Construction of robust systems for speech-to-speech translation to facilitate cross-lingual oral communication has been the dream of speech and natural language researchers for decades. It is technically extremely difficult because of the need to integrate a set of complex technologies – Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), Machine Translation (MT), Natural Language Generation (NLG), and Text-to-Speech Synthesis (TTS)...." Links to publications and additional information appear at the bottom of their page. Robo-talk helps pocket translator. By Jo Twist. BBC News (March 4, 2004). "Visitors landing at Tokyo's Narita Airport will be able to hire a device which can translate the local lingo. The speech-to-speech technology was developed by NEC, tested in Papero robots and then put in PDAs. ... As well as being able to understand and imitate human behaviour, Papero (Partner-Type Personal Robot), is the first robot to translate verbally between two languages in colloquial tongue. It can cope, in other words, with slang and local chatter, and has a vocabulary of 50,000 Japanese and 25,000 English travel and tourism related words." Computer aid ensures speedy, high-quality translations. IST Results (January 12, 2005). "Increasing translators' productivity is the goal of TransType2, an innovative computer-aided system that allows rapid and efficient high quality translations. 
Due to end in February, the 36-month IST programme project has drawn on two of the most commonly used translation technologies developed to date: Computer-Assisted Translation (CAT), in which human translators work in unison with a computer; and Machine Translation (MT), in which the computer handles the entire process. While both techniques have advantages and drawbacks, TransType2 has 'used the best of both worlds' says project manager José Esteban at Atos Origin in Spain." Software learns to translate by reading up. By Will Knight. NewScientist.com news service (February 22, 2005). "Translation software that develops an understanding of languages by scanning through thousands of previously translated documents has been released by US researchers. Most existing translation software uses hand-coded rules for transposing words and phrases. But the new software, developed by Kevin Knight and Daniel Marcu at the Information Sciences Institute, part of the University of Southern California, US, takes a statistical approach, building probabilistic rules about words, phrases and syntactic structures. The pair founded a company called Language Weaver in Los Angeles, US, to sell the software as an automated translation tool."
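As a rough illustration of how software can build probabilistic word correspondences from previously translated text, the sketch below runs the expectation-maximization training of IBM Model 1, a classic textbook word-translation model. This is a swapped-in technique chosen for brevity, not a description of the Language Weaver software; the three-sentence German-English corpus and the function name `ibm_model1` are assumptions, and the NULL source word used in full formulations is omitted.

```python
from collections import defaultdict


def ibm_model1(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] = P(target_word | source_word) with EM."""
    target_vocab = {w for _, tgt in pairs for w in tgt}
    # Start from a uniform distribution over target words.
    t = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per source word
        for src, tgt in pairs:
            for w in tgt:
                # E-step: share each target word among the source words of its sentence.
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac
        # M-step: renormalize expected counts into probabilities.
        t = defaultdict(float, {(w, s): c / total[s] for (w, s), c in count.items()})
    return t


# Hypothetical miniature parallel corpus (source German, target English).
corpus = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]

t = ibm_model1(corpus)
for source in ["das", "Haus", "Buch", "ein"]:
    best = max(["the", "house", "book", "a"], key=lambda w: t[(w, source)])
    print(source, "->", best)
```

No dictionary is supplied: the model only sees which sentences are translations of each other, and the repeated re-estimation gradually concentrates probability on consistent pairings such as "Buch" and "book".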
The Translation Challenge. By Chip Walter. Technology Review (June 2003). "Researchers are making progress today using three basic approaches drawn from natural-language processing. Knowledge-based machine translation, for example, relies on human programmers to write lists of rules that describe all possible relationships between verbs, nouns, prepositions, and so on for each language. ... A second approach, example-based systems, relies chiefly on raw computing power. ... Statistical techniques also depend on computing power to compare reams of previously translated text. However, this strategy selects the most likely translation using sophisticated mathematical models that the software continually upgrades based on how often its interpretations prove accurate." Another Step Closer to Artificial Intelligence. DW-WORLD.DE. (December 1, 2001) "This year's prestigious German Future Prize has been awarded to the inventor of an electronic translating device which brings humanity one step closer to the concept of Artificial Intelligence. ... [Professor Wolfgang] developed the 'Verbmobile'. This is essentially a computer that translates between German, English and Japanese."
The World Wide Translator. Will Web-wide "translation memory" finally make machine translation pay off? "Hour is the moment for all the good men to come to the subsidy of them country." By Alan Leo. MIT Technology Review (September 21, 2001). "'This whole area of language is extremely complex,' says IDC analyst Steve McClure. 'It's probably the most complicated problem in computer science that I'm aware of.' Computer-assisted translation typically involves two steps. First, a rules engine parses the original sentence, attempting to identify the relationships between the words. The engine then translates each word within the context that it believes to be correct-- often with mixed results."
You can translate text of your choice by using free translators such as these from: AltaVista
IBM Research Demonstrates Innovative 'Speech to Sign Language' Translation System. IBM press release via Market Wire. (September 13, 2007). "IBM (NYSE: IBM) has developed an ingenious system called SiSi (Say It Sign It) that automatically converts the spoken word into British Sign Language (BSL) which is then signed by an animated digital character or avatar. SiSi brings together a number of computer technologies. A speech recognition module converts the spoken word into text, which SiSi then interprets into gestures, that are used to animate an avatar which signs in BSL. ... This project is an example of IBM's collaboration with non-commercial organisations on worthy social and business projects. The signing avatars and the award-winning technology for animating sign language from a special gesture notation were developed by the University of East Anglia and the database of signs was developed by RNID (Royal National Institute for Deaf People). ... SiSi has been developed in the UK by a research team at IBM Hursley, as part of IBM's premier global student intern programme, Extreme Blue. 
In the European part of the programme, 80 of the most talented students from across Europe were selected to work on 20 projects and given whatever equipment, support and assistance they required. Working for an intense 12 week period alongside IBM technical and industry leaders, they focused on innovative technology projects, such as SiSi, all of which had real business value. ... For a video demonstration of the SiSi technology, visit the following url: http://youtube.com/watch?v=RarMKnjqzZU" The DePaul University American Sign Language (ASL) Synthesizer. "Combining computer technology and linguistics research to bridge the communication gap between the deaf and hearing worlds, our team of deaf and hearing researchers is working towards the realization of a digital English-to-ASL translator." Visit their site and meet "Paula," a virtual interpreter.
Speech, Language & Virtual Human Research at the School of Computing Sciences, University of East Anglia. Here's where you'll find Guido, a "virtual signer" (see this May 5, 2005 press release) and TESSA & VANESSA. GRASP - Recognising Auslan signs using Instrumented Gloves. Waleed Kadous' Honour Thesis (1995), School of Computer Science and Engineering at the University of New South Wales. "You may also be interested in my Machine Gesture and Sign Language Recognition page, but I have to warn you that it's a little out of date." Digital characters 'talk' to the deaf. By Jon Wurtzel. BBC (March 2, 2002). "Using digital avatars as signing translators could significantly expand the ways deaf and hard of hearing people communicate with the hearing world. The avatars are computer animations designed to look and move like real people. A computer program takes spoken English and converts it in real-time to text. The digital avatars then take this English text and sign its meaning on a display screen, in effect becoming a translator between spoken English and British sign language. ... Businesses should pursue this technology, and not just because it is the right thing to do. The deaf and hard of hearing account for 8.6 million of the 59 million people in the UK. Combine that with the millions throughout the world who would also benefit, and a huge market opportunity emerges for the right products."
Talking to Strangers. By Steve Silberman. Wired (May 2000; 8.05). "A renewed international effort is gearing up to design computers and software that smash language barriers and create a borderless global marketplace."
Lost in Translation. By Stephen Budiansky. The Atlantic Monthly (December 1998 / Volume 282, No. 6; pages 80 - 84). "In one famous episode in the British comedy series Monty Python a foreign-looking tourist clad in an outmoded leather trenchcoat appears at the entrance to a London shop. He marches up to the man behind the counter, solemnly consults a phrase book, and in a thick Middle European accent declares, 'My hovercraft ... is full of eels!' ... This episode is brought to mind by some recently available computer programs that claim to provide automatic translation between English and a number of other languages. Translation software that runs on mainframe computers has been used by government agencies for several decades, but with the advent of the Pentium chip, which packs the power of a mainframe into a desktop, such software can now easily be run on a personal computer."
"The Center for Machine Translation (CMT) is a research branch of the School of Computer Science [at Carnegie Mellon University] devoted to basic and applied research in all aspects of natural language processing, with a primary focus on machine translation, speech processing, and information retrieval. Containing a unique mix of academic and industrial researchers specializing in various aspects of computer science, artificial intelligence, computational linguistics and theoretical linguistics...."
Association for Machine Translation in the Americas. "AMTA is an association dedicated to anyone interested in the translation of languages using computers in some way. This includes people with translation needs, commercial system developers, researchers, sponsors, and people studying, evaluating, and understanding the science of machine translation (MT) and educating the public on important scientific techniques and principles involved. ... AMTA has members in Canada, Latin America, and the United States. It is the regional component of a worldwide network headed by the International Association for Machine Translation (IAMT)." Automating Knowledge Acquisition for Machine Translation. By Kevin Knight. AI Magazine, 18(4): Winter 1997, 81-96. "Machine translation of human languages (for example, Japanese, English, Spanish) was one of the earliest goals of computer science research, and it remains an elusive one. Like many AI tasks, translation requires an immense amount of knowledge about language and the world. Recent approaches to machine translation frequently make use of text-based learning algorithms to fully or partially automate the acquisition of knowledge. This article illustrates these approaches." Some DARPA projects:
Semantic Networks. By John Sowa. "A semantic network or net is a graphic notation for representing knowledge in patterns of interconnected nodes and arcs. Computer implementations of semantic networks were first developed for artificial intelligence and machine translation, but earlier versions have long been used in philosophy, psychology, and linguistics." Language Translation (TRL) at IBM. "This project deals with natural language analysis and translation by computer. Technologies used for machine translation, such as syntactic parsing and word sense disambiguation, are commonly used for other applications of natural language processing."

