Natural Language
The value to our society of being able to communicate with computers in everyday "natural" language cannot be overstated. Imagine asking your computer "Does this candidate have a good record on the environment?" or "When is the next televised National League baseball game?" Or being able to tell your PC "Please format my homework the way my English professor likes it." Commercial products can already do some of these things, and AI scientists expect many more in the next decade. One goal of AI work in natural language is to enable communication between people and computers without resorting to memorization of complex commands and procedures. Automatic translation---enabling scientists, business people and just plain folks to interact easily with people around the world---is another goal. Both are just part of the broad field of AI and natural language, along with the cognitive science aspect of using computers to study how humans understand language.
Good Places to Start
What is NLP. From the Natural Language Processing Research Group at the University of Sheffield Department of Computer Science. "Natural Language Processing (NLP) is both a modern computational technology and a method of investigating and evaluating claims about human language itself. Some prefer the term Computational Linguistics in order to capture this latter function, but NLP is a term that links back into the history of Artificial Intelligence (AI), the general study of cognitive function by computational processes, normally with an emphasis on the role of knowledge representations, that is to say the need for representations of our knowledge of the world in order to understand human language with computers. 
Natural Language Processing (NLP) is the use of computers to process written and spoken language for some practical, useful, purpose: to translate languages, to get information from the web on text data banks so as to answer questions, to carry on conversations with machines, so as to get advice about, say, pensions and so on. These are only examples of major types of NLP, and there is also a huge range of lesser but interesting applications, e.g. getting a computer to decide if one newspaper story has been rewritten from another or not. NLP is not simply applications but the core technical methods and theories that the major tasks above divide up into, such as Machine Learning techniques, which is automating the construction and adaptation of machine dictionaries, modeling human agents' beliefs and desires etc. This last is closer to Artificial Intelligence, and is an essential component of NLP if computers are to engage in realistic conversations: they must, like us, have an internal model of the humans they converse with." What is Computational Linguistics? Hans Uszkoreit, CL Department, University of the Saarland, Germany. 2000. A short, non-technical overview of this exciting field. Natural Language Learning at UT Austin. "Natural language processing systems are difficult to build, and machine learning methods can help automate their construction significantly. Our research in learning for natural language mainly involves applying inductive logic programming and other relational learning techniques to constructing database interfaces and information extraction systems from supervised examples. However, we have also conducted research in learning for syntactic parsing, machine translation, word-sense disambiguation, and morphology (past tense generation)." Links to many relevant articles. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. By Daniel Jurafsky and James H. Martin. 
Prentice-Hall, 2000. Both the Preface and Chapter 1 are available online as are the resources for all of the chapters. Natural Language. A summary by Patrick Doyle. Very informative, though there are some spots that are quite technical. Glossary of Linguistic Terms. Compiled by Dr. Peter Coxhead of The University of Birmingham School of Computer Science for his students. The Futurist - The Intelligent Internet. The Promise of Smart Computers and E-Commerce. By William E. Halal. Government Computer News Daily News (June 23, 2004). "Scientific advances are making it possible for people to talk to smart computers, while more enterprises are exploiting the commercial potential of the Internet. ... [F]orecasts conducted under the TechCast Project at George Washington University indicate that 20 commercial aspects of Internet use should reach 30% 'take-off' adoption levels during the second half of this decade to rejuvenate the economy. Meanwhile, the project's technology scanning finds that advances in speech recognition, artificial intelligence, powerful computers, virtual environments, and flat wall monitors are producing a 'conversational' human-machine interface. These powerful trends will drive the next generation of information technology into the mainstream by about 2010. ... The following are a few of the advances in speech recognition, artificial intelligence, powerful chips, virtual environments, and flat-screen wall monitors that are likely to produce this intelligent interface. ... IBM has a Super Human Speech Recognition Program to greatly improve accuracy, and in the next decade Microsoft's program is expected to reduce the error rate of speech recognition, matching human capabilities. ... MIT is planning to demonstrate their Project Oxygen, which features a voice-machine interface. ... 
Amtrak, Wells Fargo, Land's End, and many other organizations are replacing keypad-menu call centers with speech-recognition systems because they improve customer service and recover investment in a year or two. ... General Motors OnStar driver assistance system relies primarily on voice commands, with live staff for backup; the number of subscribers has grown from 200,000 to 2 million and is expected to increase by 1 million per year. The Lexus DVD Navigation System responds to over 100 commands and guides the driver with voice and visual directions." Artificial Intelligence. [Radio broadcast; audio available.] Reported by Shay Zeller for The Front Porch. New Hampshire Public Radio (July 12, 2006). "Dartmouth College is celebrating 50 years of Artificial Intelligence this week with a special conference that takes a look forward and a look back at the field. We'll find out how AI has evolved since its inception and how far scientists have come to creating the technological brain that's been depicted in science fiction for decades. We'll also look at the philosophical and ethical questions that go along with creating machines that emulate the human mind. Our guest are: Eugene Charniak, professor of Computer Science at Brown University. Charniak's expertise is in language development, and he's presenting a speech at the conference entitled 'Why Natural Language Processing is Now Statistical Natural Language Processing.' James H. Moor, professor of Philosophy at Dartmouth. He's the conference's main organizer." Experts Use AI to Help GIs Learn Arabic. By Eric Mankin. USC News (June 21, 2004). " To teach soldiers basic Arabic quickly, USC computer scientists are developing a system that merges artificial intelligence with computer game techniques. 
The Rapid Tactical Language Training System, created by the USC Viterbi School of Engineering's Center for Research in Technology for Education (CARTE) and partners, tests soldier students with videogame missions in animated virtual environments where, to pass, the students must successfully phrase questions and understand answers in Arabic." Read the story and then watch the video! Natural Language Processing FAQ. Maintained by Dragomir R. Radev. Dept. of Computer Science, Columbia University. An Overview of Empirical Natural Language Processing. By Eric Brill and Raymond J. Mooney (1997). AI Magazine 18(4): Winter 1997, 13-24. "In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than require the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech recognition, syntactic parsing, semantic processing, information extraction, and machine translation. This article presents an introduction to the series of specialized articles on these topics and attempts to describe and explain the growing interest in using learning methods to aid the development of natural language processing systems." Visit the homepage of Daniel Klein, Assistant Professor, Computer Science Division, University of California at Berkeley, and recipient of "the [2006] Grace Murray Hopper Award for the design of the first machine learning system capable of inferring a high-quality grammar for English and other languages directly from text without human annotations or supervision." (See the March 2007 ACM press release.)
Chatbots / Chatterbots
Whatever happened to machines that think? By Justin Mullins. New Scientist (April 23, 2005; Issue 2496: pages 32 - 37). "Clever computers are everywhere. From robotic lawnmowers to intelligent lighting, washing machines and even car engines that self-diagnose faults, there's a silicon brain in just about every modern device you can think of. But can you honestly call any machine intelligent in a meaningful sense of the word? One rainy afternoon last February I decided to find out. I switched on the computer in my study, and logged on to www.intellibuddy.com, home to one of the leading artificial intelligences on the planet, to see what the state-of-the-art has to offer. ..."
Readings Online
"Computational Linguistics is the only publication devoted exclusively to the design and analysis of natural language processing systems. From this unique quarterly, university and industry linguists, computational linguists, artificial intelligence (AI) investigators, cognitive scientists, speech specialists, and philosophers get information about computational aspects of research on language, linguistics, and the psychology of language processing and performance. Published by The MIT Press for: The Association for Computational Linguistics." Abstracts are available online. Natural Language Understanding. By Avron Barr (1980). AI Magazine 1(1): 5-10. "This is an excerpt from the Handbook of Artificial Intelligence, a compendium of hundreds of articles about AI ideas, techniques, and programs being prepared at Stanford University by AI researchers and students from across the country." Don't miss the fascinating section: Early History. Empirical Methods in Information Extraction. By Claire Cardie (1997). AI Magazine 18 (4): 65-79. "This article surveys the use of empirical, machine-learning methods for a particular natural language-understanding task - information extraction. 
The author presents a generic architecture for information-extraction systems and then surveys the learning algorithms that have been developed to address the problems of accuracy, portability, and knowledge acquisition for each component of the architecture." Duo-Mining - Combining Data and Text Mining. By Guy Creese. DMReview.com (September 16, 2004). "As standalone capabilities, the pattern-finding technologies of data mining and text mining have been around for years. However, it is only recently that enterprises have started to use the two in tandem - and have discovered that it is a combination that is worth more than the sum of its parts. First of all, what are data mining and text mining? They are similar in that they both 'mine' large amounts of data, looking for meaningful patterns. However, what they analyze is quite different. ... Collections and recovery departments in banks and credit card companies have used duo-mining to good effect. Using data mining to look at repayment trends, these enterprises have a good idea on who is going to default on a loan, for example. When logs from the collection agents are added to the mix, the understanding gets even better. For example, text mining can understand the difference in intent between, 'I will pay,' 'I won't pay,' 'I paid' and generate a propensity to pay score - which, in turn, can be data mined. To take another example, if a customer says, 'I can't pay because a tree fell on my house;' all of a sudden it is clear that it's not a 'bad' delinquency - but rather a sales opportunity for a home loan." Introduction to Natural Language Processing. This courseware from Professor Jason Eisner can be accessed via a link from its SIGCSE Education Resources page. At I.B.M., That Google Thing Is So Yesterday. By James Fallows. The New York Times (December 26, 2004; reg. req'd.). "Suddenly, the computer world is interesting again. ... 
The most attractive offerings are free, and they are concentrated in the newly sexy field of 'search.' ... [T]oday's subject is the virtually unpublicized search strategy of another industry heavyweight: I.B.M. ... I.B.M. says that its tools will make possible a further search approach, that of 'discovery systems' that will extract the underlying meaning from stored material no matter how it is structured (databases, e-mail files, audio recordings, pictures or video files) or even what language it is in. The specific means for doing so involve steps that will raise suspicions among many computer veterans. These include 'natural language processing,' computerized translation of foreign languages and other efforts that have broken the hearts of artificial-intelligence researchers through the years. But the combination of ever-faster computers and ever-evolving programming allowed the systems I saw to succeed at tasks that have beaten their predecessors. ... ... Jennifer Chu-Carroll of I.B.M. demonstrated a system called Piquant, which analyzed the semantic structure of a passage and therefore exposed 'knowledge' that wasn't explicitly there. After scanning a news article about Canadian politics, the system responded correctly to the question, 'Who is Canada's prime minister?' even though those exact words didn't appear in the article."
Dialogues with Colorful Personalities of Early AI. By Guven Guzeldere and Stefano Franchi (1995). From Constructions of the Mind: Artificial Intelligence and the Humanities, a special issue of the Stanford Humanities Review, Volume 4, Issue 2. "Of all the legacies of the era of the sixties, three colorful, not to say garrulous, 'personalities' that emerged from the early days of artificial intelligence research are worth mentioning: ELIZA, the Rogerian psychotherapist; PARRY, the paranoid; and (as part of a younger generation) RACTER, the 'artificially insane' raconteur. All three of these 'characters' are natural language processing systems that can 'converse' with human beings (or with one another) in English."
LifeCode: A Deployed Application for Automated Medical Coding. By Daniel T. Heinze, Mark Morsch, Ronald Sheffer, Michelle Jimmink, Mark Jennings, William Morris, and Amy Morsch. AI Magazine 22(2): 76-88 (Summer 2001). This paper is based on the authors' presentation at the Twelfth Innovative Applications of Artificial Intelligence Conference (IAAI-2000). "LifeCode is a natural language processing (NLP) and expert system that extracts demographic and clinical information from free-text clinical records." Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. By Daniel Jurafsky and James H. Martin. Prentice-Hall, 2000. The Preface and Chapter 1 are available online. "I'm sorry Dave, I'm afraid I can't do that": Linguistics, Statistics, and Natural Language Processing circa 2001. By Lillian Lee, Cornell Natural Language Processing Group. In Computer Science: Reflections on the Field, Reflections from the Field (Report of the National Academies' Study on the Fundamentals of Computer Science), pp. 111-118, 2004.
A Performance Evaluation of Text-Analysis Technologies. By Wendy Lehnert and Beth Sundheim (1991). AI Magazine 12 (3): 81-94. "A performance evaluation of 15 text-analysis systems conducted to assess the state of the art for detailed information extraction from unconstrained continuous text. ... Based on multiple strategies for computing each metric, the competing systems were evaluated for recall, precision, and overgeneration. The results support the claim that systems incorporating natural language-processing techniques are more effective than systems based on stochastic techniques alone." Artificial Intelligence. Available from MIT OpenCourseWare. Natural Language Understanding and Semantics. Section 1.2.4 of Chapter One (available online) of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005). "One of the long-standing goals of artificial intelligence is the creation of programs that are capable of understanding and generating human language. Not only does the ability to use and understand natural language seem to be a fundamental aspect of human intelligence, but also its successful automation would have an incredible impact on the usability and effectiveness of computers themselves. ... Understanding natural language involves much more than parsing sentences into their individual parts of speech and looking those words up in a dictionary. Real understanding depends on extensive background knowledge about the domain of discourse and the idioms used in that domain as well as an ability to apply general contextual knowledge to resolve the omissions and ambiguities that are a normal part of human speech." Getting Started on Natural Language Processing with Python. By Nitin Madnani. Crossroads, The ACM Student Magazine 13(4) Fall 2007. "The intent of this article is to introduce readers to the area of natural language processing, commonly referred to as NLP. 
However, rather than just describing the salient concepts of NLP, this article uses the Python programming language to illustrate them as well. For readers unfamiliar with Python, the article provides a number of references to learn how to program in Python." Computers That Speak Your Language - Voice recognition that finally holds up its end of a conversation is revolutionizing customer service. Now the goal is to make natural language the way to find any type of information, anywhere. By Wade Roush. Technology Review (June 2003). "Building a truly interactive customer service system like Nuance’s requires solutions to each of the major challenges in natural-language processing: accurately transforming human speech into machine-readable text; analyzing the text’s vocabulary and structure to extract meaning; generating a sensible response; and replying in a human-sounding voice." Computing's too important to be left to men. BCS managing editor Brian Runciman interviewed Karen Sparck-Jones, winner of the 2007 BCS Lovelace Medal. The British Computer Society (March 2007). "[Q] By way of introduction, can you tell us something about your work? [A] In some respects I'm not a central computing person, on the other hand the area I've worked in has become more central and important to computing. I've always worked in what I like to call natural language information processing. That is to say dealing with information in natural language and information that is conveyed by natural language, because that's what we use. ..." Chatbot bids to fool humans - A computer program designed to talk like a human is preparing for its biggest test in its bid to be truly "intelligent". By Jo Twist. BBC (September 22, 2003). "Jabberwacky lives on a computer hard drive, tells jokes, uses slang, sometimes swears and can be quite a confrontational conversationalist. What sets this chatty AI (artificial intelligence) chatbot apart from others is the more it natters, the more it learns. 
The bot is the only UK finalist in this year's Loebner Prize and is hoping to chat its way to a gold medal for its creator, Rollo Carpenter. The Loebner Prize is the annual competition to find the computer with the most convincing conversational skills and started in 1990. Jabberwacky will join eight other international finalists in October, when they pit their wits against flesh and blood judges to see if they can pass as one of them. It is the ultimate Turing Test, which was designed by mathematician Alan Turing to see whether computers 'think' and have 'intelligence'."
Related Web Sites
The Association for Computational Linguistics (ACL) is the "international scientific and professional society for people working on problems involving natural language and computation." ACL NLP/CL Universe. Choose "Browse" to see menus of what is offered (introductory materials, research groups, conferences, bibliographies, etc.) or choose "Search" for a keyword search engine. ["The NLP/CL Universe is a Web catalog/search engine that is devoted to Natural Language Processing and Computational Linguistics Web sites. It exists since March 18, 1995." Maintained by Dragomir R. Radev for ACL.] AI on the Web: Natural Language Processing. A resource companion to Stuart Russell and Peter Norvig's "Artificial Intelligence: A Modern Approach" with links to reference material, people, research groups, books, companies and much more. National Centre for Text Mining (NaCTeM): "We provide text mining services in response to the requirements of the UK academic community. Our initial focus is on applications in the biological and medical domains, where the major successes in the mining of scientific texts have so far occurred. We also make significant contributions to the text mining research community, both nationally and internationally."
Natural Language Group. Information Sciences Institute, University of Southern California. Natural Language Processing, one of BBN Technologies' "advanced technology solutions." The Natural Language Processing Dictionary (NLP Dictionary). Compiled by Bill Wilson, Associate Professor in the Artificial Intelligence Group, School of Computer Science and Engineering, University of NSW. "You should use The NLP Dictionary to clarify or revise concepts that you have already met. The NLP Dictionary is not a suitable way to begin to learn about NLP." Natural Language Processing Group, Cornell University. Natural Language Processing Group, Department of Artificial Intelligence, University of Edinburgh. "The goal of the [Microsoft] Natural Language Processing (NLP) group is to design and build a computer system that will analyze, understand, and generate languages that humans use naturally, so that eventually you can address your computer as though you were addressing another person. This goal is not easy to reach. ... The challenges we face stem from the highly ambiguous nature of natural language." Natural Language Processing Laboratory, University of Pittsburgh. "We are pursuing research in a wide range of natural language processing problems, including discourse and dialogue, spoken language processing, affective computing, natural language learning, statistical parsing, and machine translation." Be sure to check out their projects. Natural Language Processing Research Group at the University of Sheffield Department of Computer Science. Natural Language Program. Artificial Intelligence Center, SRI. "The SRI AI Center Natural Language Program does research on natural language processing theory and applications. The Program has three subgroups. Multimedia/Multimodal Interfaces ... Spoken Language Systems ... Written Language Systems." Be sure to follow their links to projects, applications, and more! 
"The Natural Language Software Registry (NLSR) [fourth edition] is a concise summary of the capabilities and sources of a large amount of natural language processing (NLP) software available to the NLP community. It comprises academic, commercial and proprietary software with specifications and terms on which it can be acquired clearly indicated." From the Language Technology Lab of the German Research Centre for Artificial Intelligence (DFKI GmbH). The North American Computational Linguistics Olympiad (NAMCLO): "Like former [Linguistics] Olympiads, NAMCLO is a Linguistics contest. It challenges you to demonstrate your ability to understand and analyze human language. Unlike former contests, however, the NAMCLO focuses on Computational Linguistics problems, in addition to general linguistic ones." In addition to contest information, the site offers resources such as:
START. "The START Natural Language System is a software system designed to answer questions that are posed to it in natural language. START parses incoming questions, matches the queries created from the parse trees against its knowledge base and presents the appropriate information segments to the user. In this way, START provides untrained users with speedy access to knowledge that in many cases would take an expert some time to find." Stanford NLP Group. "A distinguishing feature of the Stanford NLP Group is our effective combination of sophisticated and deep linguistic modeling and data analysis with innovative probabilistic and machine learning approaches to NLP."
The Turing Center: "a multidisciplinary research center at the University of Washington, investigating problems at the crossroads of natural language processing, data mining, Web search, and the Semantic Web. ... Our mission is to advance the philosophy, science, and technology of pan-lingual communication and collaboration among human and artificial agents." Xerox Research Centre Europe (XRCE) - Parsing & Semantics: "ParSem concentrates on automatically making sense of electronic documents, by semantically analyzing them. ParSem concentrates on two main research lines of natural language processing: robust parsing and semantics."
More Readings
Aikins, Janice, Rodney Brooks, William Clancey, et al. 1981. Natural Language Processing Systems. In The Handbook of Artificial Intelligence, Vol. I, ed. Barr, Avron and Edward A. Feigenbaum, 283-321. Stanford/Los Altos, CA: HeurisTech Press/William Kaufmann, Inc.
Allen, J. F. 1994. Natural Language Understanding. Redwood City, CA: Benjamin/Cummings. A new edition of a classic work.
Bobrow, Daniel. 1968. Natural Language Input for a Computer Problem Solving System. In Semantic Information Processing, ed. Minsky, Marvin, 133-215. Cambridge, MA: MIT Press.
Charniak, E. 1993. Statistical Language Learning. Cambridge, MA: MIT Press.
Cohen, P., J. Morgan, and M. Pollack. 1990. Intentions in Communication. Cambridge, MA: MIT Press.
Grosz, Barbara J., Martha E. Pollack, and Candace L. Sidner. 1989. Discourse. In Foundations of Cognitive Science, ed. Posner, M., 437-468. Cambridge, MA: MIT Press.
Grosz, Barbara J., Karen Sparck Jones, and Bonnie L. Webber, editors. 1986. Readings in Natural Language Processing. San Mateo, CA: Morgan Kaufmann.
Mahesh, Kavi, and Sergei Nirenburg. 1997. Knowledge-Based Systems for Natural Language. In The Computer Science and Engineering Handbook, ed. Allen B. Tucker, Jr., 637-653. Boca Raton, FL: CRC Press, Inc.
McKeown, K., and W. Swartout. 1987. Language Generation and Explanation. In Annual Review of Computer Science, Vol. 2, Palo Alto, CA: Annual Reviews.
Patterson, Dan W. 1990. Natural Language Processing. In Introduction to Artificial Intelligence and Expert Systems by Dan W. Patterson, 227-270. Englewood Cliffs, NJ: Prentice Hall.
Schank, Roger C. 1975. The Structure of Episodes in Memory. In Computation and Intelligence: Collected Readings, ed. Luger, George F., 236-259. Menlo Park/Cambridge, MA: AAAI Press/The MIT Press, 1995.
Weizenbaum, J. 1966. ELIZA--A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9 (1): 36-45. A pioneering work.
Winograd, T. 1972. Understanding Natural Language. New York: Academic Press. A pioneering work.
