Michael S. Register and Anil Rewari
Canasta (crash analysis troubleshooting assistant) is a Digital proprietary knowledge-based system developed by the Artificial Intelligence Applications Group (AIAG) at Digital Equipment Corporation in collaboration with Digital’s customer support centers (CSCs). It is designed to assist computer support engineers at CSCs in analyzing operating system crashes, traditionally one of the most complex types of problems reported by customers. Digital started work on Canasta in January 1988. A version of Canasta that assists in the analysis of VMS operating system crashes has been successful: It is currently deployed at CSCs in over 20 countries and is used to resolve over 850 crash-related customer calls each month. In time savings alone, it is estimated to save Digital over 2 million dollars each year. Canasta’s success largely results from the innovative way in which it integrates different problem-solving modules that model the different types of problem-resolution strategies that experts use in this domain. These strategies include making quick, rule-based checks on whether the crash at hand has a known cause; using deeper, decision tree-based reasoning to resolve new types of crash problems; and checking for similarities among unresolved cases (a form of case-based generalization), which can lead to the identification of new hardware and software bugs. In fact, Canasta’s unresolved crash processor distinguishes it from other expert systems: It directly assists the expert in generating new knowledge about crash-causing bugs. Canasta also integrates different technologies that have not been combined before in this domain. It integrates a remote scripting package with rule-based inference to provide sophisticated automatic data collection, which allows it to gather data from a customer’s machine thousands of miles away. It uses a rule-based system for quick checks on known problems.
It uses a tool that allows experts to quickly encode troubleshooting knowledge graphically in the form of decision trees. It uses database technology to store case-related information that can be accessed later. Canasta also includes an innovative distributed knowledge maintenance system that automatically collects knowledge from experts at CSCs worldwide, validates it, and redistributes it to all other sites. This approach facilitates the sharing of knowledge across geographic sites.
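To make the three problem-resolution strategies concrete, the following is a minimal Python sketch of how quick rule-based checks, decision-tree analysis, and similarity checks over unresolved cases might be chained. It is an illustration under stated assumptions, not Canasta's actual implementation: the `Crash` fields, the rule and tree contents, the crash-type strings, and the grouping key used to spot recurring unresolved crashes are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Crash:
    # All three fields are hypothetical stand-ins for collected crash data.
    crash_type: str
    module: str
    os_version: str


# Strategy 1: quick checks. Each rule pairs a predicate with a known diagnosis.
# The rule content is invented for illustration.
RULES = [
    (lambda c: c.crash_type == "TYPE_A" and c.module == "DRIVER_X",
     "known problem: apply patch for DRIVER_X"),
]


def quick_check(crash: Crash) -> Optional[str]:
    """Return the diagnosis of the first matching rule, or None."""
    for matches, diagnosis in RULES:
        if matches(crash):
            return diagnosis
    return None


# Strategy 2: deeper analysis with a decision tree.
@dataclass
class Node:
    test: Optional[Callable[[Crash], bool]] = None  # None marks a leaf
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    diagnosis: Optional[str] = None  # a leaf with None means "unresolved"


# A one-question tree, invented for illustration.
TREE = Node(
    test=lambda c: c.crash_type == "TYPE_HW",
    yes=Node(diagnosis="suspect hardware: run diagnostics"),
    no=Node(),
)


def walk(node: Node, crash: Crash) -> Optional[str]:
    """Follow yes/no branches until a leaf is reached."""
    while node.test is not None:
        node = node.yes if node.test(crash) else node.no
    return node.diagnosis


# Strategy 3: look for similarities among unresolved cases.
def recurring_patterns(unresolved, threshold=2):
    """Return (crash_type, module) pairs seen at least `threshold` times."""
    counts = Counter((c.crash_type, c.module) for c in unresolved)
    return [key for key, n in counts.items() if n >= threshold]


def analyze(crash: Crash, unresolved: list) -> str:
    """Try quick checks, then the tree; store the case if both fail."""
    diagnosis = quick_check(crash) or walk(TREE, crash)
    if diagnosis is None:
        unresolved.append(crash)
        return "unresolved"
    return diagnosis
```

In this sketch, a crash that neither the rules nor the tree can explain is appended to the caller's `unresolved` list, and `recurring_patterns` then reports any crash signature that keeps reappearing, which an expert could investigate as a candidate new bug.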