Pattern-Based Clustering for Database Attribute Values

Matthew Merzbaehter and Wesley W. Chu

We present a method for automatically clustering similar attribute values in a database system spanning mulitple domains. The method constructs an aftribute abstraction hierarchy for each attribute using rules that are derived from the database instance. The rules have a confidence and popularity that combine to express the "usefullness" of the rule. Attribute values are clustered if they are used as the premise for rules with the same consequence. By iteratively applying the algorithm, a hierarchy of clusters can be found. The algorithm can be improved by allowing domain expert supervision during the clustering process. An example as well as experimental results from a large transportation database are included.


This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.