Jiawei Han and YongJian Fu
Concept hierarchies organize data and concepts in hierarchical forms or in a certain partial order, which helps express knowledge and data relationships in databases in concise, high-level terms and thus plays an important role in knowledge discovery processes. Concept hierarchies can be provided by knowledge engineers, domain experts, or users, or they can be embedded in some data relations. However, it is sometimes desirable to generate concept hierarchies automatically or to adjust given hierarchies for particular learning tasks. In this paper, the issues of dynamic generation and refinement of concept hierarchies are studied. The study leads to algorithms for the automatic generation of concept hierarchies for numerical attributes based on data distributions, and for the dynamic refinement of a given or generated concept hierarchy based on a learning request, the relevant set of data, and database statistics. These algorithms have been implemented in the DBLearn knowledge discovery system and tested against large relational databases. The experimental results show that the algorithms are efficient and effective for knowledge discovery in large databases.