Jan M. Zytkow
Research on knowledge discovery in databases (KDD) has been impeded by a limited vision of knowledge, inherited from machine learning (ML) and other branches of computer science. In contrast with KDD and ML, research on the automation of scientific discovery (SD) took a broader perspective on knowledge from the natural sciences. We analyze the typical ML view of discovery as supervised and unsupervised classification, the former viewed as concept learning and the latter as clustering and the formation of concept hierarchies. We suggest a number of steps that lead beyond concept definitions, towards more meaningful knowledge. We argue that a narrow view of knowledge is accompanied by a narrow view of the discovery method. Systems that learn concepts, find clusters, or build taxonomies stay on a single task even if the results are poor, whereas an autonomous discoverer should be able to conclude that a given hypothesis space does not match the data and move the search to other spaces. As an example, we consider taxonomy formation that results in a reasoned choice between no taxonomy, one taxonomy, and several taxonomies. Finally, we briefly argue that SD can provide KDD with a broader vision of knowledge and of the discovery method.