Richard Maclin, Edward Wild, Jude Shavlik, Lisa Torrey, Trevor Walker
Knowledge-based classification and regression methods are especially powerful forms of learning. They allow a system to take advantage of prior domain knowledge supplied either by a human user or another algorithm, combining that knowledge with data to produce accurate models. A limitation of the use of prior knowledge occurs when the provided knowledge is incorrect. Such knowledge likely still contains useful information, but knowledge-based learners might not be able to fully exploit such information. In fact, incorrect knowledge can lead to poorer models than result from knowledge-free learners. We present a support-vector method for incorporating and refining domain knowledge that not only allows the learner to make use of that knowledge, but also suggests changes to the provided knowledge. Our approach is built on the knowledge-based classification and regression methods presented by Fung, Mangasarian, and Shavlik (2002; 2003) and by Mangasarian, Shavlik, and Wild (2004). Experiments on artificial data sets with known properties, as well as on a real-world data set, demonstrate that our method learns more accurate models while also adjusting the provided rules in intuitive ways. Our new algorithm provides an appealing extension to knowledge-based, support-vector learning that is not only able to combine knowledge from rules with data, but is also able to use the data to modify and change those rules to better fit the data.
Subjects: 12. Machine Learning and Discovery; Please choose a second document classification
Submitted: Apr 24, 2007