Kunikazu Yoda, Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama
We address the problem of finding useful regions for two-dimensional association rules and decision trees. In a previous paper we presented efficient algorithms for computing optimized x-monotone regions, whose intersections with any vertical line are always undivided. In practice, however, the quality of x-monotone regions is not ideal, because the boundary of an x-monotone region tends to be notchy, and the region is likely to overfit a training dataset too much to give a good prediction for an unseen test dataset. In this paper we instead propose the use of a recti-linear region whose intersection with any vertical line and whose intersection with any horizontal line are both undivided, so that the boundary of any rectilinear region is never notchy. This property is studied from a theoretical viewpoint. Experimental tests confirm that the rectilinear region less overfits a training database and therefore provides a better prediction for unseen test data. We also present a novel efficient algorithm for computing optimized rectilinear regions.