Integrating and Mining Distributed Customer Databases

Ira Haimowitz, Ozden Gur-Ali, Henry Schwarz

Large corporations often have different subunits that share common customers, yielding distributed customer databases. Corporate risk and marketing functions seek areas of unusually high risk, or areas suited to targeted marketing. We present a three-phase process to solve this problem. First, we merge the distributed databases using decision tree induction into a database of unique customers, labeled by location, industry code, and financial parameters. Second, we reduce the customer table to three explanatory business factors and various outcome measures. An ANOVA model identifies outstanding effects and outliers. By incorporating both main and interaction effects, this approach identifies outliers that are more likely to be interesting than those found using only main effects. Third, we display the aberrations as peaks or valleys so that a user can isolate opportunities. This framework approximates an interestingness filter.
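
The second phase can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a balanced cube with one outcome value per combination of three factors, and the names fit_anova and top_aberrations, the cube dimensions, and the synthetic data are all hypothetical. The sketch fits the outcome once with main effects only and once with main effects plus two-way interactions, then ranks cells by absolute residual. A systematic region-by-industry pattern shows up as an "outlier" under the main-effects fit but is absorbed once interactions are modeled, which leaves only the isolated cell standing out.

```python
import numpy as np

def fit_anova(y, with_interactions=True):
    """Fitted values for a balanced three-factor cube y of shape (I, J, K)."""
    mu = y.mean()
    # Main effects: marginal means minus the grand mean.
    a = y.mean(axis=(1, 2)) - mu
    b = y.mean(axis=(0, 2)) - mu
    c = y.mean(axis=(0, 1)) - mu
    fitted = mu + a[:, None, None] + b[None, :, None] + c[None, None, :]
    if with_interactions:
        # Two-way interactions: two-factor means minus grand mean and main effects.
        ab = y.mean(axis=2) - mu - a[:, None] - b[None, :]
        ac = y.mean(axis=1) - mu - a[:, None] - c[None, :]
        bc = y.mean(axis=0) - mu - b[:, None] - c[None, :]
        fitted = fitted + ab[:, :, None] + ac[:, None, :] + bc[None, :, :]
    return fitted

def top_aberrations(resid, n=3):
    """Return the n cells with the largest absolute residual as (index, residual)."""
    order = np.argsort(-np.abs(resid), axis=None)[:n]
    out = []
    for flat in order:
        idx = tuple(int(v) for v in np.unravel_index(flat, resid.shape))
        out.append((idx, float(resid[idx])))
    return out

# Synthetic data (assumed): 4 regions x 5 industries x 3 size bands.
rng = np.random.default_rng(0)
I, J, K = 4, 5, 3
y = (10.0
     + np.arange(I)[:, None, None] * 1.0        # region main effect
     + np.arange(J)[None, :, None] * 0.5        # industry main effect
     + np.arange(K)[None, None, :] * 2.0        # size main effect
     + rng.normal(0.0, 0.1, size=(I, J, K)))    # small noise
y[1, 2, :] += 5.0   # systematic region-by-industry interaction
y[3, 0, 1] += 10.0  # a single aberrant cell

resid_main = y - fit_anova(y, with_interactions=False)
resid_int = y - fit_anova(y, with_interactions=True)

print("main effects only:", top_aberrations(resid_main))
print("with interactions:", top_aberrations(resid_int))
```

Positive residuals correspond to peaks and negative residuals to valleys. Note that a single aberrant cell also shifts the estimated effects and leaks smaller residuals into neighboring cells, so ranking by absolute residual is only a rough screening step here.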
