John Elder, Daryl Pregibon
The quest to find models that usefully characterize data is central to the scientific method and has been carried out on many fronts. Researchers from an expanding number of fields have designed algorithms to discover rules or equations that capture key relationships between variables in a database. Some modern heuristic modeling approaches seem to have gained popularity partly as a way to "avoid statistics" while still addressing challenging induction tasks. Yet what may be called a "statistical viewpoint" has useful distinguishing features, and we review here some major advances in statistics from recent decades that are applicable to Knowledge Discovery in Databases.