Learning from Biased Data Using Mixture Models

Authors

A. J. Feelders

Abstract:

Databases sometimes contain a non-random sample from the population of interest. This complicates the use of extracted knowledge for predictive purposes. We consider a specific type of biased data that is of considerable practical interest, namely non-random partially classified data. This type of data typically results when some screening mechanism determines whether the correct class of a particular case is known. In credit scoring, the problem of learning from such a biased sample is called "reject inference," since the class label (e.g., good or bad loan) of rejected loan applications is unknown. We show that maximum likelihood estimation of so-called mixture models is appropriate for this type of data, and discuss an experiment performed on simulated data using mixtures of normal components. The benefits of this approach are shown by comparing it with the results of sample-based discriminant analysis. Some directions are given on how to extend the analysis to allow for non-normal components and missing attribute values, in order to make it suitable for "real-life" biased data.
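The idea of fitting a normal mixture to partially classified data can be illustrated with a minimal sketch. It is not the paper's experiment. It assumes a single attribute, two classes, and a hypothetical screening rule under which the class label is observed only when the attribute exceeds 1.5. The function names, the threshold, and the simulated parameters are all illustrative assumptions. The fit uses the EM algorithm, with class memberships held fixed for labelled cases and estimated for unlabelled ones.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(xs, labels, n_iter=100):
    """EM for a two-component normal mixture on partially classified data.

    labels[i] is 0 or 1 when the class is known and None when it is not.
    Assumes at least one labelled case per class.
    """
    # resp[i] approximates P(class 1 | x_i); known labels are never updated.
    resp = [0.5 if c is None else float(c) for c in labels]
    n = len(xs)
    for _ in range(n_iter):
        # M-step: weighted estimates of prior, means and variances.
        n1 = sum(resp)
        n0 = n - n1
        prior1 = n1 / n
        mean1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mean0 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n0
        var1 = sum(r * (x - mean1) ** 2 for r, x in zip(resp, xs)) / n1
        var0 = sum((1.0 - r) * (x - mean0) ** 2 for r, x in zip(resp, xs)) / n0
        # E-step: update memberships of the unclassified cases only.
        for i, (x, c) in enumerate(zip(xs, labels)):
            if c is None:
                d1 = prior1 * normal_pdf(x, mean1, var1)
                d0 = (1.0 - prior1) * normal_pdf(x, mean0, var0)
                resp[i] = d1 / (d0 + d1)
    return {"prior1": prior1, "mean0": mean0, "mean1": mean1,
            "var0": var0, "var1": var1}


# Simulated population (assumed): class 0 ~ N(0, 1), class 1 ~ N(3, 1).
random.seed(0)
true_class = [1 if random.random() < 0.5 else 0 for _ in range(2000)]
xs = [random.gauss(3.0 if c == 1 else 0.0, 1.0) for c in true_class]

# Hypothetical screening rule: the label is recorded only when x > 1.5.
labels = [c if x > 1.5 else None for x, c in zip(xs, true_class)]

# Estimate from classified cases only, versus the mixture fit on all cases.
labelled0 = [x for x, c in zip(xs, labels) if c == 0]
naive_mean0 = sum(labelled0) / len(labelled0)
estimate = fit_mixture(xs, labels)

print("class 0 mean, classified cases only:", round(naive_mean0, 2))
print("class 0 mean, mixture fit:", round(estimate["mean0"], 2))
```

In this sketch every classified class 0 case lies above the screening threshold, so the mean computed from classified cases alone is necessarily greater than 1.5, far from the simulated value of 0. The mixture fit also uses the unclassified cases. Because the assumed screening rule depends only on the observed attribute, the fit can be expected to land much closer to the simulated parameters.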
