On Evaluating Artificial Intelligence Systems for Medical Diagnosis
Among the difficulties in evaluating AI-type medical diagnosis systems are: the intermediate conclusions of the AI system need to be looked at in addition to the "final" answer; the "superhuman human" fallacy must be guarded against; and methods for estimating how the approach will scale upwards to larger domains are needed. We propose to measure both the accuracy of diagnosis and the structure of reasoning, the latter with a view to gauging how well the system will scale up.
Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.