Beyond Accuracy: The Role of Mental Models in Human-AI Team Performance
Decisions made by human-AI teams (e.g., AI-advised humans) are increasingly common in high-stakes domains such as healthcare, criminal justice, and finance. Achieving high team performance depends on more than just the accuracy of the AI system: since the human and the AI may have different expertise, the highest team performance is often reached when each knows how and when to complement the other. We focus on a factor that is crucial to supporting such complementarity: the human’s mental model of the AI’s capabilities, specifically the AI system’s error boundary (i.e., knowing “When does the AI err?”). Awareness of this boundary lets the human decide when to accept or override the AI’s recommendation. We highlight two key properties of an AI’s error boundary, parsimony and stochasticity, and a property of the task, dimensionality. We show experimentally how these properties affect humans’ mental models of AI capabilities and the resulting team performance. We connect our evaluations to related work and propose goals, beyond accuracy, that merit consideration during model selection and optimization to improve overall human-AI team performance.