An Approach to Evaluate AI Commonsense Reasoning Systems
Stellan Ohlsson, Robert H. Sloan, Gyorgy Turan, Daniel Uber, Aaron Urasky

We propose and give a preliminary test of a new metric for the quality of the commonsense knowledge and reasoning of large AI databases: using the same measurement as is used for a four-year-old, namely, an IQ test for young children. We report results obtained by testing the ConceptNet system on questions we wrote in the spirit of the questions of the Wechsler Preschool and Primary Scale of Intelligence, Third Edition (WPPSI-III). The results were, on the whole, quite strong.


commonsense reasoning; psychometrics; IQ testing; AI systems

