AAAI Publications, Workshops at the Thirtieth AAAI Conference on Artificial Intelligence

Defining Human Values for Value Learners
Kaj Sotala

Last modified: 2016-03-29

Abstract


Hypothetical “value learning” AIs learn human values and then try to act according to those values. The design of such AIs, however, is hampered by the fact that there exists no satisfactory definition of what exactly human values are. After arguing that the standard concept of preference is insufficient as a definition, I draw on reinforcement learning theory, emotion research, and moral psychology to offer an alternative definition. In this definition, human values are conceptualized as mental representations that encode the brain’s value function (in the reinforcement learning sense) by being imbued with a context-sensitive affective gloss. I finish with a discussion of the implications that this hypothesis has for the design of value learners.

Keywords


machine ethics; value learning; reinforcement learning; human values
