Oren Etzioni and Daniel Weld
Even before the advent of Artificial Intelligence, science fiction writer Isaac Asimov recognized that a robot must place the protection of humans from harm at a higher priority than obeying human orders. Inspired by Asimov, we pose the following fundamental questions: (1) How should one formalize the rich, but informal, notion of "harm"? (2) How can an agent avoid performing harmful actions, and do so in a computationally tractable manner? (3) How should an agent resolve conflict between its goals and the need to avoid harm? (4) When should an agent prevent a human from harming herself? While we address some of these questions in technical detail, the primary goal of this paper is to focus attention on Asimov’s concern: society will reject autonomous agents unless we have some credible means of making them safe!