Bruno N. da Silva
We study the problem of learning from disagreeing demonstrators. We present a model that suggests how an incentive-compatible mechanism could be designed to combine demonstrations from human agents who disagree on how the demonstrated task should be evaluated. Apart from comonotonicity of preferences over atomic outcomes, we make no assumptions about our demonstrators' preferences. We then suggest that a reputation mechanism is sufficient to elicit cooperative behavior from otherwise competitive human agents.