Privacy Preserving Planning in Stochastic Environments
Collaborative privacy-preserving planning (CPPP) has attracted considerable attention over the past decade. To date, CPPP research has focused on domains with deterministic action effects. In this paper, we extend CPPP to domains with stochastic action effects. We show how such environments can be modeled as a Markov decision process (MDP). We then focus on the popular Real-Time Dynamic Programming (RTDP) algorithm for computing value functions for MDPs, and extend it to the stochastic CPPP setting. We provide two versions of RTDP: a complete version that is identical to executing centralized RTDP, and an approximate version that sends significantly fewer messages and computes competitive policies in practice. We experiment on domains adapted from the deterministic CPPP literature.
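For readers unfamiliar with RTDP, the following is a minimal sketch of the standard centralized, single-agent algorithm on a toy stochastic shortest-path MDP. It is not the privacy-preserving, multi-agent variant developed in this paper. The chain domain, the action names `slow` and `fast`, their probabilities, and all function names are hypothetical and chosen only for illustration.

```python
import random

GOAL = 3  # states are 0..GOAL; reaching GOAL ends a trial

# Hypothetical toy actions: name -> (success probability, step size).
# Every action costs 1; on failure the agent stays in place.
ACTIONS = {"slow": (0.9, 1), "fast": (0.5, 2)}


def outcomes(state, action):
    """Return [(probability, next_state), ...] for a stochastic action."""
    p, step = ACTIONS[action]
    return [(p, min(state + step, GOAL)), (1.0 - p, state)]


def q_value(V, state, action):
    """Expected cost of taking `action` in `state`, then following V."""
    return 1.0 + sum(p * V[s2] for p, s2 in outcomes(state, action))


def rtdp(trials=2000, max_steps=100, seed=0):
    """Run RTDP trials from state 0 and return the value estimates."""
    rng = random.Random(seed)
    # Zero never overestimates a nonnegative cost-to-go, so it is an
    # admissible initial value function.
    V = [0.0] * (GOAL + 1)
    for _ in range(trials):
        s = 0
        for _ in range(max_steps):  # safeguard against very long trials
            if s == GOAL:
                break
            # Greedy action with respect to the current estimates.
            a = min(ACTIONS, key=lambda act: q_value(V, s, act))
            # Bellman backup on the visited state only.
            V[s] = q_value(V, s, a)
            # Sample the stochastic effect of the chosen action.
            probs, succs = zip(*outcomes(s, a))
            s = rng.choices(succs, weights=probs)[0]
    return V


if __name__ == "__main__":
    print(rtdp())
```

In this toy model, solving the Bellman equations by hand gives an optimal expected cost of 28/9 from state 0, so the estimate for state 0 should approach that value from below as the number of trials grows. Only states visited along greedy trajectories are updated, which is the property that distinguishes RTDP from full value iteration.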