P. S. Dutta, N. R. Jennings, and L. Moreau
Effective coordination of agents' actions in partially-observable domains is a major challenge of multi-agent systems research. To address this, many researchers have developed techniques that allow the agents to make decisions based on estimates of the states and actions of other agents that are typically learnt using some form of machine learning algorithm. Nevertheless, many of these approaches fail to provide an actual means by which the necessary information is made available so that the estimates can be learnt. To this end, we argue that cooperative communication of state information between agents is one such mechanism. However, in a dynamically changing environment, the accuracy and timeliness of this communicated information determine the fidelity of the learned estimates and the usefulness of the actions taken based on these. Given this, we propose a novel information-sharing protocol, post-task-completion sharing, for the distribution of state information. We then show, through a formal analysis, the improvement in the quality of estimates produced using our strategy over the widely used protocol of sharing information between nearest neighbours. Moreover, communication heuristics designed around our information-sharing principle are subjected to empirical evaluation along with other benchmark strategies (including Littman's Q-routing and Stone's TPOT-RL) in a simulated call-routing application. These studies, conducted across a range of environmental settings, show that, compared to the different benchmarks used, our strategy generates an improvement of up to 60% in the call connection rate; of more than 1000% in the ability to connect long-distance calls; and incurs as low as 0.25 of the message overhead.