Learning Model Parameters for Decentralized Schedule-Driven Traffic Control
Model-based intersection optimization strategies that produce signal timings over a specified optimization horizon have been widely investigated for urban traffic signal control, and recent work in this area produced a scalable approach to real-time traffic control based on a decentralized schedule-driven optimization model. In this approach, a scheduling agent is associated with each intersection. Each agent senses the traffic approaching its intersection through local sensors and, in real time, constructs a schedule that minimizes the cumulative wait time of vehicles approaching the intersection over the current look-ahead horizon. Intersections then exchange schedule information with their neighbors to achieve network-level coordination. Although the approach is general and has demonstrated substantial success, its effectiveness in a given road network depends on the extent to which various parameters of the model, e.g., maximum green time, are adjusted to match that network's actual flow conditions over time. To address this problem, we propose a two-stage hierarchical structure that combines online planning and reinforcement learning. Reinforcement learning is applied to adjust the parameters of the model over a longer time-scale, while online planning is used to compute the schedule for managing the traffic signals in the shorter term. We demonstrate how this hybrid approach outperforms the original approach in real-time traffic signal control problems.
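The two time-scale structure described above can be illustrated with a toy sketch: an outer reinforcement-learning loop tunes a model parameter (here, the maximum green time) while an inner online planner builds a short-horizon schedule that minimizes cumulative wait. All function names, the greedy planner, the bandit-style learner, and the demand model below are illustrative assumptions, not the paper's actual formulation.

```python
import random

def plan_schedule(queues, max_green, horizon=4):
    """Toy online planner (assumed): at each step, serve the longest
    queue for up to max_green vehicles; return the cumulative wait
    accrued over the look-ahead horizon (the quantity to minimize)."""
    queues = list(queues)
    total_wait = 0
    for _ in range(horizon):
        total_wait += sum(queues)            # every queued vehicle waits one step
        busiest = queues.index(max(queues))
        queues[busiest] = max(0, queues[busiest] - max_green)
    return total_wait

def learn_max_green(candidates, episodes=200, seed=0):
    """Toy slow-time-scale learner (assumed): epsilon-greedy bandit over
    candidate max-green values; reward is the negative cumulative wait
    the planner achieves on randomly sampled demand."""
    rng = random.Random(seed)
    value = {c: 0.0 for c in candidates}
    count = {c: 0 for c in candidates}
    for _ in range(episodes):
        if rng.random() < 0.2:               # explore
            c = rng.choice(candidates)
        else:                                # exploit current estimate
            c = max(value, key=value.get)
        queues = [rng.randint(0, 10) for _ in range(4)]
        reward = -plan_schedule(queues, c)
        count[c] += 1
        value[c] += (reward - value[c]) / count[c]   # incremental mean
    return max(value, key=value.get)

best = learn_max_green([1, 2, 4, 8])
```

In this sketch the planner runs on every decision cycle, while the parameter update happens only once per episode, mirroring the separation between short-term scheduling and longer-term parameter adaptation.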