Particle Swarm Based Reinforcement Learning for Path Planning and Traffic Congestion

University of Alabama Libraries

In 2019, the average American commuter lost approximately two and a half days to traffic delays. Researchers suggest that these delays could be relieved by intelligent transportation systems, such as navigation systems that identify multiple high-speed travel routes or traffic signals that adapt to changing traffic patterns. This dissertation explores the hybridization of a swarm intelligence algorithm, particle swarm optimization (PSO), with the reinforcement learning algorithm Q-learning and the hierarchical reinforcement learning algorithm MAXQ to produce an intelligent path-planning algorithm and an adaptive traffic control system. Combining these learning algorithms with PSO reduces the search space of a single agent through the parallelization and collaboration of multiple agents. Conversely, the use of a look-up table improves the performance of PSO by enhancing the swarm's ability to learn and to balance local and global search. To further improve the performance of the hybrid algorithms, a local PSO variant was incorporated into the algorithms' action selection policies. The result is two hybrid intelligent optimization algorithms: Q-learning with Local Particle Swarm Optimization (QLPSO) and MAXQ with Particle Swarm Optimization (MAXQPSO). When tasked with path planning in the Taxi World environment, QLPSO and MAXQPSO collectively learned the optimal policy in 46.44% fewer episodes than state-of-the-art algorithms and completed the task in 25.57% fewer steps. Given the success of the novel methods on the path planning problem, the two algorithms were slightly modified to identify optimal policies for the traffic control problem. Across various traffic networks, the algorithms collectively reduced the total wait time by an average of 16.31% and decreased the average wait time per vehicle by 11.43%. The combination of PSO and the learning algorithms demonstrates notable benefits for intelligent transportation systems.
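
For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of the standard textbook updates only: the tabular Q-learning rule, a ring-topology (local-best) neighbourhood lookup, and the canonical PSO velocity and position step. The abstract does not specify how the dissertation couples these pieces, so nothing here should be read as QLPSO or MAXQPSO themselves. The function names, the parameter values (`alpha`, `gamma`, `w`, `c1`, `c2`), and the convention that higher fitness is better are all assumptions made for illustration.

```python
import random

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a look-up table.

    `q` maps each state to a list of action values (assumed layout).
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]

def ring_best(personal_best, fitness, i):
    """Local-best PSO lookup with a ring topology (assumed neighbourhood).

    Returns the best personal-best among particle i and its two ring
    neighbours, assuming higher fitness is better.
    """
    n = len(personal_best)
    neighbours = [(i - 1) % n, i, (i + 1) % n]
    j = max(neighbours, key=lambda k: fitness[k])
    return personal_best[j]

def pso_step(x, v, pbest, lbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """Canonical PSO velocity and position update for one particle."""
    new_v = [
        w * vi + c1 * rng.random() * (pi - xi) + c2 * rng.random() * (li - xi)
        for xi, vi, pi, li in zip(x, v, pbest, lbest)
    ]
    new_x = [xi + nvi for xi, nvi in zip(x, new_v)]
    return new_x, new_v
```

In a hybrid of the kind the abstract describes, one plausible arrangement is for each particle to act as a learning agent whose look-up table is updated with `q_update`, while `ring_best` supplies the neighbourhood information used during action selection. That arrangement is a guess at the general shape, not a description of the dissertation's method.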

Electronic Thesis or Dissertation