Simulated annealing based on deep reinforcement learning for solving the two-echelon vehicle routing problem

XU Haixin; HU Rong; QIAN Bin

doi:10.7540/j.ynu.20240263

XU Haixin, HU Rong, QIAN Bin. Simulated annealing based on deep reinforcement learning for solving the two-echelon vehicle routing problem[J]. Journal of Yunnan University: Natural Sciences Edition. DOI: 10.7540/j.ynu.20240263

Abstract


A simulated annealing algorithm based on deep reinforcement learning (SADRL) is proposed to solve the two-echelon vehicle routing problem (2E-VRP), which is common in practical logistics, with the objective of minimizing the total route length. 2E-VRP consists of two coupled sub-stages: a customer-satellite allocation stage and a delivery route planning stage. Because different customer-satellite allocation schemes affect the optimization of the subsequent delivery routes, the solution space of 2E-VRP is large and complex. To address this characteristic, SADRL proceeds in three steps. First, a key-value encoding and decoding scheme is designed for the customer-satellite allocation problem, and the simulated annealing (SA) algorithm is used to solve it, so that 2E-VRP can be decomposed into multiple VRP subproblems. Second, based on the decomposition scheme, the attention model-VRP (AM-VRP) trained by reinforcement learning is used to obtain high-quality delivery routes for each VRP subproblem; this quickly evaluates the quality of the decomposition scheme, reduces the complexity of the problem, and guides the algorithm to explore high-quality regions of the complex solution space efficiently. Finally, for the decomposed VRP subproblems, a variable neighborhood descent with destruction/reconstruction operations (VND-DRO) algorithm is designed to further optimize their delivery routes, performing a deeper and finer search of the high-quality regions. Experiments on datasets of different scales confirm the effectiveness of the proposed SADRL in solving 2E-VRP.
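The allocate-then-route idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' SADRL implementation. It assumes that the key-value encoding is a plain mapping from customer index to satellite index, it replaces the trained AM-VRP with a greedy nearest-neighbor tour as a cheap route evaluator, and it ignores vehicle capacities, fleet sizes, and the VND-DRO refinement step. All function names and parameters (`anneal_allocation`, `t0`, `alpha`, `iters`) are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_neighbor_tour_length(start, points):
    """Length of a greedy closed tour from `start` through `points`.

    Assumption: this is only a stand-in for the learned AM-VRP evaluator.
    """
    remaining = list(points)
    current, total = start, 0.0
    while remaining:
        nxt = min(remaining, key=lambda p: dist(current, p))
        total += dist(current, nxt)
        remaining.remove(nxt)
        current = nxt
    return total + dist(current, start)


def total_cost(allocation, depot, satellites, customers):
    """Total route length for one allocation (customer index -> satellite index)."""
    groups = {}
    for cust, sat in allocation.items():
        groups.setdefault(sat, []).append(customers[cust])
    # Second echelon: one tour per satellite that serves at least one customer.
    second = sum(
        nearest_neighbor_tour_length(satellites[s], pts) for s, pts in groups.items()
    )
    # First echelon: one tour from the depot through the satellites in use.
    first = nearest_neighbor_tour_length(depot, [satellites[s] for s in groups])
    return first + second


def anneal_allocation(depot, satellites, customers,
                      t0=10.0, alpha=0.995, iters=2000, seed=0):
    """Simulated annealing over customer-satellite allocations."""
    rng = random.Random(seed)
    current = {c: rng.randrange(len(satellites)) for c in range(len(customers))}
    current_cost = total_cost(current, depot, satellites, customers)
    best, best_cost = dict(current), current_cost
    t = t0
    for _ in range(iters):
        # Neighbor move: reassign one randomly chosen customer.
        cand = dict(current)
        cand[rng.randrange(len(customers))] = rng.randrange(len(satellites))
        cand_cost = total_cost(cand, depot, satellites, customers)
        delta = cand_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
        t = max(t * alpha, 1e-9)  # geometric cooling with a small floor
    return best, best_cost


# Small illustrative instance (made-up coordinates).
depot = (0.0, 0.0)
satellites = [(-5.0, 0.0), (5.0, 0.0)]
customers = [(-6.0, 1.0), (-4.0, -1.0), (-5.0, 2.0), (6.0, 1.0), (4.0, -1.0), (5.0, 2.0)]
allocation, cost = anneal_allocation(depot, satellites, customers)
print(allocation, round(cost, 3))
```

In this sketch each candidate allocation is scored by immediately routing every induced subproblem, which mirrors the role the abstract assigns to AM-VRP: a fast evaluator that lets the outer search compare decomposition schemes. A learned model or a stronger local search could be swapped in behind `nearest_neighbor_tour_length` without changing the annealing loop.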
