Deep Reinforcement Learning for Crowdsourced Urban Delivery
Tanvir Ahamed, Bo Zou, Nahid Parvez Farazi and Theja Tulabandhula
Transportation Research Part B: Methodological, 2021, vol. 152, issue C, 227-257
Abstract:
This paper investigates the problem of assigning shipping requests to ad hoc couriers in the context of crowdsourced urban delivery. The shipping requests are spatially distributed, each with a limited time window between the earliest time for pickup and the latest time for delivery. The ad hoc couriers, termed crowdsourcees, also have limited time availability and carrying capacity. We propose a new deep reinforcement learning (DRL)-based approach to tackling this assignment problem. A deep Q network (DQN) algorithm is trained that incorporates two salient features, experience replay and a target network, which enhance the efficiency, convergence, and stability of DRL training. More importantly, this paper makes three methodological contributions: 1) presenting a comprehensive and novel characterization of crowdshipping system states that encompasses spatial-temporal and capacity information of crowdsourcees and requests; 2) embedding heuristics that leverage the information offered by the state representation and are based on intuitive reasoning to guide which actions to take, preserving tractability and enhancing the efficiency of training; and 3) integrating rule-interposing to prevent repeated visits to the same routes and node sequences during routing improvement, thereby further enhancing training efficiency by accelerating learning. The computational complexities of the heuristics and the overall DQN training are investigated. The effectiveness of the proposed approach is demonstrated through extensive numerical analysis. The results show the benefits brought by heuristics-guided action choice, rule-interposing, and having time-related information in the state space during DRL training; the near-optimality of the solutions obtained; and the superiority of the proposed approach over existing methods in terms of solution quality, computation time, and scalability.
Keywords: Crowdshipping; deep reinforcement learning; deep Q network; pickup and delivery; state representation; heuristics-guided action choice; rule-interposing
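The abstract names the standard DQN ingredients (experience replay and a target network) together with a heuristics-guided, feasibility-constrained action choice. The sketch below is only a generic illustration of how these pieces typically fit together, not the authors' implementation; the QNetwork architecture, the feasible-action mask, the transition format, and all hyperparameters are assumptions made for illustration.

```python
# Minimal DQN sketch (PyTorch): experience replay, a target network, and an
# epsilon-greedy action choice restricted to heuristically feasible actions.
# All names, sizes, and interfaces here are illustrative assumptions.

import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F


class QNetwork(nn.Module):
    """Maps a crowdshipping state vector to one Q-value per candidate assignment action."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        return self.net(state)


def select_action(q_net, state, feasible_mask, epsilon):
    """Epsilon-greedy choice over actions a heuristic has flagged as feasible
    (e.g., assignments respecting time windows and courier capacity)."""
    feasible = [a for a, ok in enumerate(feasible_mask) if ok]
    if random.random() < epsilon:
        return random.choice(feasible)
    with torch.no_grad():
        q = q_net(state.unsqueeze(0)).squeeze(0)
    q[~torch.tensor(feasible_mask)] = float("-inf")  # mask infeasible actions
    return int(q.argmax().item())


def train_step(policy, target, buffer, optimizer, batch_size=64, gamma=0.99):
    """One gradient step on a minibatch sampled uniformly from the replay buffer."""
    if len(buffer) < batch_size:
        return
    batch = random.sample(buffer, batch_size)
    states, actions, rewards, next_states, dones = zip(*batch)
    s = torch.stack(states)
    s2 = torch.stack(next_states)
    a = torch.tensor(actions)
    r = torch.tensor(rewards)
    done = torch.tensor(dones, dtype=torch.float32)

    # Q(s, a) from the policy network; bootstrapped target from the target network.
    q_sa = policy(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target_q = r + gamma * (1 - done) * target(s2).max(dim=1).values

    loss = F.smooth_l1_loss(q_sa, target_q)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In this sketch, transitions would be pushed into a replay buffer such as `deque(maxlen=100_000)` as (state, action, reward, next_state, done) tuples, and the target network would be resynchronized periodically, e.g. `target.load_state_dict(policy.state_dict())`, consistent with the experience-replay and target-network features the abstract highlights.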
Date: 2021
Full text (ScienceDirect subscribers only): http://www.sciencedirect.com/science/article/pii/S0191261521001636
Persistent link: https://EconPapers.repec.org/RePEc:eee:transb:v:152:y:2021:i:c:p:227-257
DOI: 10.1016/j.trb.2021.08.015