NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98123
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 李綱 [zh_TW]
dc.contributor.advisor: Kang Li [en]
dc.contributor.author: 姚智元 [zh_TW]
dc.contributor.author: Chih-Yuan Yao [en]
dc.date.accessioned: 2025-07-29T16:07:54Z
dc.date.available: 2025-07-30
dc.date.copyright: 2025-07-28
dc.date.issued: 2025
dc.date.submitted: 2025-07-23
dc.identifier.citationD. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
B. M. Baker and M. Ayechew. A genetic algorithm for the vehicle routing problem. Computers & Operations Research, 30(5):787–800, 2003.
R. Baldacci, A. Mingozzi, and R. Roberti. New route relaxation and pricing strategies for the vehicle routing problem. Operations Research, 59:1269–1283, 2011.
G. Barbarosoglu and D. Ozgur. A tabu search algorithm for the vehicle routing problem. Computers & Operations Research, 26(3):255–270, 1999.
I. Bello, H. Pham, Q. V. Le, M. Norouzi, and S. Bengio. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940, 2016.
G. Bono, J. S. Dibangoye, O. Simonin, L. Matignon, and F. Pereyron. Solving multi-agent routing problems using deep attention mechanisms. IEEE Transactions on Intelligent Transportation Systems, 22(12):7804–7813, 2020.
G. Clarke and J. W. Wright. Scheduling of vehicles from a central depot to a number of delivery points. Operations Research, 12(4):568–581, 1964.
G. B. Dantzig and J. H. Ramser. The truck dispatching problem. Management Science, 6:80–91, 1959.
V. Furnon and L. Perron. OR-Tools routing library.
B. E. Gillett and L. R. Miller. A heuristic algorithm for the vehicle-dispatch problem. Operations Research, 22(2):340–349, 1974.
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
A. Hottung and K. Tierney. Neural large neighborhood search for the capacitated vehicle routing problem. In ECAI 2020, pages 443–450. IOS Press, 2020.
Y. Hu, Y. Yao, and W. S. Lee. A reinforcement learning approach for optimizing multiple traveling salesman problems over graphs. Knowledge-Based Systems, 204:106244, 2020.
E. Khalil, H. Dai, Y. Zhang, B. Dilkina, and L. Song. Learning combinatorial optimization algorithms over graphs. Advances in Neural Information Processing Systems, 30, 2017.
T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
W. Kool, H. van Hoof, and M. Welling. Attention, learn to solve routing problems! In International Conference on Learning Representations, 2018.
Y.-D. Kwon, J. Choo, B. Kim, I. Yoon, Y. Gwon, and S. Min. POMO: Policy optimization with multiple optima for reinforcement learning. Advances in Neural Information Processing Systems, 33:21188–21198, 2020.
G. Laporte. The vehicle routing problem: An overview of exact and approximate algorithms. European Journal of Operational Research, 59:345–358, 1992.
B. Li, G. Wu, Y. He, M. Fan, and W. Pedrycz. An overview and experimental study of learning-based optimization algorithms for the vehicle routing problem. IEEE/CAA Journal of Automatica Sinica, 9(7):1115–1138, 2022.
J. Li, Y. Ma, R. Gao, Z. Cao, A. Lim, W. Song, and J. Zhang. Deep reinforcement learning for solving the heterogeneous capacitated vehicle routing problem. IEEE Transactions on Cybernetics, 52(12):13572–13585, 2022.
I. Loshchilov and F. Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
C. Morris, M. Ritzert, M. Fey, W. L. Hamilton, J. E. Lenssen, G. Rattan, and M. Grohe. Weisfeiler and Leman go neural: Higher-order graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4602–4609, 2019.
M. Nazari, A. Oroojlooy, L. Snyder, and M. Takác. Reinforcement learning for solving the vehicle routing problem. Advances in Neural Information Processing Systems, 31, 2018.
M. Tang, W. Zhuang, B. Li, H. Liu, Z. Song, and G. Yin. Energy-optimal routing for electric vehicles using deep reinforcement learning with transformer. Applied Energy, 350:121711, 2023.
P. Toth and D. Vigo. The vehicle routing problem. SIAM, 2002.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
O. Vinyals, M. Fortunato, and N. Jaitly. Pointer networks. Advances in Neural Information Processing Systems, 28, 2015.
R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229–256, 1992.
A. Wren and A. Holliday. Computer scheduling of vehicles from one or more depots to a number of delivery points. Journal of the Operational Research Society, 23(3):333–344, 1972.
L. Xin, W. Song, Z. Cao, and J. Zhang. Multi-decoder attention model with embedding glimpse for solving vehicle routing problems. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 12042–12049, 2021.
K. Xu, C. Li, Y. Tian, T. Sonobe, K.-i. Kawarabayashi, and S. Jegelka. Representation learning on graphs with jumping knowledge networks. In International Conference on Machine Learning, pages 5453–5462. PMLR, 2018.
Y. Xu, M. Fang, L. Chen, G. Xu, Y. Du, and C. Zhang. Reinforcement learning with multiple relational attention for solving vehicle routing problems. IEEE Transactions on Cybernetics, 52(10):11107–11120, 2021.
C. Zhang, Y. Wu, Y. Ma, W. Song, Z. Le, Z. Cao, and J. Zhang. A review on learning to solve combinatorial optimisation problems in manufacturing. IET Collaborative Intelligent Manufacturing, 5(1):e12072, 2023.
K. Zhang, F. He, Z. Zhang, X. Lin, and M. Li. Multi-vehicle routing problems with soft time windows: A multi-agent reinforcement learning approach. Transportation Research Part C: Emerging Technologies, 121:102861, 2020.
Z. Zhang, G. Qi, and W. Guan. Coordinated multi-agent hierarchical deep reinforcement learning to solve multi-trip vehicle routing problems with soft time windows. IET Intelligent Transport Systems, 17(10):2034–2051, 2023.
李明峰. Deep reinforcement learning for multi-vehicle route planning: A scalable framework based on state encoding and ordered construction. Master's thesis, National Taiwan University, Taipei, 2023.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98123
dc.description.abstract: This study models the vehicle routing problem as a Markov decision process (MDP) and trains the model with multi-agent reinforcement learning (MARL) using a shared policy and a centralized reward. To keep the model from over-relying on the initial environment state, features of the current environment are extracted as input at every decision step. At execution time, all vehicles decide in parallel within the same time step, each choosing its next target node. To handle multiple vehicles competing for the same node, a masking mechanism is introduced to eliminate conflicting actions, and an improved training-baseline design is further incorporated to strengthen training and inference.
Numerical experiments are conducted on the heterogeneous vehicle routing problem (HVRP). The model is first trained at a scale of 50 nodes and 8 vehicles, and then trained with mixed batches over three scales in total, adding instances with 25 nodes and 4 vehicles and with 75 nodes and 12 vehicles. Trading performance for computational efficiency, the experiments show that the model additionally trained with mixed batches performs better: at the 25-node, 4-vehicle scale a 3.4% performance loss yields roughly a 45.5% reduction in computation time; at the 50-node, 8-vehicle scale a 6.79% loss yields roughly 54.3%; and at the 75-node, 12-vehicle scale a 12.97% loss yields roughly 61%, with better performance on smaller problems. [zh_TW]
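The mixed-batch training described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's actual pipeline: it assumes each training step draws one (nodes, vehicles) scale uniformly at random and that every instance in a batch shares that scale; the names `SCALES` and `mixed_batches` and the random-coordinate instance generator are hypothetical stand-ins.

```python
import random

# Hypothetical constant: the three (num_nodes, num_vehicles) scales
# mentioned in the abstract.
SCALES = [(25, 4), (50, 8), (75, 12)]

def mixed_batches(num_steps, batch_size=4, seed=0):
    """Yield training batches whose problem scale varies step to step.

    Each step samples one scale from SCALES; all instances in the
    batch share it, so per-batch tensors stay rectangular. Instances
    here are just random customer coordinates in the unit square --
    a stand-in for real HVRP instance generation.
    """
    rng = random.Random(seed)
    for _ in range(num_steps):
        nodes, vehicles = rng.choice(SCALES)
        batch = [
            [(rng.random(), rng.random()) for _ in range(nodes)]
            for _ in range(batch_size)
        ]
        yield (nodes, vehicles), batch

# Example: inspect the scales drawn for a few steps.
for (nodes, vehicles), batch in mixed_batches(num_steps=3):
    print(nodes, vehicles, len(batch), len(batch[0]))
```

Keeping one scale per batch keeps tensors rectangular, while varying the scale across steps exposes a single shared policy to all three problem sizes.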
dc.description.abstract: This study formulates the vehicle routing problem within a Markov decision process (MDP) framework and leverages multi-agent reinforcement learning (MARL) with a shared policy and centralized rewards for model training. To prevent the model from relying on the initial environment state, features of the current environment are extracted at each decision step. During inference, all vehicles act in parallel at each time step, each selecting its next target node. To address conflicts arising from multiple vehicles competing for the same node, a masking mechanism is incorporated to suppress invalid actions. Furthermore, an enhanced training baseline is designed to improve training and inference results.
Numerical experiments are conducted on the heterogeneous vehicle routing problem (HVRP). The model is initially trained on instances with 50 nodes and 8 vehicles, and subsequently refined using mix-batch training across three problem scales: 25 nodes with 4 vehicles, 50 nodes with 8 vehicles, and 75 nodes with 12 vehicles. The experiments show that the model trained with mixed batches outperforms the model trained without them. By trading performance for computational efficiency, the approach achieves roughly a 45.5% reduction in computation time for a 3.4% performance loss at the 25-node, 4-vehicle scale; a 54.3% reduction for a 6.79% loss at the 50-node, 8-vehicle scale; and a 61% reduction for a 12.97% loss at the 75-node, 12-vehicle scale. The method also performs better on smaller-scale problems. [en]
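A conflict-masking step of the kind the abstract describes could be realized as in the following sketch. It is an illustrative assumption, not the thesis's implementation: conflicts within a time step are resolved by a fixed vehicle priority order, node 0 is taken to be a depot that vehicles may share, and `step_with_masking` is a hypothetical name.

```python
def step_with_masking(logits, visited, depot=0):
    """Select one next node per vehicle within a single time step.

    logits  -- one list of scores per vehicle, indexed by node
    visited -- boolean list; True marks nodes no longer available
    Conflicts are removed by masking: once a vehicle claims a
    customer node, that node is masked for the remaining vehicles.
    """
    mask = list(visited)
    actions = []
    for scores in logits:  # vehicles decide in a fixed priority order
        best = max(
            (j for j in range(len(scores)) if not mask[j]),
            key=lambda j: scores[j],
        )
        actions.append(best)
        if best != depot:  # the depot stays available to every vehicle
            mask[best] = True
    return actions

# Two vehicles both score node 1 highest; masking forces the
# lower-priority vehicle onto node 2 instead.
print(step_with_masking([[0.0, 5.0, 1.0], [0.0, 4.0, 3.0]],
                        [False, False, False]))  # → [1, 2]
```

In a learned model the mask would typically be applied to the logits before the softmax, so conflicting actions receive zero probability rather than being filtered after sampling.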
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-07-29T16:07:54Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2025-07-29T16:07:54Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Acknowledgements i
Abstract (Chinese) ii
Abstract iii
Contents v
List of Figures viii
List of Tables x
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Motivation and Objectives 2
1.3 Contributions 3
Chapter 2 Literature Review 4
2.1 Vehicle Routing Problem 4
2.1.1 Definition and Mathematical Model 4
2.1.2 Solution Methods 6
2.1.3 Learning-Based End-to-End Methods 7
2.2 Multi-Agent Reinforcement Learning 10
2.2.1 Reinforcement Learning 10
2.2.2 Policy Gradient with Parameter Sharing among Agents 11
2.3 Neural Network Architectures 13
2.3.1 Attention Mechanism 13
2.3.2 Pointer Network 15
2.3.3 Graph Neural Network (GNN) 16
Chapter 3 Research Framework and Methods 19
3.1 Modeling the Vehicle Routing Problem 19
3.1.1 Planning Framework 20
3.1.2 Markov Decision Process 21
3.1.3 State Representation and Masking Mechanism 23
3.2 End-to-End Model Architecture 25
3.2.1 Graph Feature Extraction Module 26
3.2.2 Fleet Feature Extraction Module 28
3.2.3 Pointer Output Module 29
3.3 Multi-Agent Reinforcement Learning Training 32
3.3.1 RL Reward Function and Sequential Decoding 32
3.3.2 RL with Baseline 33
3.3.3 Training Data Generation 35
Chapter 4 Experimental Results and Analysis 37
4.1 Experimental Setup 37
4.1.1 Platform and Training Hyperparameters 37
4.1.2 OR-Tools 38
4.2 Experiments and Result Analysis 39
Chapter 5 Conclusions and Future Work 41
5.1 Conclusions 41
5.2 Future Work 41
References 43
dc.language.iso: zh_TW
dc.subject: 車輛路徑問題 [zh_TW]
dc.subject: 多智能體系統 [zh_TW]
dc.subject: 強化學習 [zh_TW]
dc.subject: Multi-Agent Systems [en]
dc.subject: Vehicle Routing Problem [en]
dc.subject: Reinforcement Learning [en]
dc.title: 以多智能體強化學習解決異質車輛路徑問題 [zh_TW]
dc.title: Solving Heterogeneous Vehicle Routing Problem by Using Multi-Agent Reinforcement Learning [en]
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 蕭得聖; 林峻永 [zh_TW]
dc.contributor.oralexamcommittee: Te-Sheng Hsiao; Chun-Yeon Lin [en]
dc.subject.keyword: 車輛路徑問題, 強化學習, 多智能體系統 [zh_TW]
dc.subject.keyword: Vehicle Routing Problem, Reinforcement Learning, Multi-Agent Systems [en]
dc.relation.page: 47
dc.identifier.doi: 10.6342/NTU202501716
dc.rights.note: Access granted (restricted to on-campus access)
dc.date.accepted: 2025-07-24
dc.contributor.author-college: College of Engineering
dc.contributor.author-dept: Department of Mechanical Engineering
dc.date.embargo-lift: 2025-07-30
Appears in Collections: Department of Mechanical Engineering

Files in This Item:
File: ntu-113-2.pdf (restricted to NTU campus IPs; use the library VPN service from off campus)
Size: 4.69 MB
Format: Adobe PDF