Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85400

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 許添本(Tien-Pen Hsu) | |
| dc.contributor.author | Yu-Chun Chen | en |
| dc.contributor.author | 陳又均 | zh_TW |
| dc.date.accessioned | 2023-03-19T23:16:09Z | - |
| dc.date.copyright | 2022-07-26 | |
| dc.date.issued | 2022 | |
| dc.date.submitted | 2022-07-22 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85400 | - |
| dc.description.abstract | 國道5號高速公路為聯絡台北及宜蘭的重要道路,然自通車以來,周末及連續假日車潮造成的壅塞情形便持續發生,使得高速公路效率降低也帶來龐大的社會成本,為此高公局也提出許多管理措施,包括主線儀控、匝道儀控、開放路肩大客車專用道、高乘載管制等等。高公局目前依據車流回堵長度及動態查表法決定主線儀控及匝道儀控之儀控率,於車流回堵至主線儀控處時才啟動儀控設施,本研究認為此控制策略並沒有達到及早控制的效果。 近年來強化學習領域蓬勃發展,在訓練技巧及應用領域上都有許多研究被發表,故本研究採用多代理深度強化學習方法結合雙層Q學習、優先經驗回放、競爭網路結構、多步強化學習、分布式強化學習等技巧建構主線及匝道聯合儀控策略,期望能透過多代理間合作控制儀控下游匯流區及雪山隧道之車流狀況,減緩或避免雪山隧道內之壅塞情形。本研究使用VISSIM微觀車流模擬軟體做為模擬平台,建構國道5號部分路網並訓練模型,主線儀控及匝道儀控代理會觀察路網中車輛偵測器位置的流量及速率資料及另一代理的指紋,並即時選出最佳動作執行,獎勵部分本研究選擇以最大化雪山隧道內瓶頸點之流量作為目標。聯合儀控模型在經過450回合訓練後得以收斂,並以訓練後模型與現況情境進行績效評比。 與現況情境相比,頭城匝道出口至雪隧末旅行時間約減少5%,旅行速率則提升約4.3%;路網系統平均速率小客車提升約2.7%,大客車提升約2.37%;雪山隧道內車流情形以及隧道內瓶頸點速率也有所改善。本研究也透過儀控率分析觀察本研究主線代理傾向選擇較大之儀控率,匝道儀控代理則因較頻繁選擇低儀控率使停等車隊長度較現況情境高,可能造成地方道路較嚴重的回堵,另外也嘗試探討模型選擇儀控率之原因,發現模型會依據隧道內瓶頸點速率調整下一時階之儀控率,顯示本研究多代理深度強化學習模型確實在訓練後學習到獎勵與動作間之關聯。最後根據以上分析做出結論並對後續研究提出建議。 | zh_TW |
| dc.description.abstract | Freeway No. 5 is an important route connecting Taipei and Yilan. Since it opened, however, severe congestion has recurred on weekends and national holidays, reducing the efficiency of the freeway and imposing substantial social costs. The Freeway Bureau has introduced several traffic management measures to relieve this congestion, including mainline metering, ramp metering, opening the shoulder as a bus-only lane, and high-occupancy vehicle regulations. The Bureau currently sets the mainline and ramp metering rates from the queue length and a dynamic lookup table, and activates the metering facilities only after the queue has spilled back to the mainline metering point; this study argues that such a strategy does not achieve early control. In recent years the field of reinforcement learning has developed rapidly, with many studies published on both training techniques and applications. This study therefore adopts a multi-agent deep reinforcement learning method that combines double Q-learning, prioritized experience replay, a dueling network architecture, multi-step reinforcement learning, and distributional reinforcement learning to construct a joint mainline and ramp metering strategy, aiming to mitigate or avoid congestion in the Xueshan Tunnel through cooperation among the agents. The microscopic traffic simulation software VISSIM was used to build part of the Freeway No. 5 network and to train the model. The mainline and ramp metering agents observe the flow and speed measured at the vehicle detector locations, together with the fingerprint of the other agent, and select the best action to execute in real time; the reward is defined as maximizing the flow at the bottleneck inside the Xueshan Tunnel. The joint metering model converged after 450 training episodes, and the trained model was then evaluated against the current-practice scenario. Compared with the current practice, the travel time from the Toucheng ramp exit to the end of the Xueshan Tunnel is reduced by about 5% and the travel speed is increased by about 4.3%; the system average speed increases by about 2.7% for passenger cars and about 2.37% for buses; and the traffic conditions inside the Xueshan Tunnel, as well as the speed at the bottleneck, also improve. The metering-rate analysis shows that the mainline agent tends to choose higher metering rates, whereas the ramp agent chooses low metering rates more frequently, so the ramp queue is longer than under the current practice and may cause more severe spillback onto local roads. This study also examines why the model selects particular metering rates and finds that it adjusts the metering rate of the next time step according to the speed at the bottleneck inside the tunnel, indicating that the proposed multi-agent deep reinforcement learning model does learn the relation between reward and action through training. Finally, conclusions are drawn from the above analysis and suggestions for future research are offered. | en |
| dc.description.provenance | Made available in DSpace on 2023-03-19T23:16:09Z (GMT). No. of bitstreams: 1 U0001-2107202214213300.pdf: 4123550 bytes, checksum: 54aab819bf7461ef5fcd76799242d059 (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | Acknowledgements i; Abstract (Chinese) ii; Abstract iv; Table of Contents vi; List of Figures ix; List of Tables xi; Chapter 1 Introduction 1; 1.1 Research Background 1; 1.2 Research Motivation 3; 1.3 Research Objectives 4; 1.4 Research Scope 5; 1.5 Research Procedure 6; Chapter 2 Literature Review 8; 2.1 Ramp Metering Strategies and Algorithms 8; 2.1.1 Metering Policy 8; 2.1.2 Ramp Metering Algorithms 12; 2.2 Mainline Metering 21; 2.3 Joint Metering 22; 2.4 Multi-Agent Reinforcement Learning 22; 2.5 Summary 24; Chapter 3 Construction of the Freeway No. 5 Network 28; 3.1 Data Collection 29; 3.2 VISSIM Network Construction 32; 3.3 Network Validation 34; Chapter 4 Methodology 39; 4.1 Reinforcement Learning and Q-Learning 39; 4.2 Neural Networks 44; 4.3 Deep Q-Learning 48; 4.3.1 Deep Q-Network (DQN) 49; 4.3.2 Double Q-Learning 51; 4.3.3 Prioritized Experience Replay 51; 4.3.4 Dueling Network 52; 4.3.5 Multi-step Reinforcement Learning 53; 4.3.6 Distributional RL 54; 4.4 Independent DQN (IDQN) for Multi-Agent Learning 55; 4.5 Joint Mainline and Ramp Metering Model for Freeway No. 5 56; Chapter 5 Joint Metering Model Training Results and Scenario Analysis 62; 5.1 Training Settings 62; 5.2 Training Process 65; 5.3 Training Results 67; 5.4 Performance Comparison with the Current Scenario 68; 5.4.1 Travel Time Analysis 69; 5.4.2 Vehicle-Kilometers, Vehicle-Hours, and System Average Speed Analysis 71; 5.4.3 Traffic Flow and Bottleneck Analysis inside the Xueshan Tunnel 74; 5.4.4 Metering Rate and Queue Length Analysis 77; Chapter 6 Conclusions and Suggestions 81; 6.1 Conclusions 81; 6.2 Research Limitations 82; 6.3 Suggestions 82; References 84; Appendix 1 90; Appendix 2 92; Appendix 3 93; Appendix 4 94 | |
| dc.language.iso | zh-TW | |
| dc.subject | 匝道儀控 | zh_TW |
| dc.subject | 多代理深度強化學習 | zh_TW |
| dc.subject | 主線儀控 | zh_TW |
| dc.subject | Mainline metering | en |
| dc.subject | Multi-agent deep reinforcement learning | en |
| dc.subject | Ramp metering | en |
| dc.title | 應用多代理深度強化學習於高速公路主線及匝道聯合儀控策略—以國道5號為例 | zh_TW |
| dc.title | Coordinating Freeway Mainline Metering and Ramp Metering Strategies by Multi-Agent Deep Reinforcement Learning—A Case Study of Freeway No.5 | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 110-2 | |
| dc.description.degree | Master | |
| dc.contributor.oralexamcommittee | 魏健宏(Chien-Hung Wei),胡守任(Shou-Ren Hu) | |
| dc.subject.keyword | 多代理深度強化學習,主線儀控,匝道儀控 | zh_TW |
| dc.subject.keyword | Multi-agent deep reinforcement learning, Mainline metering, Ramp metering | en |
| dc.relation.page | 94 | |
| dc.identifier.doi | 10.6342/NTU202201605 | |
| dc.rights.note | Authorized (worldwide open access) | |
| dc.date.accepted | 2022-07-22 | |
| dc.contributor.author-college | College of Engineering | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Civil Engineering | zh_TW |
| dc.date.embargo-lift | 2022-07-26 | - |
Appears in Collections: Department of Civil Engineering
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| U0001-2107202214213300.pdf | 4.03 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
