Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52565

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 許添本(Tien-Pen Hsu) | |
| dc.contributor.author | Kai-You Cheng | en |
| dc.contributor.author | 程楷祐 | zh_TW |
| dc.date.accessioned | 2021-06-15T16:18:46Z | - |
| dc.date.available | 2023-08-15 | |
| dc.date.copyright | 2020-08-21 | |
| dc.date.issued | 2020 | |
| dc.date.submitted | 2020-08-06 | |
| dc.identifier.citation | [1] Ministry of Transportation and Communications statistics database (2020). Retrieved from https://stat.motc.gov.tw/mocdb/stmain.jsp?sys=100. [2] 許勝翔 (2011). 混合車流之過飽和路段號誌最佳化模式研究 [A signal optimization model for oversaturated road sections under mixed traffic flow]. Master's thesis, Transportation Division, Department of Civil Engineering, National Taiwan University. [3] Webster, F. V. (1958). Traffic signal settings. [4] 林良泰, 李建昌, 許乃文 (2001). 延滯最小化之幹道號誌時制設計研究 [Arterial signal timing design for delay minimization]. International Conference on Road Traffic Safety and Enforcement. [5] Dotoli, M., Fanti, M. P., Meloni, C. (2003). Real time optimization of traffic signal control: application to coordinated intersections. SMC'03 Conference Proceedings, 2003 IEEE International Conference on Systems, Man and Cybernetics (Cat. No. 03CH37483). [6] Bretherton, R., Bowen, G. (1991). Incident detection and traffic monitoring in urban areas. Proc., DRIVE Conference. [7] Henry, J., Farges, J. (1989). PRODYN. Control, Computers, Communications in Transportation: 6th IFAC/IFIP/IFORS Symposium on Transportation, Paris, France, September. [8] Gartner, N. H. (1983). OPAC: A demand-responsive strategy for traffic signal control. [9] Sims, A., Finlay, A. (1984). SCATS, splits and offsets simplified (SOS). Australian Road Research, 12(4). [10] Sutton, R. S., Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press. [11] Thorpe, T. L., Anderson, C. W. (1996). Traffic light control using SARSA with three state representations. Citeseer. [12] Wiering, M. (2000). Multi-agent reinforcement learning for traffic light control. Machine Learning: Proceedings of the Seventeenth International Conference (ICML 2000). [13] Abdulhai, B., Pringle, R., Karakoulas, G. J. (2003). Reinforcement learning for true adaptive traffic signal control. Journal of Transportation Engineering, 129(3), 278-285. [14] Houli, D., Zhiheng, L., Yi, Z. (2010). Multiobjective reinforcement learning for traffic signal control using vehicular ad hoc network. EURASIP Journal on Advances in Signal Processing, 2010(1), 724035. [15] Brys, T., Pham, T. T., Taylor, M. E. (2014). Distributed learning and multi-objectivity in traffic light control. Connection Science, 26(1), 65-83. [16] Mannion, P., Duggan, J., Howley, E. (2016). An experimental review of reinforcement learning algorithms for adaptive traffic signal control. In Autonomic Road Transport Support Systems (pp. 47-66). Springer. [17] Li, L., Lv, Y., Wang, F.-Y. (2016). Traffic signal timing via deep reinforcement learning. IEEE/CAA Journal of Automatica Sinica, 3(3), 247-254. [18] Van der Pol, E., Oliehoek, F. A. (2016). Coordinated deep reinforcement learners for traffic light control. Proceedings of Learning, Inference and Control of Multi-Agent Systems (at NIPS 2016). [19] Wan, C.-H., Hwang, M.-C. (2019). Adaptive traffic signal control methods based on deep reinforcement learning. In Intelligent Transport Systems for Everyone's Mobility (pp. 195-209). Springer. [20] Genders, W., Razavi, S. (2016). Using a deep reinforcement learning agent for traffic signal control. arXiv preprint arXiv:1611.01142. [21] Zheng, G., Xiong, Y., Zang, X., Feng, J., Wei, H., Zhang, H., ..., Li, Z. (2019). Learning phase competition for traffic signal control. Proceedings of the 28th ACM International Conference on Information and Knowledge Management. [22] Liang, X., Du, X., Wang, G., Han, Z. (2018). Deep reinforcement learning for traffic light control in vehicular networks. arXiv preprint arXiv:1803.11115. [23] Fritzsche, H.-T. (1994). A model for traffic simulation. Traffic Engineering + Control, 35(5), 317-321. [24] Krajzewicz, D., Hertkorn, G., Rössel, C., Wagner, P. (2002). SUMO (Simulation of Urban MObility): an open-source traffic simulation. Proceedings of the 4th Middle East Symposium on Simulation and Modelling (MESM2002). [25] Krauß, S. (1998). Microscopic modeling of traffic flow: Investigation of collision free vehicle dynamics. [26] Wiedemann, R. (1974). Simulation des Straßenverkehrsflusses. Schriftenreihe des Instituts für Verkehrswesen der Universität Karlsruhe, Heft 8. [27] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324. [28] Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602. [29] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ..., Ostrovski, G. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533. [30] Van Hasselt, H., Guez, A., Silver, D. (2016). Deep reinforcement learning with double Q-learning. Thirtieth AAAI Conference on Artificial Intelligence. [31] Wang, Z., Schaul, T., Hessel, M., Van Hasselt, H., Lanctot, M., De Freitas, N. (2015). Dueling network architectures for deep reinforcement learning. arXiv preprint arXiv:1511.06581. [32] Schaul, T., Quan, J., Antonoglou, I., Silver, D. (2015). Prioritized experience replay. arXiv preprint arXiv:1511.05952. [33] Tettamanti, T., Horváth, M. T. (2015). A practical manual for Vissim COM programming in Matlab. Dept. for Control of Transportation and Vehicle Systems, Budapest University of Technology and Economics. [34] Taipei City Government research and development results website (published 2018). Retrieved from https://rdnet.taipei.gov.tw/xDCM/TPE_user/index.jsp. [35] 交通部 (2011). 臺灣公路容量手冊 [Taiwan Highway Capacity Manual]. Institute of Transportation, Ministry of Transportation and Communications. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52565 | - |
| dc.description.abstract | Most signalized intersections in Taiwan use pretimed control, which may fail to respond to traffic conditions in time; adaptive signal systems that react to traffic states have therefore been developed. In traditional adaptive signal control, however, many parameters describing traffic characteristics must be assumed or calibrated manually, which can introduce human error. As artificial intelligence techniques have matured, this study applies AI to signalized intersections under Taiwan's mixed traffic, since the advantage of building adaptive control with AI is that it reduces the parameter assumptions about traffic characteristics. The study therefore uses deep reinforcement learning to construct a signal agent that supports adaptive signal decisions: the agent observes the traffic state at the intersection and takes the best action for that state. The traffic simulation software VISSIM is first used to build a signalized-intersection environment with a high motorcycle mix ratio, matching Taiwan's traffic characteristics, as the agent's learning environment. The agent is assumed to observe the complete state of the intersection; the detected features include vehicle positions, speeds, and the current phase, with positions and speeds encoded as grid cells, so a convolutional neural network is used to predict the value of each action. To teach the agent the best action policy, it must be rewarded or punished after each action so that it learns in the right direction; the study uses a weighted index composed mainly of vehicle waiting time and intersection throughput, with an additional penalty for the clearance-time loss incurred when switching phases. (A hedged sketch of this weighted reward appears after the metadata table below.) The results show that the adaptive signal built in this study outperforms fixed-time control on all performance measures, and tests under different scenarios verify that it can indeed respond to changes in traffic flow. | zh_TW |
| dc.description.abstract | In Taiwan, most signalized intersections use pretimed control, but pretimed control may not respond to traffic conditions in time, so adaptive signal control systems have emerged. In traditional adaptive signal control, many parameters describing traffic features must be assumed and calibrated manually, and these manual assumptions can introduce errors. This study applies artificial intelligence to signalized intersections with Taiwan's mixed traffic, because the advantage of using AI to develop an adaptive signal control system is that it reduces the parameter assumptions about traffic features. Hence, the study develops an agent through deep reinforcement learning that helps the adaptive signal control system make decisions. The traffic simulation software VISSIM is used to create a virtual environment with a high motorcycle mix rate in which the agent learns. It is assumed that the agent can observe the full state of the environment; the state features include vehicle position, speed, and the current phase, and a convolutional neural network is used to predict the value of each action in a given state. (A hedged sketch of this grid-cell state and Q-network appears after the metadata table below.) To make the agent learn the best action policy, it must be rewarded or punished after performing each action, because the agent's goal is to maximize long-term reward. The study defines the reward as a weighted-sum function of vehicles' waiting time, throughput, and a penalty for switching to another phase. Finally, the study shows that the adaptive signal control system developed by deep reinforcement learning outperforms fixed-time control in traffic performance. After testing in different scenarios, the study also verifies that the adaptive signal control system can adapt to changes in traffic flow. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T16:18:46Z (GMT). No. of bitstreams: 1 U0001-0608202015261700.pdf: 4227482 bytes, checksum: 9c37fcb8a2668387f6bce1bb9bdba92e (MD5) Previous issue date: 2020 | en |
| dc.description.tableofcontents | Abstract (Chinese) I Abstract (English) II Contents III List of Figures V List of Tables VII Chapter 1 Introduction 1 1.1 Research Background 1 1.2 Research Motivation 2 1.3 Research Objectives 3 1.4 Research Content and Scope 4 1.5 Research Method 5 1.6 Research Process 5 Chapter 2 Literature Review 7 2.1 Design Considerations of Traditional Signal Control 7 2.2 Reinforcement-Learning Signal Control Systems 11 2.3 Microscopic Traffic Simulation Software 17 2.4 Summary 19 Chapter 3 Methodology 20 3.1 Research Framework 20 3.2 Deep Reinforcement Learning Algorithm 21 3.3 Model Construction 35 3.4 Training and Simulation Procedure 49 Chapter 4 Analysis of Deep Reinforcement Learning Simulation Results 53 4.1 Training Process 53 4.2 Analysis of Simulation Results 63 4.3 Summary 74 Chapter 5 Scenario-Based Application Analysis of the Model 75 5.1 Volume Scenario Analysis 75 5.2 Left-Turn Movement Scenario Analysis 78 5.3 Mix-Ratio Scenario Analysis 82 5.4 Summary 84 Chapter 6 Conclusions and Suggestions 85 6.1 Conclusions 85 6.2 Suggestions 86 References 88 | |
| dc.language.iso | zh-TW | |
| dc.subject | 混合車流 (mixed traffic flow) | zh_TW |
| dc.subject | 深度強化學習 (deep reinforcement learning) | zh_TW |
| dc.subject | 適應性號誌 (adaptive signal control) | zh_TW |
| dc.subject | Adaptive signal control | en |
| dc.subject | Deep reinforcement learning | en |
| dc.subject | Mixed traffic | en |
| dc.title | 以深度強化學習方式建構混合車流之適應性號誌控制系統 | zh_TW |
| dc.title | Developing Adaptive Signal Control System under Mixed Traffic Flow by Deep Reinforcement Learning | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 108-2 | |
| dc.description.degree | 碩士 (Master's) | |
| dc.contributor.oralexamcommittee | 胡守任 (Shou-Ren Hu), 陳柏華 (Albert Chen) | |
| dc.subject.keyword | 適應性號誌 (adaptive signal control), 深度強化學習 (deep reinforcement learning), 混合車流 (mixed traffic flow) | zh_TW |
| dc.subject.keyword | Adaptive signal control, deep reinforcement learning, mixed traffic | en |
| dc.relation.page | 91 | |
| dc.identifier.doi | 10.6342/NTU202002549 | |
| dc.rights.note | 有償授權 (paid authorization) | |
| dc.date.accepted | 2020-08-06 | |
| dc.contributor.author-college | 工學院 (College of Engineering) | zh_TW |
| dc.contributor.author-dept | 土木工程學研究所 (Graduate Institute of Civil Engineering) | zh_TW |
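
Both abstracts describe the controller as a DQN-style agent: vehicle positions and speeds are discretized into grid cells, the current phase is appended, and a convolutional network scores each candidate phase action. Below is a minimal sketch of that idea; the grid dimensions, layer sizes, and all names (`QNetwork`, `N_CELLS`, and so on) are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn

# Illustrative dimensions only; the thesis does not publish its exact grid size.
N_APPROACHES = 8   # assumed number of approach lanes observed
N_CELLS = 60       # assumed grid cells per lane
N_PHASES = 4       # assumed number of candidate signal phases (actions)

class QNetwork(nn.Module):
    """CNN mapping the grid-cell state to one action value per phase."""
    def __init__(self):
        super().__init__()
        # Channel 0: cell occupancy (vehicle position); channel 1: normalized speed.
        self.conv = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        # The current-phase one-hot vector is appended to the CNN features.
        self.head = nn.Sequential(
            nn.Linear(32 * N_APPROACHES * N_CELLS + N_PHASES, 128),
            nn.ReLU(),
            nn.Linear(128, N_PHASES),
        )

    def forward(self, grid: torch.Tensor, phase: torch.Tensor) -> torch.Tensor:
        # grid: (batch, 2, N_APPROACHES, N_CELLS); phase: (batch, N_PHASES)
        return self.head(torch.cat([self.conv(grid), phase], dim=1))

# Greedy action selection for one observed state (batch of 1).
q_net = QNetwork()
grid = torch.zeros(1, 2, N_APPROACHES, N_CELLS)
grid[0, 0, 0, :3] = 1.0            # e.g., three occupied cells at the stop line
phase = torch.eye(N_PHASES)[0:1]   # currently in phase 0
action = q_net(grid, phase).argmax(dim=1).item()
```

In a DQN setup like the ones cited in references [28, 29], this network would be trained from replayed (state, action, reward, next state) transitions; the sketch shows only greedy action selection.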
Appears in Collections: 土木工程學系 (Department of Civil Engineering)
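
The abstracts also describe the reward as a weighted combination of vehicle waiting time and intersection throughput, minus a clearance-loss penalty whenever the agent switches phases. A minimal sketch follows, assuming per-step quantities; `w_wait`, `w_thru`, and `w_switch` are invented placeholder names and values, not the thesis's calibrated weights.

```python
def reward(delta_waiting_time: float, throughput: int, phase_changed: bool,
           w_wait: float = 1.0, w_thru: float = 1.0, w_switch: float = 5.0) -> float:
    """Return the weighted-sum reward for one control step.

    - delta_waiting_time: increase in cumulative vehicle waiting time (s)
      over the step; larger is worse, so it enters with a negative sign.
    - throughput: vehicles discharged through the intersection this step.
    - phase_changed: True if the agent switched phases, incurring the
      clearance-time-loss penalty the abstract mentions.
    """
    r = -w_wait * delta_waiting_time + w_thru * throughput
    if phase_changed:
        r -= w_switch
    return r

# Example: 12 s of added waiting, 8 vehicles discharged, phase switched.
print(reward(12.0, 8, True))  # -> -9.0
```

A larger `w_switch` discourages the agent from thrashing between phases, which is the role the abstract assigns to the clearance-time-loss penalty.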
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-0608202015261700.pdf (not authorized for public access) | 4.13 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
