Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/76906
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 吳政鴻(Cheng-Hung Wu) | |
dc.contributor.author | Feng-Yun Chung | en |
dc.contributor.author | 鍾逢耘 | zh_TW |
dc.date.accessioned | 2021-07-10T21:40:06Z | - |
dc.date.available | 2021-07-10T21:40:06Z | - |
dc.date.copyright | 2020-08-28 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-08-11 | |
dc.identifier.citation | Aissani, N., Beldjilali, B., Trentesaux, D. (2008). Efficient and effective reactive scheduling of manufacturing system using Sarsa-multi-objective agents. Paper presented at MOSIM'08: 7th Conference Internationale de Modelisation et Simulation.
Baras, J., Ma, D.-J., Makowski, A. (1985). K competing queues with geometric service requirements and linear costs: The μc-rule is always optimal. Systems & Control Letters, 6(3), 173-180.
Brandolese, A., Brun, A., Portioli-Staudacher, A. (2000). A multi-agent approach for the capacity allocation problem. International Journal of Production Economics, 66(3), 269-285.
Chiang, T.-C., Fu, L.-C. (2007). Using dispatching rules for job shop scheduling with due date-based objectives. International Journal of Production Research, 45(14), 3245-3262.
Das, S. R., Canel, C. (2005). An algorithm for scheduling batches of parts in a multi-cell flexible manufacturing system. International Journal of Production Economics, 97(3), 247-262.
DiBiano, R., Mukhopadhyay, S. (2017). Automated diagnostics for manufacturing machinery based on well-regularized deep neural networks. Integration, 58, 303-310.
Fang, W., Guo, Y., Liao, W., Ramani, K., Huang, S. (2019). Big data driven jobs remaining time prediction in discrete manufacturing system: a deep learning-based approach. International Journal of Production Research, 58(9), 2751-2766. doi:10.1080/00207543.2019.1602744
Hargreaves, J. J., Hobbs, B. F. (2012). Commitment and dispatch with uncertain wind generation by dynamic programming. IEEE Transactions on Sustainable Energy, 3(4), 724-734.
Hirashima, Y., Takeda, K., Harada, S., Deng, M., Inoue, A. (2006). A Q-learning for group-based plan of container transfer scheduling. JSME International Journal Series C Mechanical Systems, Machine Elements and Manufacturing, 49(2), 473-479.
Iqbal, R., Maniak, T., Doctor, F., Karyotis, C. (2019). Fault detection and isolation in industrial processes using deep learning approaches. IEEE Transactions on Industrial Informatics, 15(5), 3077-3084. doi:10.1109/tii.2019.2902274
Kebarighotbi, A., Cassandras, C. G. (2011). Optimal scheduling of parallel queues using stochastic flow models. Discrete Event Dynamic Systems, 21(4), 547.
Kim, J.-S., Jang, S.-J., Kim, T.-W., Lee, H.-J., Lee, J.-B. (2019). A productivity-oriented wafer map optimization using yield model based on machine learning. IEEE Transactions on Semiconductor Manufacturing, 32(1), 39-47. doi:10.1109/tsm.2018.2870253
Krothapalli, N. K. C., Deshmukh, A. V. (1999). Design of negotiation protocols for multi-agent manufacturing systems. International Journal of Production Research, 37(7), 1601-1624.
Kumar, N. H., Srinivasan, G. (1996). A genetic algorithm for job shop scheduling—a case study. Computers in Industry, 31(2), 155-160.
Liu, X., Lu, S., Pei, J., Pardalos, P. M. (2018). A hybrid VNS-HS algorithm for a supply chain scheduling problem with deteriorating jobs. International Journal of Production Research, 56(17), 5758-5775.
Mahapatra, S., Dash, R. R., Pradhan, S. K. (2016). A heuristic for scheduling of uniform parallel processors. Paper presented at the 2016 2nd International Conference on Computational Intelligence and Networks (CINE).
Mahesh, M., Ong, S.-K., Nee, A. Y. (2007). A web-based multi-agent system for distributed digital manufacturing. International Journal of Computer Integrated Manufacturing, 20(1), 11-27.
Mouelhi-Chibani, W., Pierreval, H. (2010). Training a neural network to select dispatching rules in real time. Computers & Industrial Engineering, 58(2), 249-256. doi:10.1016/j.cie.2009.03.008
Nguyen, S., Zhang, M., Johnston, M., Tan, K. C. (2012). A computational study of representations in genetic programming to evolve dispatching rules for the job shop scheduling problem. IEEE Transactions on Evolutionary Computation, 17(5), 621-639.
Panait, L., Luke, S. (2005). Cooperative multi-agent learning: The state of the art. Autonomous Agents and Multi-Agent Systems, 11(3), 387-434.
Pfrommer, J., et al. (2018). Optimisation of manufacturing process parameters using deep neural networks as surrogate models. Procedia CIRP, 72, 426-431.
Raghu, T., Rajendran, C. (1993). An efficient dynamic dispatching rule for scheduling in a job shop. International Journal of Production Economics, 32(3), 301-313.
Schmid, V. (2012). Solving the dynamic ambulance relocation and dispatching problem using approximate dynamic programming. European Journal of Operational Research, 219(3), 611-621.
Shahrabi, J., Adibi, M. A., Mahootchi, M. (2017). A reinforcement learning approach to parameter estimation in dynamic job shop scheduling. Computers & Industrial Engineering, 110, 75-82.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., . . . Lanctot, M. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484-489.
Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., . . . Graepel, T. (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419), 1140-1144.
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., . . . Bolton, A. (2017). Mastering the game of Go without human knowledge. Nature, 550(7676), 354-359.
Travers, D. L., Kaye, R. J. (1998). Dynamic dispatch by constructive dynamic programming. IEEE Transactions on Power Systems, 13(1), 72-78.
Valckenaers, P., Verstraete, P., Saint Germain, B., Van Brussel, H. (2005). A study of system nervousness in multi-agent manufacturing control system. Paper presented at the International Workshop on Engineering Self-Organising Applications.
Veatch, M. H. (2010). A cμ rule for parallel servers with two-tiered cμ preferences.
Watkins, C. J., Dayan, P. (1992). Q-learning. Machine Learning, 8(3-4), 279-292.
Wong, T., Leung, C., Mak, K.-L., Fung, R. Y. (2006). Dynamic shopfloor scheduling in multi-agent manufacturing systems. Expert Systems with Applications, 31(3), 486-494.
Xiang, W., Lee, H. P. (2008). Ant colony intelligence in multi-agent dynamic manufacturing scheduling. Engineering Applications of Artificial Intelligence, 21(1), 73-85.
Zhang, Z., Zheng, L., Hou, F., Li, N. (2011). Semiconductor final test scheduling with Sarsa (λ, k) algorithm. European Journal of Operational Research, 215(2), 446-458.
Zhang, Z., Zheng, L., Weng, M. X. (2007). Dynamic parallel machine scheduling with mean weighted tardiness objective by Q-learning. The International Journal of Advanced Manufacturing Technology, 34(9-10), 968-980.
Zhu, Z., Ng, K. M., Ong, H. (2010). A modified tabu search algorithm for cost-based job shop problem. Journal of the Operational Research Society, 61(4), 611-619.
李冠臻. (2020). 大型生產系統之多智能體動態派工與預保養架構. 臺灣大學工業工程學研究所學位論文, 1-60.
姚怡均. (2017). 考量機台損耗之非等效動態生產系統派工與保養. 臺灣大學工業工程學研究所學位論文, 1-106.
孫巧儒. (2018). 具擴充性之多機台動態派工與預防保養方法. 臺灣大學工業工程學研究所學位論文, 1-145.
蔡沂芯. (2019). 應用多智能體分解與合成之動態派工與保養方法. 臺灣大學工業工程學研究所學位論文, 1-176. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/76906 | - |
dc.description.abstract | 本研究提出一適用於大型生產系統多智能體系統(Multi-Agent System, MAS),以分散式求解方式對大型智慧生產系統求解的概念,此多智能體系統中包含了分解代理人(Decomposition Agent), 機器代理人(Machine Agent), 協作代理人(Negotiation Agent) 及合作代理人(Coordination Agent)四種不同的代理人,並利用此拆分、求解及合併的思維得到大型生產系統的較佳初始解,進而透過強化學習方法優化生產系統中的決策。 而在多智能體概念中,本研究主要為提升機器代理人(Machine Agent)在求解過程中的效率及初探強化學習在生產系統求解上的應用。故本研究分為兩個部分,第一部分為介紹如何利用深層神經網路及多項式回歸方法預測兩產品單機台的較佳初始解,接著針對此較佳初始解利用動態規劃繼續求解至收歛,並在研究中展現此一方法更為單純動態規劃求解方法節省求解時間,且與之求解出之解無異。 第二部份則為生產系統之強化學習初探。本研究開發出一用於兩產品單機台生產系統的強化學習演算法,其概念為利用強化學習代理人(Agent)與環境(Environment)互動此一特性研發演算法,並直接於演算法內設計類離散事件模擬動作的過程,以實現強化學習中環境這一環節。最後,將在強化學習過程中的決策與動態規劃求解的決策透過模擬比較其平均總完工時間(Average Cycle Time),結果顯示本研究提出之強化學習演算法能有效求解兩產品單機台生產系統的動態派工與預防保養決策。 | zh_TW |
dc.description.abstract | This research proposes a multi-agent system (MAS) that uses a distributed approach to solve large-scale manufacturing systems. The MAS includes four different agents: a decomposition agent, machine agents, negotiation agents, and a coordination agent. The decomposed subsystems are solved and then recombined to obtain a better initial solution, after which a reinforcement learning algorithm is used as an online learning method to further improve the solution of the large-scale manufacturing system. The main objectives of this research are to reduce the solving time of the machine agent in the MAS and to conduct an exploratory study of applying reinforcement learning to a manufacturing system. The research has two parts. First, we use a deep neural network and polynomial regression to predict a better initial solution for a two-product, single-machine system, then use that prediction as the initial solution for dynamic programming to obtain an optimal solution; the approaches are compared in terms of solving time and the resulting policy. Second, this research proposes a reinforcement learning algorithm for the two-product, single-machine system. In this algorithm, we design a discrete-event-like simulation method to implement the environment of the reinforcement learning system, and we use the solution found in the first part as the initial solution. The results show that the proposed reinforcement learning algorithm can be used to solve a manufacturing system. | en |
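The abstract above describes a reinforcement learning agent whose environment is implemented directly as a discrete-event-like simulation of a two-product, single-machine system with dispatching and preventive-maintenance actions. As an illustrative sketch only — the thesis's actual state space, cost structure, arrival rates, and update rule are not given in this record, so every parameter and function below is an assumption — a tabular Q-learning loop of that general shape might look like:

```python
import random

# Hypothetical sketch of a two-product, single-machine dispatching /
# preventive-maintenance problem. State = (queue of product 1, queue of
# product 2, machine age). All rates and costs are made-up placeholders.
ACTIONS = ("make_1", "make_2", "maintain")

def step(state, action, rng):
    """Simulate one period; return (next_state, cost)."""
    q1, q2, age = state
    if action == "maintain":
        age = 0                                  # maintenance resets machine wear
    else:
        done_ok = rng.random() > 0.05 * age      # older machine fails more often
        if action == "make_1" and q1 > 0 and done_ok:
            q1 -= 1
        elif action == "make_2" and q2 > 0 and done_ok:
            q2 -= 1
        age = min(age + 1, 4)
    # new demand arrivals; capped queues keep the state space finite
    q1 = min(q1 + (rng.random() < 0.4), 5)
    q2 = min(q2 + (rng.random() < 0.3), 5)
    cost = q1 + 2 * q2 + (3 if action == "maintain" else 0)  # WIP holding + maintenance
    return (q1, q2, age), cost

def q_learning(episodes=200, horizon=50, alpha=0.1, gamma=0.95, eps=0.2, seed=0):
    """Cost-minimizing tabular Q-learning with an embedded simulated environment."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        s = (0, 0, 0)
        for _ in range(horizon):
            # epsilon-greedy action selection (greedy = lowest expected cost)
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = min(ACTIONS, key=lambda a_: Q.get((s, a_), 0.0))
            s2, c = step(s, a, rng)
            best_next = min(Q.get((s2, a_), 0.0) for a_ in ACTIONS)
            old = Q.get((s, a), 0.0)
            Q[(s, a)] = old + alpha * (c + gamma * best_next - old)
            s = s2
    return Q

if __name__ == "__main__":
    Q = q_learning()
    s = (5, 5, 0)
    print(min(ACTIONS, key=lambda a: Q.get((s, a), 0.0)))
```

Note that the update minimizes discounted cost rather than maximizing reward, mirroring the expected-production-cost framing in the table of contents; the thesis's own algorithm is initialized from the dynamic-programming solution, which this standalone sketch omits.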
dc.description.provenance | Made available in DSpace on 2021-07-10T21:40:06Z (GMT). No. of bitstreams: 1 U0001-1008202015012100.pdf: 4331560 bytes, checksum: 68be205251f98b2e32fbd7f5e0bcca8a (MD5) Previous issue date: 2020 | en |
dc.description.tableofcontents | Acknowledgements II
Chinese Abstract III
Abstract IV
Table of Contents V
List of Figures VIII
List of Tables X
Chapter 1 Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 3
1.3 Research Methods 5
1.4 Research Process 5
Chapter 2 Literature Review 7
2.1 Parallel-Machine Scheduling Problems and Dispatching Rules 7
2.1.1 Static Optimization Dispatching Methods 7
2.1.2 Dispatching Rules 8
2.1.3 Dynamic Dispatching Methods 8
2.2 Applications of Multi-Agent Architectures in Manufacturing Systems 9
2.3 Applications of Deep Learning in Production Management 10
2.4 Applications of Reinforcement Learning in Production Management 11
2.5 Summary of the Literature Review 12
Chapter 3 Two-Product, Single-Machine Dynamic Dispatching and Preventive Maintenance Model 13
3.1 Previous Production Management Methods and Their Limitations 13
3.2 Overview of the Multi-Agent Architecture and Its Advantages 13
3.3 Two-Product, Single-Machine Dynamic Dispatching and Preventive Maintenance Model 14
3.3.1 Problem Description and Assumptions 15
3.3.2 Definitions of Parameters and Variables 16
3.3.3 Two-Product, Single-Machine Dynamic Dispatching and Maintenance Model 17
3.3.4 Using a DNN to Predict the Decision Table of the Two-Product, Single-Machine System 19
3.4 Summary 19
Chapter 4 Dynamic Programming from a Better Initial Solution 21
4.1 Problem Description and Method 21
4.2 Two-Product, Single-Machine Parameter Data Set 22
4.2.1 Experimental Design 22
4.2.2 Linear Programming for Maximum Capacity 23
4.2.3 Machine Utilization and Product Demand Rates 25
4.3 Predicting the Normalized Expected Production Cost with Deep Learning 26
4.3.1 Constructing the Neural Network Data Set 26
4.3.2 DNN Model Construction and Hyperparameter Tuning 28
4.4 Polynomial Regression for Predicting the vmax − vmin Value 30
4.4.1 Introduction to Polynomial Regression 30
4.4.2 Constructing the vmax − vmin Data Set 30
4.4.3 Training the Polynomial Regression Model 31
4.5 Dynamic Programming from the Better Initial Solution 32
4.6 Comparison of Results 33
4.6.1 Experimental Design 33
4.6.2 Policy Comparison 33
4.6.3 Convergence Time Comparison 35
4.7 Chapter 4 Summary 35
Chapter 5 An Exploratory Study of Reinforcement Learning for the Production System 36
5.1 Introduction to Reinforcement Learning 36
5.2 The Reinforcement Learning Concept in This Study 38
5.3 The Reinforcement Learning Algorithm 39
5.3.1 Parameter Settings for the Two-Product, Single-Machine RL 39
5.3.2 Per-Period Simulation 40
5.3.3 Method for Updating the Expected Production Cost 42
5.3.4 Algorithm 42
5.4 Reinforcement Learning Implementation 48
5.4.1 RL Program Implementation 48
5.4.2 Simulation Program 48
5.5 Reinforcement Learning Results and Discussion 49
5.5.1 Results and Discussion 49
5.6 Comparison with the Better Initial Solution 56
5.6.1 Initial Solution and Program Description 56
5.6.2 Comparison and Discussion 56
5.7 Reinforcement Learning Summary 57
Chapter 6 Conclusions and Future Research Directions 58
6.1 Conclusions 58
6.2 Future Research Directions 58
References 60 | |
dc.language.iso | zh-TW | |
dc.title | 動態派工與預保養之強化學習方法初探 | zh_TW |
dc.title | An Exploratory Study of Reinforcement Learning in Dynamic Dispatching and Preventive Maintenance | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | 碩士 (Master) | |
dc.contributor.oralexamcommittee | 洪一薰, 陳文智 | |
dc.subject.keyword | 動態派工, 預防保養, 多智能體系統, 深度神經網路, 強化學習 | zh_TW |
dc.subject.keyword | Dynamic Dispatching, Preventive Maintenance, Multi-Agent System, Deep Neural Network, Reinforcement Learning | en |
dc.relation.page | 63 | |
dc.identifier.doi | 10.6342/NTU202002810 | |
dc.rights.note | 未授權 (Not authorized) | |
dc.date.accepted | 2020-08-12 | |
dc.contributor.author-college | 工學院 (College of Engineering) | zh_TW |
dc.contributor.author-dept | 工業工程學研究所 (Institute of Industrial Engineering) | zh_TW |
Appears in Collections: | Institute of Industrial Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
U0001-1008202015012100.pdf (currently not authorized for public access) | 4.23 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.