Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86245

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 李家岩 | zh_TW |
| dc.contributor.advisor | Chia-Yen Lee | en |
| dc.contributor.author | 沈宏頴 | zh_TW |
| dc.contributor.author | Hong-Ying Shen | en |
| dc.date.accessioned | 2023-03-19T23:44:30Z | - |
| dc.date.available | 2023-12-27 | - |
| dc.date.copyright | 2022-09-07 | - |
| dc.date.issued | 2022 | - |
| dc.date.submitted | 2002-01-01 | - |
| dc.identifier.citation | References Aghezzaf, E.-H., & Najid, N. M. (2008). Integrated production planning and preventive maintenance in deteriorating production systems. Information Sciences, 178(17), 3382-3392. https://doi.org/10.1016/j.ins.2008.05.007 Alrabghi, A., & Tiwari, A. (2015). State of the art in simulation-based optimisation for maintenance systems. Computers & Industrial Engineering, 82, 167-182. https://doi.org/10.1016/j.cie.2014.12.022 Beume, N., Naujoks, B., & Emmerich, M. (2007). SMS-EMOA: Multiobjective selection based on dominated hypervolume. European Journal of Operational Research, 181(3), 1653-1669. Brys, T., Harutyunyan, A., Vrancx, P., Nowé, A., & Taylor, M. E. (2017). Multi-objectivization and ensembles of shapings in reinforcement learning. Neurocomputing, 263, 48-59. https://doi.org/10.1016/j.neucom.2017.02.096 Chunming, L., Xin, X., & Dewen, H. (2015). Multiobjective Reinforcement Learning: A Comprehensive Overview. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 45(3), 385-398. https://doi.org/10.1109/tsmc.2014.2358639 Fitouhi, M.-C., Nourelfath, M., & Gershwin, S. B. (2017). Performance evaluation of a two-machine line with a finite buffer and condition-based maintenance. Reliability Engineering & System Safety, 166, 61-72. Huang, J., Chang, Q., & Arinez, J. (2020). Deep reinforcement learning based preventive maintenance policy for serial production lines. Expert Systems with Applications, 160. https://doi.org/10.1016/j.eswa.2020.113701 Kang, K., & Subramaniam, V. (2018). Integrated control policy of production and preventive maintenance for a deteriorating manufacturing system. Computers & Industrial Engineering, 118, 266-277. Liao, W., Pan, E., & Xi, L. (2009). Preventive maintenance scheduling for repairable system with deterioration. Journal of Intelligent Manufacturing, 21(6), 875-884. https://doi.org/10.1007/s10845-009-0264-z Mobley, R. K. (2002). An introduction to predictive maintenance. Elsevier. Moffaert, K. V., Drugan, M. M., & Nowé, A. (2013). Hypervolume-based multi-objective reinforcement learning. International Conference on Evolutionary Multi-Criterion Optimization. Nguyen, T. T., Nguyen, N. D., Vamplew, P., Nahavandi, S., Dazeley, R., & Lim, C. P. (2020). A multi-objective deep reinforcement learning framework. Engineering Applications of Artificial Intelligence, 96. https://doi.org/10.1016/j.engappai.2020.103915 Nourelfath, M., Nahas, N., & Ben-Daya, M. (2016). Integrated preventive maintenance and production decisions for imperfect processes. Reliability Engineering & System Safety, 148, 21-31. https://doi.org/10.1016/j.ress.2015.11.015 Smith, L. N. (2018). A disciplined approach to neural network hyper-parameters: Part 1--learning rate, batch size, momentum, and weight decay. arXiv preprint arXiv:1803.09820. Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT press. Van Hasselt, H., Guez, A., & Silver, D. (2016). Deep reinforcement learning with double q-learning. Proceedings of the AAAI conference on artificial intelligence. Wang, X., Wang, H., & Qi, C. (2014). Multi-agent reinforcement learning based maintenance policy for a resource constrained flow line system. Journal of Intelligent Manufacturing, 27(2), 325-333. https://doi.org/10.1007/s10845-013-0864-5 Xanthopoulos, A. S., Kiatipis, A., Koulouriotis, D. E., & Stieger, S. (2018). Reinforcement Learning-Based and Parametric Production-Maintenance Control Policies for a Deteriorating Manufacturing System. IEEE Access, 6, 576-588. https://doi.org/10.1109/access.2017.2771827 Xanthopoulos, A. S., Koulouriotis, D. E., & Botsaris, P. N. (2015). Single-stage Kanban system with deterioration failures and condition-based preventive maintenance. Reliability Engineering & System Safety, 142, 111-122. https://doi.org/10.1016/j.ress.2015.05.008 Yang, L., Ye, Z.-S., Lee, C.-G., Yang, S.-f., & Peng, R. (2019). A two-phase preventive maintenance policy considering imperfect repair and postponed replacement. European Journal of Operational Research, 274(3), 966-977. Yao, X., Fernandez-Gaucherand, E., Fu, M. C., & Marcus, S. I. (2004). Optimal Preventive Maintenance Scheduling in Semiconductor Manufacturing. IEEE Transactions on Semiconductor Manufacturing, 17(3), 345-356. https://doi.org/10.1109/tsm.2004.831948 Zhao, S., Wang, L., & Zheng, Y. (2014). Integrating production planning and maintenance: an iterative method. Industrial Management & Data Systems, 114(2), 162-182. https://doi.org/10.1108/imds-07-2013-0314 Hung, Y. H., Lee, C. Y., Tsai, C. H., & Lu, Y. M. (2022). Constrained particle swarm optimization for health maintenance in three-mass resonant servo control system with LuGre friction model. Annals of Operations Research, 311(1), 131-150. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86245 | - |
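| dc.description.abstract | In manufacturing, preventive maintenance has long been an important issue. Maintenance touches many aspects of a factory, such as yield, machine reliability, and capacity stability. If maintenance strategy is neglected, a factory may suffer low yield, unexpected machine failures, and poor coordination between upstream and downstream machines. The methods commonly used in factories in the past are increasingly unable to cope with complex, stochastic factory environments, for example poor machine reliability caused by machine aging, maintenance effects that vary with maintenance duration, and fluctuating machine workloads. Many studies have therefore made obtaining a good maintenance strategy an important research topic. In this study, we use simulation to reproduce a factory environment of upstream and downstream machines and formulate the problem as a Markov decision process. Based on the Markov decision process, we use a deep reinforcement learning model to obtain a maintenance policy. In addition, this study simulates a variety of factory environments for model training and gives the model explanatory capability, so that besides obtaining a good maintenance policy from the model, we can also understand the key factors behind the model's decisions and revise the policy accordingly. | zh_TW |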
| dc.description.abstract | Preventive maintenance has long been an important issue in manufacturing. Maintenance affects many aspects of a factory, such as yield, machine reliability, and production stability. If maintenance strategy is neglected, a factory may suffer low yields, machine failures, and uneven operation between upstream and downstream machines. The methods traditionally used in factories have gradually become unable to adapt to the stochastic factory environment: machine deterioration reduces reliability, the effect of maintenance varies with its duration, and machine workload fluctuates. In this study, we simulate an environment that includes an upstream machine and a downstream machine, and formulate the problem as a Markov decision process. Based on this formulation, we use a deep reinforcement learning model to derive the maintenance policy. In addition, we apply the method to a variety of factory environments. Besides obtaining a good policy, we can extract an explanation from the model, understand the key factors behind its decisions, and modify the policy accordingly. | en |
| dc.description.provenance | Made available in DSpace on 2023-03-19T23:44:30Z (GMT). No. of bitstreams: 1 U0001-2508202216421800.pdf: 2110869 bytes, checksum: 32103ca089c9f075b036149a9760aece (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | List of Contents: Chapter 1. Introduction; 1.1. Background and Motivation; 1.2. Research Objective; 1.3. Research Contribution; 1.4. Research Plan; Chapter 2. Literature Review; 2.1. Different assumptions and descriptions about the manufacturing system; 2.2. Different methodologies for the problem; 2.3. Reinforcement learning; Chapter 3. Problem Description and Formulation; 3.1. System description and assumptions; 3.2. Problem statement; 3.3. MDP Formulation for Problem; 3.4. Methodology; Chapter 4. Simulation Study; 4.1. Parameters Setting and Training Process; 4.2. Evaluation of The Learned Policy; 4.3. Overview of Experiment Result; 4.4. Further Observation; Chapter 5. Conclusion and Future Direction; 5.1. Conclusion; 5.2. Future Direction; References. List of Tables: Table 3.1 System notation; Table 3.2 Hyperparameter in Algorithm 1; Table 4.1 Environment parameters; Table 4.2 Parameters in the simulation case; Table 4.3 Model parameters; Table 4.4 Experiment results; Table 4.5 fixed state and variable. List of Figures: Figure 3.1 System architecture; Figure 3.2 Hypervolume; Figure 3.3 MDP Framework; Figure 4.1 Convergence plot; Figure 4.2 Base case's accumulated reward; Figure 4.3 Base case's yield loss; Figure 4.4 Base case's maintenance cost; Figure 4.5 Base case's production loss; Figure 4.6 Other cases' accumulated reward; Figure 4.7 Base case's action trajectory; Figure 4.8 Case 1 action trajectory; Figure 4.9 Case 2 action trajectory; Figure 4.10 Case 3 action trajectory; Figure 4.11 Case 4 action trajectory; Figure 4.12 Case 5 action trajectory; Figure 4.13 Case 6 action trajectory; Figure 4.14 Case 7 action trajectory; Figure 4.15 Case 8 action trajectory; Figure 4.16 Case 9 action trajectory; Figure 4.17 Policy map of base case's downstream machine when upstream machine age is 10; Figure 4.18 Policy map of case 5 downstream machine when upstream machine age is 10; Figure 4.19 Policy map of base case; Figure 4.20 Policy map of case 5 | - |
| dc.language.iso | en | - |
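| dc.subject | 深度強化學習 (deep reinforcement learning) | zh_TW |
| dc.subject | 馬可夫決策過程 (Markov decision process) | zh_TW |
| dc.subject | 多目標強化學習 (multi-objective reinforcement learning) | zh_TW |
| dc.subject | 機台可靠性 (machine reliability) | zh_TW |
| dc.subject | 預防性保養 (preventive maintenance) | zh_TW |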
| dc.subject | machine reliability | en |
| dc.subject | Preventive maintenance | en |
| dc.subject | Deep reinforcement learning | en |
| dc.subject | Markov decision process | en |
| dc.subject | multi-objective RL | en |
| dc.title | 基於深度強化學習之預防性保養政策於上下游可修復的耗損機台 | zh_TW |
| dc.title | Deep reinforcement learning-based preventive maintenance for repairable machines with deterioration in flow line system | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 110-2 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 孔令傑;林怡伶;黃奎隆;吳政鴻 | zh_TW |
| dc.contributor.oralexamcommittee | Ling-Chieh Kung;Yi-Lin Lin;Kwei-Long Huang;Cheng-Hung Wu | en |
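| dc.subject.keyword | 預防性保養 (preventive maintenance), 深度強化學習 (deep reinforcement learning), 馬可夫決策過程 (Markov decision process), 多目標強化學習 (multi-objective reinforcement learning), 機台可靠性 (machine reliability) | zh_TW |
| dc.subject.keyword | Preventive maintenance, Deep reinforcement learning, Markov decision process, multi-objective RL, machine reliability | en |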
| dc.relation.page | 63 | - |
| dc.identifier.doi | 10.6342/NTU202202816 | - |
| dc.rights.note | Authorization granted (open access worldwide) | - |
| dc.date.accepted | 2022-08-30 | - |
| dc.contributor.author-college | College of Management | - |
| dc.contributor.author-dept | Department of Information Management | - |
| dc.date.embargo-lift | 2025-08-25 | - |
| Appears in Collections: | Department of Information Management | |
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-110-2.pdf | 2.06 MB | Adobe PDF | View/Open |
Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.
