  1. NTU Theses and Dissertations Repository
  2. College of Management
  3. Department of Information Management
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86245
Full metadata record
DC field: value [language]
dc.contributor.advisor: 李家岩 [zh_TW]
dc.contributor.advisor: Chia-Yen Lee [en]
dc.contributor.author: 沈宏頴 [zh_TW]
dc.contributor.author: Hong-Ying Shen [en]
dc.date.accessioned: 2023-03-19T23:44:30Z
dc.date.available: 2023-12-27
dc.date.copyright: 2022-09-07
dc.date.issued: 2022
dc.date.submitted: 2002-01-01
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86245
dc.description.abstract: 製造業中,預防性保養一直以來都是一項重要的議題。保養牽涉到了工廠中許多不同面向的問題,比如說良率、機台可靠性、產能穩定性等問題。如果不重視保養策略的制定,工廠將可能導致良率低落、機台無預警的故障以及上下游間機台的運作不順。而過去在工廠中常使用的方法已經逐漸無法適應複雜且具有隨機性的工廠環境,比方說機台的老化造成的機台可靠性不佳、保養時間的長短的效果不同、機台工作量容易變動等。因此在許多的研究中,如何得到良好的保養策略變成一項重要的研究項目。在本篇研究中,利用模擬的方法來還原工廠中上下游機台的環境,並且利用馬可夫決策過程來轉化問題。基於馬可夫決策過程,我們使用了深度強化學習模型來得到保養策略。除此之外,本篇研究模擬了多種不同的工廠環境以進行模型訓練,並為模型找出可解釋能力,使得我們除了能夠利用模型得到一個良好的保養策略以外,更能夠從中理解模型進行決策的關鍵並從中修正策略。 [zh_TW]
dc.description.abstract: Preventive maintenance has long been an important issue in manufacturing. Maintenance affects many aspects of a factory, such as yield, machine reliability, and capacity stability. If maintenance strategy is neglected, a factory may suffer low yield, unexpected machine failures, and poorly coordinated operation between upstream and downstream machines. The methods traditionally used in factories have gradually become unable to cope with a complex, stochastic factory environment: machine deterioration lowers reliability, maintenance of different durations has different effects, and machine workload varies. In this study, we simulate an environment consisting of an upstream machine and a downstream machine and formulate the problem as a Markov decision process. Based on this formulation, we use a deep reinforcement learning model to derive the maintenance policy. In addition, we apply the method to a variety of simulated factory environments. Beyond obtaining a good policy, we also extract an explanation from the model, which lets us understand the key factors behind its decisions and revise the policy accordingly. [en]
dc.description.provenance: Made available in DSpace on 2023-03-19T23:44:30Z (GMT). No. of bitstreams: 1
U0001-2508202216421800.pdf: 2110869 bytes, checksum: 32103ca089c9f075b036149a9760aece (MD5)
Previous issue date: 2022 [en]
dc.description.tableofcontents:
List of Contents
Chapter 1. Introduction
1.1. Background and Motivation
1.2. Research Objective
1.3. Research Contribution
1.4. Research Plan
Chapter 2. Literature Review
2.1. Different assumptions and descriptions about the manufacturing system
2.2. Different methodologies for the problem
2.3. Reinforcement learning
Chapter 3. Problem Description and Formulation
3.1. System description and assumptions
3.2. Problem statement
3.3. MDP Formulation for Problem
3.4. Methodology
Chapter 4. Simulation Study
4.1. Parameters Setting and Training Process
4.2. Evaluation of The Learned Policy
4.3. Overview of Experiment Result
4.4. Further Observation
Chapter 5. Conclusion and Future Direction
5.1. Conclusion
5.2. Future Direction
References

List of Tables
Table 3.1 System notation
Table 3.2 Hyperparameter in Algorithm 1
Table 4.1 Environment parameters
Table 4.2 Parameters in the simulation case
Table 4.3 Model parameters
Table 4.4 Experiment results
Table 4.5 Fixed state and variable

List of Figures
Figure 3.1 System architecture
Figure 3.2 Hypervolume
Figure 3.3 MDP Framework
Figure 4.1 Convergence plot
Figure 4.2 Base case's accumulated reward
Figure 4.3 Base case's yield loss
Figure 4.4 Base case's maintenance cost
Figure 4.5 Base case's production loss
Figure 4.6 Other cases' accumulated reward
Figure 4.7 Base case's action trajectory
Figure 4.8 Case 1 action trajectory
Figure 4.9 Case 2 action trajectory
Figure 4.10 Case 3 action trajectory
Figure 4.11 Case 4 action trajectory
Figure 4.12 Case 5 action trajectory
Figure 4.13 Case 6 action trajectory
Figure 4.14 Case 7 action trajectory
Figure 4.15 Case 8 action trajectory
Figure 4.16 Case 9 action trajectory
Figure 4.17 Policy map of base case's downstream machine when upstream machine age is 10
Figure 4.18 Policy map of case 5 downstream machine when upstream machine age is 10
Figure 4.19 Policy map of base case
Figure 4.20 Policy map of case 5
dc.language.iso: en
dc.subject: 深度強化學習 [zh_TW]
dc.subject: 馬可夫決策過程 [zh_TW]
dc.subject: 多目標強化學習 [zh_TW]
dc.subject: 機台可靠性 [zh_TW]
dc.subject: 預防性保養 [zh_TW]
dc.subject: machine reliability [en]
dc.subject: Preventive maintenance [en]
dc.subject: Deep reinforcement learning [en]
dc.subject: Markov decision process [en]
dc.subject: multi-objective RL [en]
dc.title: 基於深度強化學習之預防性保養政策於上下游可修復的耗損機台 [zh_TW]
dc.title: Deep reinforcement learning-based preventive maintenance for repairable machines with deterioration in flow line system [en]
dc.type: Thesis
dc.date.schoolyear: 110-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 孔令傑; 林怡伶; 黃奎隆; 吳政鴻 [zh_TW]
dc.contributor.oralexamcommittee: Ling-Chieh Kung; Yi-Lin Lin; Kwei-Long Huang; Cheng-Hung Wu [en]
dc.subject.keyword: 預防性保養, 深度強化學習, 馬可夫決策過程, 多目標強化學習, 機台可靠性 [zh_TW]
dc.subject.keyword: Preventive maintenance, Deep reinforcement learning, Markov decision process, multi-objective RL, machine reliability [en]
dc.relation.page: 63
dc.identifier.doi: 10.6342/NTU202202816
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2022-08-30
dc.contributor.author-college: College of Management
dc.contributor.author-dept: Department of Information Management
dc.date.embargo-lift: 2025-08-25
Appears in collections: Department of Information Management

Files in this item:
ntu-110-2.pdf (2.06 MB, Adobe PDF)