Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93181
Full metadata record

DC Field: Value (Language)
dc.contributor.advisor: 李家岩 (zh_TW)
dc.contributor.advisor: Chia-Yen Lee (en)
dc.contributor.author: 黃奕滔 (zh_TW)
dc.contributor.author: Yi-Tao Huang (en)
dc.date.accessioned: 2024-07-23T16:09:55Z
dc.date.available: 2024-07-24
dc.date.copyright: 2024-07-23
dc.date.issued: 2023
dc.date.submitted: 2024-07-19
dc.identifier.citation:
Andriushchenko, M., & Flammarion, N. (2020). Understanding and Improving Fast Adversarial Training. Advances in Neural Information Processing Systems, 33, 16048–16059. https://proceedings.neurips.cc/paper/2020/hash/b8ce47761ed7b3b6f48b583350b7f9e4-Abstract.html
Ben-Tal, A., & Nemirovski, A. (1999). Robust solutions of uncertain linear programs. Operations Research Letters, 25(1), 1–13. https://doi.org/10.1016/S0167-6377(99)00016-4
Bertsimas, D., Brown, D. B., & Caramanis, C. (2011). Theory and Applications of Robust Optimization. SIAM Review, 53(3), 464–501. https://doi.org/10.1137/080734510
Bertsimas, D., & Sim, M. (2004). The Price of Robustness. Operations Research, 52(1), 35–53. https://doi.org/10.1287/opre.1030.0065
Bonfill, A., Bagajewicz, M., Espuña, A., & Puigjaner, L. (2004). Risk Management in the Scheduling of Batch Plants under Uncertain Market Demand. Industrial & Engineering Chemistry Research, 43(3), 741.
Bonfill, A., Espuña, A., & Puigjaner, L. (2005). Addressing Robustness in Scheduling Batch Processes with Uncertain Operation Times. Industrial & Engineering Chemistry Research, 44(5), 1524–1534. https://doi.org/10.1021/ie049732g
Chang, J., Yu, D., Hu, Y., He, W., & Yu, H. (2022). Deep Reinforcement Learning for Dynamic Flexible Job Shop Scheduling with Random Job Arrival. Processes, 10, 760. https://doi.org/10.3390/pr10040760
Chassein, A., & Goerigk, M. (2016). Performance Analysis in Robust Optimization. In M. Doumpos, C. Zopounidis, & E. Grigoroudis (Eds.), Robustness Analysis in Decision Aiding, Optimization, and Analytics (Vol. 241, pp. 145–170). Springer International Publishing. https://doi.org/10.1007/978-3-319-33121-8_7
Grossmann, I. E., Apap, R. M., Calfa, B. A., García-Herreros, P., & Zhang, Q. (2016). Recent advances in mathematical programming techniques for the optimization of process systems under uncertainty. Computers & Chemical Engineering, 91, 3–14. https://doi.org/10.1016/j.compchemeng.2016.03.002
Gupta, D., & Maravelias, C. T. (2016). On deterministic online scheduling: Major considerations, paradoxes and remedies. Computers & Chemical Engineering, 94, 312–330. https://doi.org/10.1016/j.compchemeng.2016.08.006
Gurobi Optimization LLC. (n.d.). Gurobi Optimizer Reference Manual.
Han, D., Kozuno, T., Luo, X., Chen, Z., Doya, K., Yang, Y., & Li, D. (2022, March 16). Variational oracle guiding for reinforcement learning.
van Hoek, R. (2020). Research opportunities for a more resilient post-COVID-19 supply chain – closing the gap between research findings and industry practice. International Journal of Operations & Production Management, ahead-of-print. https://doi.org/10.1108/IJOPM-03-2020-0165
Hu, Z., Ramaraj, G., & Hu, G. (2020). Production planning with a two-stage stochastic programming model in a kitting facility under demand and yield uncertainties. International Journal of Management Science and Engineering Management, 15(3), 237–246. https://doi.org/10.1080/17509653.2019.1710301
Hubbs, C. D., Li, C., Sahinidis, N. V., Grossmann, I. E., & Wassick, J. M. (2020). A deep reinforcement learning approach for chemical production scheduling. Computers & Chemical Engineering, 141, 106982. https://doi.org/10.1016/j.compchemeng.2020.106982
Janak, S. L., Floudas, C. A., Kallrath, J., & Vormbrock, N. (2006). Production Scheduling of a Large-Scale Industrial Batch Plant. II. Reactive Scheduling. Industrial & Engineering Chemistry Research, 45(25), 8253–8269. https://doi.org/10.1021/ie0600590
Janak, S. L., Lin, X., & Floudas, C. A. (2007). A new robust optimization approach for scheduling under uncertainty: II. Uncertainty with known probability distribution. Computers & Chemical Engineering, 31(3), 171–195. https://doi.org/10.1016/j.compchemeng.2006.05.035
Jia, Z., & Ierapetritou, M. (2006a). Generate Pareto Optimal Solutions of Scheduling Problems Using Normal Boundary Intersection Technique. Computers & Chemical Engineering, 1, 268–280.
Jia, Z., & Ierapetritou, M. (2006b). Uncertainty analysis on the righthand side for MILP problems. AIChE Journal, 52, 2486–2495. https://doi.org/10.1002/aic.10842
Jung, J. Y., Blau, G., Pekny, J. F., Reklaitis, G. V., & Eversdyk, D. (2004). A simulation based optimization approach to supply chain management under demand uncertainty. Computers & Chemical Engineering, 28(10), 2087–2106. https://doi.org/10.1016/j.compchemeng.2004.06.006
Lee, C.-Y., Huang, Y.-T., & Chen, P.-J. (2024). Robust-optimization-guiding deep reinforcement learning for chemical material production scheduling. Computers & Chemical Engineering, 187, 108745. https://doi.org/10.1016/j.compchemeng.2024.108745
Lenstra, J. K., Rinnooy Kan, A. H. G., & Brucker, P. (1977). Complexity of Machine Scheduling Problems. In P. L. Hammer, E. L. Johnson, B. H. Korte, & G. L. Nemhauser (Eds.), Annals of Discrete Mathematics (Vol. 1, pp. 343–362). Elsevier. https://doi.org/10.1016/S0167-5060(08)70743-X
Li, J., Koyamada, S., Ye, Q., Liu, G., Wang, C., Yang, R., Zhao, L., Qin, T., Liu, T.-Y., & Hon, H.-W. (2020). Suphx: Mastering Mahjong with Deep Reinforcement Learning (arXiv:2003.13590). arXiv. http://arxiv.org/abs/2003.13590
Li, Z., & Ierapetritou, M. (2008). Process scheduling under uncertainty: Review and challenges. Computers & Chemical Engineering, 32, 715–727. https://doi.org/10.1016/j.compchemeng.2007.03.001
Lin, X., Janak, S. L., & Floudas, C. A. (2004). A new robust optimization approach for scheduling under uncertainty: I. Bounded uncertainty. Computers & Chemical Engineering, 28(6), 1069–1085. https://doi.org/10.1016/j.compchemeng.2003.09.020
Mihoubi, B., Bouzouia, B., & Gaham, M. (2021). Reactive scheduling approach for solving a realistic flexible job shop scheduling problem. International Journal of Production Research, 59(19), 5790–5808. https://doi.org/10.1080/00207543.2020.1790686
Nemirovski, A. (2019). Lectures on Robust Convex Optimization [Dataset]. https://doi.org/10.1287/e356790b-ddcc-4920-a645-a2d08c6334bb
Petrovic, D., & Duenas, A. (2006). A fuzzy logic based production scheduling/rescheduling in the presence of uncertain disruptions. Fuzzy Sets and Systems, 157(16), 2273–2285. https://doi.org/10.1016/j.fss.2006.04.009
Raghunathan, A., Xie, S. M., Yang, F., Duchi, J. C., & Liang, P. (2019). Adversarial Training Can Hurt Generalization (arXiv:1906.06032). arXiv. http://arxiv.org/abs/1906.06032
Riedmiller, S., & Riedmiller, M. (n.d.). A neural reinforcement learning approach to learn local dispatching policies in production scheduling.
Sahinidis, N. V. (2004). Optimization under uncertainty: State-of-the-art and opportunities. Computers & Chemical Engineering, 28(6–7), 971–983. https://doi.org/10.1016/j.compchemeng.2003.09.017
Sand, G., & Engell, S. (2004). Modeling and solving real-time scheduling problems by stochastic integer programming. Computers & Chemical Engineering, 28(6), 1087–1103. https://doi.org/10.1016/j.compchemeng.2003.09.009
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., & Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489. https://doi.org/10.1038/nature16961
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. MIT Press.
Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (1999). Policy Gradient Methods for Reinforcement Learning with Function Approximation. Advances in Neural Information Processing Systems, 12. https://proceedings.neurips.cc/paper/1999/hash/464d828b85b0bed98e80ade0a5c43b0f-Abstract.html
Zhang, S., & Sutton, R. S. (2018). A Deeper Look at Experience Replay (arXiv:1712.01275). arXiv. https://doi.org/10.48550/arXiv.1712.01275
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93181
dc.description.abstract: In this thesis, we propose a new reinforcement-learning training framework for the dynamic scheduling of a single-stage multi-product chemical reactor. Through cooperation with optimization methods, the approach mitigates the local-optimum problem encountered during policy-gradient convergence, improving both the objective value and the stability of the RL model. In addition, by adopting robust optimization as the guiding engine, we further improve the model's robustness to parameter errors, promoting a more conservative decision-making style under unstable demand and production efficiency. Experimental results demonstrate the effectiveness of the proposed method: under identical parameter settings, it improves on a plain actor-critic method across multiple metrics.
This study contributes to the integration of deep reinforcement learning and optimization methods, and the proposed training framework can also be applied to problems in other domains. (zh_TW)
dc.description.abstract: In this paper, we present a novel guiding framework for the training phase of Reinforcement Learning (RL) models, specifically tailored for the dynamic scheduling of a single-stage multi-product chemical reactor. Our approach addresses the challenge of local optima in policy-gradient methods by integrating optimization methods, enhancing both the objective value and the stability of the RL model. We further improve the robustness of our model against parameter distortions by incorporating robust optimization as a guiding engine, promoting a more conservative decision-making style. Our experimental results demonstrate the efficacy of the approach, with improvements across several metrics compared to a plain Actor-Critic baseline.
Our work thus advances the integration of Deep Reinforcement Learning and optimization methods, opening new avenues for research and application in dynamic scheduling and beyond. (en)
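The abstract describes mixing policy-gradient updates with guidance from a robust-optimization oracle. As a minimal sketch of that idea only (not the thesis's actual implementation, whose guiding engine is a robust optimization model of the reactor schedule), the update below combines a softmax-policy gradient term with an imitation term that pulls the policy toward the oracle's action. The names `robust_oracle_action`, `guided_policy_update`, and the weight `guide_w` are hypothetical, introduced for this sketch:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def robust_oracle_action(state, n_actions):
    # Stub for the robust-optimization oracle: in the thesis this would be a
    # robust scheduling model solved over the current state; here we simply
    # pick a fixed conservative action so the sketch stays self-contained.
    return int(np.argmin(state[:n_actions]))

def guided_policy_update(theta, state, action, advantage, lr=0.1, guide_w=0.5):
    """One update of a linear-softmax policy pi(a|s) = softmax(theta @ s):

    grad = advantage * d log pi(action|s)/d theta
         + guide_w  * d log pi(a_oracle|s)/d theta
    """
    n_actions = theta.shape[0]
    probs = softmax(theta @ state)
    # Policy-gradient term for the sampled action:
    # d log pi(a|s)/d theta[k] = (1{k == a} - probs[k]) * state.
    grad_logp = -np.outer(probs, state)
    grad_logp[action] += state
    # Imitation term nudging the policy toward the oracle's action.
    a_star = robust_oracle_action(state, n_actions)
    grad_guide = -np.outer(probs, state)
    grad_guide[a_star] += state
    return theta + lr * (advantage * grad_logp + guide_w * grad_guide)
```

With `guide_w = 0` this reduces to a plain policy-gradient step; increasing `guide_w` trades exploration against imitating the conservative oracle, which is the mechanism the abstract credits for escaping local optima.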
dc.description.tableofcontents:
Abstract ii
摘要 i
Table of contents iii
List of figures v
List of tables vi
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objective 4
1.3 Thesis Architecture 5
Chapter 2 Literature Review 6
2.1 Scheduling with Uncertainties 6
2.2 Chemical Production Scheduling 9
2.3 Oracle Guiding DRL 12
Chapter 3 Methodologies 15
3.1 Research Framework 15
3.2 Problem Definition 16
3.2.1 Problem Description 16
3.2.2 Mathematical Model 18
3.3 Reinforcement learning model 24
3.3.1 Model design 24
3.3.2 State definition 26
3.3.3 Reward definition 27
3.4 Proposed OR-Guiding Algorithm 27
3.4.1 Training Workflow 27
3.4.2 Algorithm Implementation 29
Chapter 4 Numerical Studies 34
4.1 Environment Setup 34
4.2 Data Simulation 35
4.2.1 Basic Configurations 35
4.2.2 Parameter Definitions 35
4.3 Model Description and Sensitivity Analysis 39
4.3.1 Model Descriptions 39
4.3.2 Sensitivity Analysis 40
4.4 Solution Value Analysis 43
4.4.1 Descriptive Statistics 43
4.4.2 Evaluation Metrics 45
4.5 Execution Time Analysis 48
4.6 Gantt Similarity Analysis 50
Chapter 5 Conclusion and Future Works 53
5.1 Conclusion 53
5.2 Future Works 53
References 55
dc.language.iso: en
dc.title: 使用穩健最佳化作為深度強化學習指引 – 以化學物料生產排程作為應用案例 (zh_TW)
dc.title: Robust Optimization as an Oracle Guiding Method for Deep Reinforcement Learning – an Application in Chemical Material Production Scheduling Problem (en)
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 吳政鴻;魏志平;孔令傑 (zh_TW)
dc.contributor.oralexamcommittee: Cheng-Hung Wu;Chih-Ping Wei;Ling-Chieh Kung (en)
dc.subject.keyword: 製造排程, 強化學習, 穩健最佳化, 化工製造, 隨機規劃 (zh_TW)
dc.subject.keyword: production scheduling, reinforcement learning, robust optimization, chemical production, stochastic programming (en)
dc.relation.page: 61
dc.identifier.doi: 10.6342/NTU202401914
dc.rights.note: Open access granted (worldwide)
dc.date.accepted: 2024-07-19
dc.contributor.author-college: 管理學院 (College of Management)
dc.contributor.author-dept: 資訊管理學系 (Department of Information Management)
Appears in collections: 資訊管理學系 (Department of Information Management)

Files in this item:
ntu-112-2.pdf (1.58 MB, Adobe PDF)