Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98941

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 黃道宏 | zh_TW |
| dc.contributor.advisor | Kevin Dowhon Huang | en |
| dc.contributor.author | 潘宇泰 | zh_TW |
| dc.contributor.author | Yu-Tai Pan | en |
| dc.date.accessioned | 2025-08-20T16:22:22Z | - |
| dc.date.available | 2025-08-21 | - |
| dc.date.copyright | 2025-08-20 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-08-13 | - |
| dc.identifier.citation | [1] A. Beck. First-order methods in optimization. SIAM, 2017.
[2] A. Ben-Tal, T. Margalit, and A. Nemirovski. The ordered subsets mirror descent optimization method with applications to tomography. SIAM Journal on Optimization, 12(1):79–108, 2001.
[3] L. Bottou, F. E. Curtis, and J. Nocedal. Optimization methods for large-scale machine learning. SIAM Review, 60(2):223–311, 2018.
[4] S. P. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press, 2004.
[5] L. M. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3):200–217, 1967.
[6] S. Bubeck et al. Convex optimization: Algorithms and complexity. Foundations and Trends® in Machine Learning, 8(3-4):231–357, 2015.
[7] S. Bubeck, Y. T. Lee, and M. Singh. A geometric alternative to Nesterov's accelerated gradient descent. arXiv preprint arXiv:1506.08187, 2015.
[8] A. Cauchy et al. Méthode générale pour la résolution des systèmes d'équations simultanées. Comp. Rend. Sci. Paris, 25(1847):536–538, 1847.
[9] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[10] M. B. Cohen, A. Sidford, and K. Tian. Relative Lipschitzness in extragradient methods and a direct recipe for acceleration. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021), pages 62:1. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2021.
[11] C. Daskalakis, A. Ilyas, V. Syrgkanis, and H. Zeng. Training GANs with optimism. arXiv preprint arXiv:1711.00141, 2017.
[12] A. Defazio, F. Bach, and S. Lacoste-Julien. SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Advances in Neural Information Processing Systems, 27, 2014.
[13] D. Drusvyatskiy, M. Fazel, and S. Roy. An optimal first order method based on optimal quadratic averaging. SIAM Journal on Optimization, 28(1):251–271, 2018.
[14] M. Frank, P. Wolfe, et al. An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1-2):95–110, 1956.
[15] Y.-G. Hsieh, F. Iutzeler, J. Malick, and P. Mertikopoulos. On the convergence of single-call stochastic extra-gradient methods. Advances in Neural Information Processing Systems, 32, 2019.
[16] Y. Jin, A. Sidford, and K. Tian. Sharper rates for separable minimax and finite sum optimization via primal-dual extragradient methods. In Conference on Learning Theory, pages 4362–4415. PMLR, 2022.
[17] R. Johnson and T. Zhang. Accelerating stochastic gradient descent using predictive variance reduction. Advances in Neural Information Processing Systems, 26, 2013.
[18] G. M. Korpelevich. The extragradient method for finding saddle points and other problems. Matecon, 12:747–756, 1976.
[19] S. Kullback and R. A. Leibler. On information and sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, 1951.
[20] S. Lacoste-Julien and M. Jaggi. On the global linear convergence of Frank-Wolfe optimization variants. Advances in Neural Information Processing Systems, 28, 2015.
[21] T. Liang and J. Stokes. Interaction matters: A note on non-asymptotic local convergence of generative adversarial networks. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 907–915. PMLR, 2019.
[22] H. Lu, R. M. Freund, and Y. Nesterov. Relatively smooth convex optimization by first-order methods, and applications. SIAM Journal on Optimization, 28(1):333–354, 2018.
[23] Z.-Q. Luo and P. Tseng. Error bounds and convergence analysis of feasible descent methods: a general approach. Annals of Operations Research, 46(1):157–178, 1993.
[24] O. Mangasarian. Solution of symmetric linear complementarity problems by iterative methods. Journal of Optimization Theory and Applications, 22(4):465–485, 1977.
[25] B. Martinet. Brève communication. Régularisation d'inéquations variationnelles par approximations successives. Revue française d'informatique et de recherche opérationnelle. Série rouge, 4(R3):154–158, 1970.
[26] A. Mokhtari, A. Ozdaglar, and S. Pattathil. A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach. In International Conference on Artificial Intelligence and Statistics, pages 1497–1507. PMLR, 2020.
[27] A. Mokhtari, A. E. Ozdaglar, and S. Pattathil. Convergence rate of O(1/k) for optimistic gradient and extragradient methods in smooth convex-concave saddle point problems. SIAM Journal on Optimization, 30(4):3230–3251, 2020.
[28] A. Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229–251, 2004.
[29] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4):1574–1609, 2009.
[30] A. S. Nemirovskij and D. B. Yudin. Problem complexity and method efficiency in optimization. 1983.
[31] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2). In Dokl. Akad. Nauk SSSR, volume 269, page 543, 1983.
[32] Y. Nesterov. Dual extrapolation and its applications to solving variational inequalities and related problems. Mathematical Programming, 109(2):319–344, 2007.
[33] Y. Nesterov. Introductory lectures on convex optimization: A basic course, volume 87. Springer Science & Business Media, 2013.
[34] L. D. Popov. A modification of the Arrow-Hurwicz method of search for saddle points. Mat. Zametki, 28(5):777–784, 1980.
[35] H. Robbins and S. Monro. A stochastic approximation method. The Annals of Mathematical Statistics, pages 400–407, 1951.
[36] R. T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization, 14(5):877–898, 1976.
[37] R. T. Rockafellar. Convex Analysis (PMS-28). 2015.
[38] F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386, 1958.
[39] M. Schmidt, N. Le Roux, and F. Bach. Minimizing finite sums with the stochastic average gradient. Mathematical Programming, 162:83–112, 2017.
[40] M. Sibony. Méthodes itératives pour les équations et inéquations aux dérivées partielles non linéaires de type monotone. Calcolo, 7:65–183, 1970.
[41] P. Tseng. On linear convergence of iterative methods for the variational inequality problem. Journal of Computational and Applied Mathematics, 60(1-2):237–252, 1995.
[42] N. Wang and S. Zhang. A gradient complexity analysis for minimizing the sum of strongly convex functions with varying condition numbers. SIAM Journal on Optimization, 34(2):1374–1401, 2024.
[43] J. Zhang, M. Hong, and S. Zhang. On lower iteration complexity bounds for the convex concave saddle point problems. Mathematical Programming, 194(1):901–935, 2022.
[44] Y. Zhang and L. Xiao. Stochastic primal-dual coordinate method for regularized empirical risk minimization. Journal of Machine Learning Research, 18(84):1–42, 2017.
[45] K. Zhou, Q. Ding, F. Shang, J. Cheng, D. Li, and Z.-Q. Luo. Direct acceleration of SAGA using sampled negative momentum. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 1602–1610. PMLR, 2019. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98941 | - |
| dc.description.abstract | 本研究從[10]所提出的相對利普希茨觀點出發,提出兩種基於樂觀梯度下降上升法(Optimistic Gradient Descent Ascent Method, OGDA)的演算法,分別對應於一般最佳化問題與有限總和最佳化問題。
第一個演算法針對光滑且強凸的目標函數,透過Fenchel共軛轉換將原問題改寫為鞍點問題,再把OGDA詮釋為近端算法(Proximal Point)的一種變形。我們證明此方法可達到一階方法中最佳的迭代複雜度O(√(L/μ)log(1/ε))。 第二個演算法設計於有限總和形式的最佳化問題中,每次迭代僅隨機抽取一個光滑的子函數進行更新。我們構造了一種隨機版本的OGDA,並證明在各子函數具有不同平滑常數L_i的情況下,仍能達到收斂速率O((n+(Σ_{i=1}^n √(L_i))/√(nμ))*log((γε_0)/ε)),其中γ:=1+(Σ_{i=1}^n L_i)/(nμ)。 最後,我們設計問題進行實驗驗證。結果顯示提出的OGDA變體具加速收斂效果。 | zh_TW |
| dc.description.abstract | In this research, we develop and analyze two variants of the Optimistic Gradient Descent Ascent (OGDA) algorithm through the lens of relative Lipschitzness, a structural condition introduced by [10]. We first consider minimization problems that are strongly convex with Lipschitz-continuous gradients, and reformulate them as structured convex-concave saddle-point problems using Fenchel duality. Within this formulation, we show that OGDA achieves the optimal iteration complexity of O(√(L/μ)log(1/ε)).
Inspired by the finite-sum framework studied in [16], the second algorithm targets the finite-sum setting, where the objective function is expressed as an average of n convex component functions with Lipschitz continuous gradients. We develop a new algorithm based on OGDA, mirroring the structure used in the single-function setting. Our analysis demonstrates that in the finite-sum case, the proposed method achieves a convergence rate of O((n+(Σ_{i=1}^n √(L_i))/√(nμ))*log((γε_0)/ε)), where γ:=1+(Σ_{i=1}^n L_i)/(nμ). Finally, we validate the effectiveness of our approach through numerical experiments. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-20T16:22:22Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-20T16:22:22Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Contents
Acknowledgements .......... i
摘要 .......... ii
Abstract .......... iii
Contents .......... v
List of Figures .......... vii
Chapter 1 Introduction .......... 1
Chapter 2 Literature Review .......... 6
2.1 First-Order Methods for Optimization Problems .......... 6
2.2 Methods for Variational Inequalities Problems .......... 11
2.3 Finite-Sum Optimization .......... 15
Chapter 3 OGDA for Solving Optimization through a Fenchel Dual Reformulation .......... 18
3.1 Preliminaries .......... 19
3.1.1 Problem Setup .......... 19
3.1.2 Algorithm .......... 21
3.2 Convergence Analysis .......... 23
Chapter 4 OGDA for Solving Finite-Sum Optimization through a Fenchel Dual Reformulation .......... 31
4.1 Preliminaries .......... 32
4.1.1 Problem Setup .......... 32
4.1.2 Algorithm .......... 34
4.2 Convergence Analysis .......... 37
Chapter 5 Numerical Experiments .......... 52
5.1 OGDA via Fenchel Dual Reformulation .......... 53
5.2 Finite-Sum Setting: Stochastic OGDA via Fenchel Dual Reformulation .......... 58
Chapter 6 Conclusion .......... 61
References .......... 63 | - |
| dc.language.iso | en | - |
| dc.subject | 凸最佳化 | zh_TW |
| dc.subject | Fenchel轉換 | zh_TW |
| dc.subject | 有限總和最佳化 | zh_TW |
| dc.subject | 隨機最佳化 | zh_TW |
| dc.subject | 加速算法 | zh_TW |
| dc.subject | Fenchel transformation | en |
| dc.subject | acceleration | en |
| dc.subject | convex optimization | en |
| dc.subject | stochastic optimization | en |
| dc.subject | finite-sum optimization | en |
| dc.title | 樂觀梯度下降上升法於相對利普希茨條件下求解最佳化問題之研究 | zh_TW |
| dc.title | A Research on Relative Lipschitzness in Optimistic Gradient Descent Ascent Method for Solving Optimization Problems | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 李彥寰;陳柏安;黃奎隆 | zh_TW |
| dc.contributor.oralexamcommittee | Yen-Huan Li;Po-An Chen;Kuei-Lung Huang | en |
| dc.subject.keyword | 凸最佳化,Fenchel轉換,有限總和最佳化,隨機最佳化,加速算法, | zh_TW |
| dc.subject.keyword | convex optimization, Fenchel transformation, finite-sum optimization, stochastic optimization, acceleration | en |
| dc.relation.page | 68 | - |
| dc.identifier.doi | 10.6342/NTU202504117 | - |
| dc.rights.note | Authorization granted (worldwide open access) | - |
| dc.date.accepted | 2025-08-14 | - |
| dc.contributor.author-college | 工學院 | - |
| dc.contributor.author-dept | 工業工程學研究所 | - |
| dc.date.embargo-lift | 2025-08-21 | - |
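
As a reading aid for the abstract above, the following is a minimal sketch of the Fenchel-dual saddle-point reformulation and of the standard optimistic gradient (OGDA) update it refers to. The notation ($z_k$, $F$, the step size $\eta$) is illustrative and not taken from the thesis; the thesis's actual step-size choices and its stochastic finite-sum variant are not reproduced here.

```latex
% Minimal sketch (illustrative notation, not the thesis's exact scheme):
% an L-smooth, mu-strongly convex objective f is rewritten as a
% convex-concave saddle-point problem via its Fenchel conjugate, and the
% standard optimistic gradient (OGDA) update is applied to the joint iterate.
\begin{align*}
  f^{*}(y) &:= \sup_{x}\bigl\{\langle x, y\rangle - f(x)\bigr\}, \\
  \min_{x} f(x)
           &= \min_{x}\max_{y}\;\bigl[\langle x, y\rangle - f^{*}(y)\bigr]
           =: \min_{x}\max_{y}\; \mathcal{L}(x, y).
\end{align*}
With the joint iterate $z_k := (x_k, y_k)$ and the saddle operator
\begin{align*}
  F(z) := \bigl(\nabla_x \mathcal{L}(x, y),\; -\nabla_y \mathcal{L}(x, y)\bigr)
        = \bigl(y,\; \nabla f^{*}(y) - x\bigr),
\end{align*}
the optimistic gradient (OGDA) update with step size $\eta > 0$ reads
\begin{align*}
  z_{k+1} := z_k - \eta\,\bigl(2F(z_k) - F(z_{k-1})\bigr).
\end{align*}
```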
| Appears in Collections: | 工業工程學研究所 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf | 895.64 kB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
