Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98941

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 黃道宏 | zh_TW |
| dc.contributor.advisor | Kevin Dowhon Huang | en |
| dc.contributor.author | 潘宇泰 | zh_TW |
| dc.contributor.author | Yu-Tai Pan | en |
| dc.date.accessioned | 2025-08-20T16:22:22Z | - |
| dc.date.available | 2025-08-21 | - |
| dc.date.copyright | 2025-08-20 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-08-13 | - |
| dc.identifier.citation | [1] A. Beck. First-order methods in optimization. SIAM, 2017.
[2] A. Ben-Tal, T. Margalit, and A. Nemirovski. The ordered subsets mirror descent optimization method with applications to tomography. SIAM Journal on Optimization, 12(1):79–108, 2001.
[3] L. Bottou, F. E. Curtis, and J. Nocedal. Optimization methods for large-scale machine learning. SIAM Review, 60(2):223–311, 2018.
[4] S. P. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press, 2004.
[5] L. M. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3):200–217, 1967.
[6] S. Bubeck et al. Convex optimization: Algorithms and complexity. Foundations and Trends® in Machine Learning, 8(3-4):231–357, 2015.
[7] S. Bubeck, Y. T. Lee, and M. Singh. A geometric alternative to Nesterov's accelerated gradient descent. arXiv preprint arXiv:1506.08187, 2015.
[8] A. Cauchy et al. Méthode générale pour la résolution des systèmes d'équations simultanées. Comp. Rend. Sci. Paris, 25(1847):536–538, 1847.
[9] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[10] M. B. Cohen, A. Sidford, and K. Tian. Relative Lipschitzness in extragradient methods and a direct recipe for acceleration. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021), pages 62:1. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2021.
[11] C. Daskalakis, A. Ilyas, V. Syrgkanis, and H. Zeng. Training GANs with optimism. arXiv preprint arXiv:1711.00141, 2017.
[12] A. Defazio, F. Bach, and S. Lacoste-Julien. SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Advances in Neural Information Processing Systems, 27, 2014.
[13] D. Drusvyatskiy, M. Fazel, and S. Roy. An optimal first order method based on optimal quadratic averaging. SIAM Journal on Optimization, 28(1):251–271, 2018.
[14] M. Frank, P. Wolfe, et al. An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1-2):95–110, 1956.
[15] Y.-G. Hsieh, F. Iutzeler, J. Malick, and P. Mertikopoulos. On the convergence of single-call stochastic extra-gradient methods. Advances in Neural Information Processing Systems, 32, 2019.
[16] Y. Jin, A. Sidford, and K. Tian. Sharper rates for separable minimax and finite sum optimization via primal-dual extragradient methods. In Conference on Learning Theory, pages 4362–4415. PMLR, 2022.
[17] R. Johnson and T. Zhang. Accelerating stochastic gradient descent using predictive variance reduction. Advances in Neural Information Processing Systems, 26, 2013.
[18] G. M. Korpelevich. The extragradient method for finding saddle points and other problems. Matecon, 12:747–756, 1976.
[19] S. Kullback and R. A. Leibler. On information and sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, 1951.
[20] S. Lacoste-Julien and M. Jaggi. On the global linear convergence of Frank-Wolfe optimization variants. Advances in Neural Information Processing Systems, 28, 2015.
[21] T. Liang and J. Stokes. Interaction matters: A note on non-asymptotic local convergence of generative adversarial networks. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 907–915. PMLR, 2019.
[22] H. Lu, R. M. Freund, and Y. Nesterov. Relatively smooth convex optimization by first-order methods, and applications. SIAM Journal on Optimization, 28(1):333–354, 2018.
[23] Z.-Q. Luo and P. Tseng. Error bounds and convergence analysis of feasible descent methods: a general approach. Annals of Operations Research, 46(1):157–178, 1993.
[24] O. Mangasarian. Solution of symmetric linear complementarity problems by iterative methods. Journal of Optimization Theory and Applications, 22(4):465–485, 1977.
[25] B. Martinet. Brève communication. Régularisation d'inéquations variationnelles par approximations successives. Revue française d'informatique et de recherche opérationnelle. Série rouge, 4(R3):154–158, 1970.
[26] A. Mokhtari, A. Ozdaglar, and S. Pattathil. A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach. In International Conference on Artificial Intelligence and Statistics, pages 1497–1507. PMLR, 2020.
[27] A. Mokhtari, A. E. Ozdaglar, and S. Pattathil. Convergence rate of O(1/k) for optimistic gradient and extragradient methods in smooth convex-concave saddle point problems. SIAM Journal on Optimization, 30(4):3230–3251, 2020.
[28] A. Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization, 15(1):229–251, 2004.
[29] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4):1574–1609, 2009.
[30] A. S. Nemirovskij and D. B. Yudin. Problem complexity and method efficiency in optimization. 1983.
[31] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2). In Dokl. Akad. Nauk SSSR, volume 269, page 543, 1983.
[32] Y. Nesterov. Dual extrapolation and its applications to solving variational inequalities and related problems. Mathematical Programming, 109(2):319–344, 2007.
[33] Y. Nesterov. Introductory lectures on convex optimization: A basic course, volume 87. Springer Science & Business Media, 2013.
[34] L. D. Popov. A modification of the Arrow-Hurwicz method of search for saddle points. Mat. Zametki, 28(5):777–784, 1980.
[35] H. Robbins and S. Monro. A stochastic approximation method. The Annals of Mathematical Statistics, pages 400–407, 1951.
[36] R. T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization, 14(5):877–898, 1976.
[37] R. T. Rockafellar. Convex Analysis (PMS-28). 2015.
[38] F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386, 1958.
[39] M. Schmidt, N. Le Roux, and F. Bach. Minimizing finite sums with the stochastic average gradient. Mathematical Programming, 162:83–112, 2017.
[40] M. Sibony. Méthodes itératives pour les équations et inéquations aux dérivées partielles non linéaires de type monotone. Calcolo, 7:65–183, 1970.
[41] P. Tseng. On linear convergence of iterative methods for the variational inequality problem. Journal of Computational and Applied Mathematics, 60(1-2):237–252, 1995.
[42] N. Wang and S. Zhang. A gradient complexity analysis for minimizing the sum of strongly convex functions with varying condition numbers. SIAM Journal on Optimization, 34(2):1374–1401, 2024.
[43] J. Zhang, M. Hong, and S. Zhang. On lower iteration complexity bounds for the convex concave saddle point problems. Mathematical Programming, 194(1):901–935, 2022.
[44] Y. Zhang and L. Xiao. Stochastic primal-dual coordinate method for regularized empirical risk minimization. Journal of Machine Learning Research, 18(84):1–42, 2017.
[45] K. Zhou, Q. Ding, F. Shang, J. Cheng, D. Li, and Z.-Q. Luo. Direct acceleration of SAGA using sampled negative momentum. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 1602–1610. PMLR, 2019. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98941 | - |
| dc.description.abstract | 本研究從[10]所提出的相對利普希茨觀點出發,提出兩種基於樂觀梯度下降上升法(Optimistic Gradient Descent Ascent Method, OGDA)的演算法,分別對應於一般最佳化問題與有限總和最佳化問題。
第一個演算法針對光滑且強凸的目標函數,透過Fenchel共軛轉換將原問題改寫為鞍點問題,再把OGDA詮釋為近端算法(Proximal Point)的一種變形。我們證明此方法可達到一階方法中最佳的迭代複雜度O(√(L/μ)log(1/ε))。 第二個演算法設計於有限總和形式的最佳化問題中,每次迭代僅隨機抽取一個光滑的子函數進行更新。我們構造了一種隨機版本的OGDA,並證明在各子函數具有不同平滑常數L_i的情況下,仍能達到收斂速率O((n+(Σ_{i=1}^n √(L_i))/√(nμ))*log((γε_0)/ε)),其中γ:=1+(Σ_{i=1}^n L_i)/(nμ)。 最後,我們設計問題進行實驗驗證。結果顯示提出的OGDA變體具加速收斂效果。 | zh_TW |
| dc.description.abstract | In this research, we develop and analyze two variants of the Optimistic Gradient Descent Ascent (OGDA) algorithm through the lens of relative Lipschitzness, a structural condition introduced by [10]. We first consider minimization problems that are strongly convex with Lipschitz-continuous gradients, and reformulate them as structured convex-concave saddle-point problems using Fenchel duality. Within this formulation, we show that OGDA achieves the optimal iteration complexity of O(√(L/μ)log(1/ε)).
Inspired by the finite-sum framework studied in [16], the second algorithm targets the finite-sum setting, where the objective function is expressed as an average of n convex component functions with Lipschitz continuous gradients. We develop a new algorithm based on OGDA, mirroring the structure used in the single-function setting. Our analysis demonstrates that in the finite-sum case, the proposed method achieves a convergence rate of O((n+(Σ_{i=1}^n √(L_i))/√(nμ))*log((γε_0)/ε)), where γ:=1+(Σ_{i=1}^n L_i)/(nμ). Finally, we validate the effectiveness of our approach through numerical experiments. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-20T16:22:22Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-20T16:22:22Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Contents
Acknowledgements .......... i
摘要 .......... ii
Abstract .......... iii
Contents .......... v
List of Figures .......... vii
Chapter 1 Introduction .......... 1
Chapter 2 Literature Review .......... 6
2.1 First-Order Methods for Optimization Problems .......... 6
2.2 Methods for Variational Inequalities Problems .......... 11
2.3 Finite-Sum Optimization .......... 15
Chapter 3 OGDA for Solving Optimization through a Fenchel Dual Reformulation .......... 18
3.1 Preliminaries .......... 19
3.1.1 Problem Setup .......... 19
3.1.2 Algorithm .......... 21
3.2 Convergence Analysis .......... 23
Chapter 4 OGDA for Solving Finite-Sum Optimization through a Fenchel Dual Reformulation .......... 31
4.1 Preliminaries .......... 32
4.1.1 Problem Setup .......... 32
4.1.2 Algorithm .......... 34
4.2 Convergence Analysis .......... 37
Chapter 5 Numerical Experiments .......... 52
5.1 OGDA via Fenchel Dual Reformulation .......... 53
5.2 Finite-Sum Setting: Stochastic OGDA via Fenchel Dual Reformulation .......... 58
Chapter 6 Conclusion .......... 61
References .......... 63 | - |
| dc.language.iso | en | - |
| dc.subject | 凸最佳化 | zh_TW |
| dc.subject | Fenchel轉換 | zh_TW |
| dc.subject | 有限總和最佳化 | zh_TW |
| dc.subject | 隨機最佳化 | zh_TW |
| dc.subject | 加速算法 | zh_TW |
| dc.subject | Fenchel transformation | en |
| dc.subject | acceleration | en |
| dc.subject | convex optimization | en |
| dc.subject | stochastic optimization | en |
| dc.subject | finite-sum optimization | en |
| dc.title | 樂觀梯度下降上升法於相對利普希茨條件下求解最佳化問題之研究 | zh_TW |
| dc.title | A Research on Relative Lipschitzness in Optimistic Gradient Descent Ascent Method for Solving Optimization Problems | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 李彥寰;陳柏安;黃奎隆 | zh_TW |
| dc.contributor.oralexamcommittee | Yen-Huan Li;Po-An Chen;Kuei-Lung Huang | en |
| dc.subject.keyword | 凸最佳化,Fenchel轉換,有限總和最佳化,隨機最佳化,加速算法, | zh_TW |
| dc.subject.keyword | convex optimization, Fenchel transformation, finite-sum optimization, stochastic optimization, acceleration | en |
| dc.relation.page | 68 | - |
| dc.identifier.doi | 10.6342/NTU202504117 | - |
| dc.rights.note | Authorization granted (worldwide open access) | - |
| dc.date.accepted | 2025-08-14 | - |
| dc.contributor.author-college | 工學院 | - |
| dc.contributor.author-dept | 工業工程學研究所 | - |
| dc.date.embargo-lift | 2025-08-21 | - |
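
As a reading aid for the abstract above, the following is a minimal sketch of the Fenchel-dual saddle-point reformulation and of the standard optimistic gradient (OGDA) update it refers to. The notation ($z_k$, $F$, the step size $\eta$) is illustrative and not taken from the thesis; the thesis's actual step-size choices and its stochastic finite-sum variant are not reproduced here.

```latex
% Minimal sketch (illustrative notation, not the thesis's exact scheme):
% an L-smooth, mu-strongly convex objective f is rewritten as a
% convex-concave saddle-point problem via its Fenchel conjugate, and the
% standard optimistic gradient (OGDA) update is applied to the joint iterate.
\begin{align*}
  f^{*}(y) &:= \sup_{x}\bigl\{\langle x, y\rangle - f(x)\bigr\}, \\
  \min_{x} f(x)
           &= \min_{x}\max_{y}\;\bigl[\langle x, y\rangle - f^{*}(y)\bigr]
           =: \min_{x}\max_{y}\; \mathcal{L}(x, y).
\end{align*}
With the joint iterate $z_k := (x_k, y_k)$ and the saddle operator
\begin{align*}
  F(z) := \bigl(\nabla_x \mathcal{L}(x, y),\; -\nabla_y \mathcal{L}(x, y)\bigr)
        = \bigl(y,\; \nabla f^{*}(y) - x\bigr),
\end{align*}
the optimistic gradient (OGDA) update with step size $\eta > 0$ reads
\begin{align*}
  z_{k+1} := z_k - \eta\,\bigl(2F(z_k) - F(z_{k-1})\bigr).
\end{align*}
```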
| Appears in Collections: | 工業工程學研究所 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf | 895.64 kB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
