Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100949

| Title: | Stochastic Quasi-Newton Methods for Training Deep Neural Networks (隨機擬牛頓方法用於深度神經網路的訓練) |
| Authors: | 黃子軒 Zih-Syuan Huang |
| Advisor: | 林智仁 Chih-Jen Lin |
| Keyword: | Deep Learning, Stochastic Optimization, Quasi-Newton Methods |
| Publication Year: | 2025 |
| Degree: | Master's |
| Abstract: | Quasi-Newton methods have proven effective for optimization because they exploit second-order information without explicitly computing Hessian matrices. However, adapting them to stochastic optimization, particularly in deep learning, remains challenging because gradient estimates are noisy and objectives are nonconvex. In this thesis, we propose a stochastic limited-memory BFGS (LBFGS) optimizer designed specifically for training deep neural networks. Our method introduces a novel curvature selection strategy that uses an update frequency mechanism to select the curvature pairs exhibiting the highest curvature, which addresses both the stochasticity and the nonconvexity. We also integrate momentum to further speed up convergence. Experimental results on standard convex and nonconvex image classification benchmarks show that our approach significantly outperforms an existing stochastic LBFGS method (oLBFGS) and remains competitive with widely used deep learning optimizers such as SGD with momentum (SGDM), Adam, and Shampoo. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100949 |
| DOI: | 10.6342/NTU202504473 |
| Fulltext Rights: | Authorized (open access worldwide) |
| Embargo Lift Date: | 2025-11-27 |
| Appears in Collections: | Department of Computer Science and Information Engineering (資訊工程學系) |
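
As background for the abstract's reference to limited-memory BFGS, the following is a minimal sketch of the standard (deterministic) L-BFGS two-loop recursion, which turns a gradient and a short history of curvature pairs into a search direction. It is not the thesis's optimizer: the curvature selection strategy, the update frequency mechanism, and the momentum integration are not specified in this record and are not reproduced here. The names `dot` and `lbfgs_direction` are hypothetical, and the sketch assumes every stored pair satisfies the curvature condition y·s > 0, which noisy gradients and nonconvex objectives can violate.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Pair = Tuple[Sequence[float], Sequence[float]]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad: Sequence[float], pairs: Sequence[Pair]) -> Vector:
    """Return -H * grad using the standard L-BFGS two-loop recursion.

    `pairs` holds curvature pairs (s, y), oldest first, where
    s = x_{k+1} - x_k and y = grad_{k+1} - grad_k.
    Assumes dot(y, s) > 0 for every stored pair.
    """
    q = list(grad)
    alphas = []

    # First loop: walk the history from newest to oldest.
    for s, y in reversed(pairs):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        alphas.append(alpha)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]

    # Initial inverse-Hessian guess: a scaled identity from the newest pair.
    if pairs:
        s, y = pairs[-1]
        gamma = dot(s, y) / dot(y, y)
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]

    # Second loop: walk the history from oldest to newest.
    for (s, y), alpha in zip(pairs, reversed(alphas)):
        rho = 1.0 / dot(y, s)
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]

    return [-ri for ri in r]
```

With an empty history the function falls back to the steepest-descent direction, and with a single pair it satisfies the secant condition, mapping the gradient `y` to the direction `-s`.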
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-114-1.pdf | 805.48 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
