Stochastic Quasi-Newton Methods for Training Deep Neural Networks

黃子軒; Zih-Syuan Huang

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100949

Title:	Stochastic Quasi-Newton Methods for Training Deep Neural Networks (隨機擬牛頓方法用於深度神經網路的訓練)
Author:	Zih-Syuan Huang (黃子軒)
Advisor:	Chih-Jen Lin (林智仁)
Keywords:	Deep Learning, Stochastic Optimization, Quasi-Newton Methods
Year of publication:	2025
Degree:	Master's
Abstract:	Quasi-Newton methods have proven effective for optimization because they exploit second-order information without explicitly computing Hessian matrices. However, adapting them to stochastic optimization, particularly in deep learning, remains challenging because gradient estimates are noisy and objectives are nonconvex. In this thesis, we propose a stochastic limited-memory BFGS (LBFGS) optimizer designed specifically for training deep neural networks. Our method introduces a novel curvature selection strategy that uses an update frequency mechanism to select the curvature pairs exhibiting the highest curvature, which addresses the difficulties caused by stochasticity and nonconvexity. We also integrate momentum to further speed up convergence. Experimental results show that, on standard convex and nonconvex image classification benchmarks, our approach significantly outperforms an existing stochastic LBFGS method (oLBFGS) and remains competitive with widely used deep learning optimizers such as SGD with momentum (SGDM), Adam, and Shampoo.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100949
DOI:	10.6342/NTU202504473
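Full-text authorization:	Granted (open access worldwide)
Full-text release date:	2025-11-27
Appears in collections:	Department of Computer Science and Information Engineering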

Files in this item:

File	Size	Format
ntu-114-1.pdf	805.48 kB	Adobe PDF


