NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7266
Title: 神經元消失:影響深層神經網路之表現能力,並使其難以訓練的新現象
Vanishing Nodes: The Phenomena That Affects The Representation Power and The Training Difficulty of Deep Neural Networks
Authors: Wen-Yu Chang
張文于
Advisor: 林宗男(Tsung-Nan Lin)
Keywords: Deep learning, Vanishing gradient, Learning theory, Representation power, Network architecture, Training difficulty, Orthogonal initialization, Node redundancy, Random matrices
Publication Year: 2019
Degree: Master's
Abstract: It is well known that the problem of vanishing/exploding gradients creates a challenge when training deep networks. In this paper, we show another phenomenon, called vanishing nodes, that also increases the difficulty of training deep neural networks. As the depth of a neural network increases, the network's hidden nodes show more highly correlated behavior. This correlated behavior results in great similarity between these nodes. The redundancy of hidden nodes thus increases as the network becomes deeper. We call this problem "Vanishing Nodes." This behavior of vanishing nodes can be characterized quantitatively by the network parameters, which is shown analytically to be proportional to the network depth and inversely proportional to the network width. The numerical results suggest that the degree of vanishing nodes will become more evident during back-propagation training. Finally, we show that vanishing/exploding gradients and vanishing nodes are two different challenges that increase the difficulty of training deep neural networks.
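As a rough numerical illustration of the correlation effect described in the abstract, the sketch below (not taken from the thesis; the function names, tanh activation, Gaussian initialization, and the particular widths and depths are all assumptions) pushes random inputs through a randomly initialized fully connected network and reports the mean absolute correlation between hidden nodes at several depths. Under these assumptions the reported correlation tends to grow with depth, which is the node redundancy the abstract refers to as vanishing nodes.

# Minimal sketch (illustrative only, not the thesis's code): measuring
# hidden-node correlation in a deep random network.
import numpy as np

def mean_abs_correlation(activations):
    """Mean absolute off-diagonal correlation between hidden nodes.

    activations: array of shape (num_samples, width); each column is one node.
    """
    corr = np.corrcoef(activations, rowvar=False)     # (width, width)
    width = corr.shape[0]
    off_diag = corr[~np.eye(width, dtype=bool)]       # drop the diagonal (self-correlation)
    return np.mean(np.abs(off_diag))

def forward_random_net(x, depth, width, rng):
    """Propagate inputs through `depth` random tanh layers of size `width`."""
    h = x
    for _ in range(depth):
        # Gaussian initialization with variance 1/fan_in (an assumed, common
        # default; the thesis also discusses orthogonal initialization).
        w = rng.normal(0.0, np.sqrt(1.0 / h.shape[1]), size=(h.shape[1], width))
        h = np.tanh(h @ w)
    return h

rng = np.random.default_rng(0)
x = rng.normal(size=(512, 100))          # 512 random input samples, 100 features
for depth in (2, 10, 50, 200):
    h = forward_random_net(x, depth, width=100, rng=rng)
    print(f"depth={depth:4d}  mean |corr| between nodes = {mean_abs_correlation(h):.3f}")

This is only a sketch of the qualitative trend; the thesis characterizes the degree of vanishing nodes analytically in terms of the network's depth and width, which the toy experiment above does not reproduce.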
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7266
DOI: 10.6342/NTU201901446
Fulltext Rights: Authorization granted (open access worldwide)
Embargo Lift Date: 2024-08-05
Appears in Collections: Graduate Institute of Communication Engineering

Files in This Item:
File            Size       Format
ntu-108-1.pdf   17.33 MB   Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
