NTU Theses and Dissertations Repository (DSpace)
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89141

Full metadata record (DC field: value [language])
dc.contributor.advisor: 丁肇隆 [zh_TW]
dc.contributor.advisor: Chao-Lung Ting [en]
dc.contributor.author: 周宥辰 [zh_TW]
dc.contributor.author: Yu-Chen Chou [en]
dc.date.accessioned: 2023-08-16T17:18:06Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-08-16
dc.date.issued: 2023
dc.date.submitted: 2023-08-10
dc.identifier.citation:
1. Krizhevsky, A., I. Sutskever, and G.E. Hinton, ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 2012. 25.
2. Simonyan, K. and A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
3. Szegedy, C., et al., Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
4. Hubel, D.H. and T.N. Wiesel, Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology, 1962. 160.
5. Fukushima, K., Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 1980. 36: p. 193-202.
6. LeCun, Y., L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998. 86(11): p. 2278-2324.
7. He, K., X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
8. Zoph, B. and Q.V. Le, Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578, 2016.
9. Zoph, B., V. Vasudevan, J. Shlens, and Q.V. Le, Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
10. Liu, H., et al., Hierarchical representations for efficient architecture search. arXiv preprint arXiv:1711.00436, 2017.
11. Andrychowicz, M., et al., Learning to learn by gradient descent by gradient descent. Advances in Neural Information Processing Systems, 2016. 29.
12. Wang, J., J. Wu, H. Bai, and J. Cheng, M-NAS: Meta neural architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
13. Liu, H., K. Simonyan, and Y. Yang, DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055, 2018.
14. Chen, X., L. Xie, J. Wu, and Q. Tian, Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.
15. Sutskever, I., J. Martens, G. Dahl, and G. Hinton, On the importance of initialization and momentum in deep learning. In International Conference on Machine Learning, 2013. PMLR.
16. Kingma, D.P. and J. Ba, Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
17. He, K., X. Zhang, S. Ren, and J. Sun, Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, 2015.
18. 曾子芸, A study on optimal network architectures for classification under limited conditions (有限條件下的最佳網路架構用於分類之研究). Department of Engineering Science and Ocean Engineering, National Taiwan University, 2021. p. 1-50.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89141
dc.description.abstract: In recent years, deep learning has achieved excellent results in many fields, such as image processing and natural language processing. However, designing an outstanding neural network architecture demands deep domain knowledge and repeated experimentation, which deters many researchers from machine learning. To simplify this process, neural architecture search (NAS) has become popular. We first use the AlexNet architecture to study how different weight-initialization methods affect network performance, which gives us a deeper basis for defining the search space: uniform-distribution initialization with appropriate upper and lower bounds yields excellent results. We then examine how the convolutional kernel size affects model performance; the 3x3 kernel outperforms larger kernels, so we adopt the 3x3 kernel as the basis for the next stage of experiments. Finally, we define a wide model and use NAS to find its optimal subnetwork, which is retrained as the final architecture; the optimal subnetworks consistently achieve higher accuracy with fewer parameters. In summary, tests on multiple datasets and multiple random seeds show that the proposed algorithm can automatically find, for a given dataset, an optimal architecture that is lighter and more accurate than the baseline AlexNet. [zh_TW]
dc.description.abstract: In recent years, deep learning has achieved outstanding results in many fields, such as computer vision and natural language processing. However, designing an exceptional neural network architecture requires extensive domain knowledge and repeated experimentation, which keeps many researchers away from machine learning. To simplify this process, neural architecture search (NAS) has recently become popular. We first examine how the initialization of AlexNet's weights and the kernel size of the convolutional layers affect performance, which gives us further insight for defining our search space. Uniform-distribution initialization with proper upper and lower bounds produces exceptional results. We then investigate the importance of kernel size; notably, adopting a 3×3 kernel as the sole operation within a cell proves highly effective. Last but not least, our proposed algorithm automatically searches for an optimal architecture that is lighter and more accurate than the baseline AlexNet on specific datasets drawn from ImageNet. [en]
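The abstract compresses three technical steps: bounded uniform weight initialization, preferring a 3x3 kernel as the cell operation, and a gradient-based (DARTS-style) search for the best subnetwork of a wide model. The following minimal PyTorch sketch illustrates the first and third ideas; the bound value, module names, and channel sizes are illustrative assumptions, not the thesis code.

import torch
import torch.nn as nn
import torch.nn.functional as F

def init_uniform_(m, bound=0.05):
    # Bounded uniform initialization: weights drawn from U(-bound, bound).
    # The thesis tunes the upper/lower bounds; 0.05 is only a placeholder.
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.uniform_(m.weight, -bound, bound)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

class MixedOp(nn.Module):
    # DARTS-style mixed operation: candidate ops are blended by a softmax
    # over learnable architecture weights alpha, so the choice of operation
    # can be optimized by ordinary gradient descent alongside the weights.
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),  # 3x3 conv
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),  # 5x5 conv
            nn.Identity(),                                            # skip
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

cell = MixedOp(channels=16)
cell.apply(init_uniform_)              # bounded uniform initialization
y = cell(torch.randn(1, 16, 32, 32))   # one forward pass through the cell
chosen = int(cell.alpha.argmax())      # after search: keep the strongest op

After the search converges, keeping only the highest-weight operation in each cell discretizes the wide model into the lighter subnetwork that is then retrained from scratch, matching the abstract's description of finding an optimal subnetwork of a wide model.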
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-16T17:18:06Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2023-08-16T17:18:06Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Acknowledgements i
Abstract (in Chinese) ii
Abstract iii
List of Figures vi
List of Tables ix
1 Introduction 1
1.1 Motivation 2
1.2 Thesis overview 2
2 Related work 4
2.1 Convolutional neural network 4
2.2 Neural architecture search 6
3 Proposed methods 8
3.1 Environmental settings 8
3.1.1 Hardware and software 9
3.1.2 Datasets 9
3.1.3 Optimizers 12
3.1.4 Batch size and the number of epochs 12
3.1.5 Data preprocessing 12
3.1.6 Evaluation 13
3.2 Initialization methods 14
3.3 Operation choice 14
3.3.1 General architecture 15
3.3.2 Subgraph connection 16
3.3.3 Training protocol 17
3.3.4 Pick the model by its highest validation accuracy 19
3.4 Proposed NAS algorithm 20
3.4.1 NAS methods 21
3.4.2 Training algorithm 24
4 Evaluation 26
4.1 Different initialization methods 26
4.2 Operation choice 30
4.2.1 Architecture “cell0_4.cell4_5” 30
4.2.2 Architecture “cell0_1.cell1_2.cell2_5” 34
4.2.3 Pick the model by its maximum validation accuracy 36
4.3 NAS evaluation 43
5 Conclusion 52
References 53
dc.language.iso: en
dc.subject: Deep learning (深度學習) [zh_TW]
dc.subject: Automated machine learning (自動化機器學習) [zh_TW]
dc.subject: Convolutional neural network (卷積神經網路) [zh_TW]
dc.subject: Neural architecture search (自動網路架構搜索) [zh_TW]
dc.subject: Differentiable architecture search (可微分網路架構搜尋) [zh_TW]
dc.subject: Image processing (影像處理) [zh_TW]
dc.subject: Image processing [en]
dc.subject: Differentiable architecture search [en]
dc.subject: Convolutional Neural Network [en]
dc.subject: autoML [en]
dc.subject: Deep learning [en]
dc.subject: Neural architecture search [en]
dc.title: A study on automatic search for optimal networks under limited conditions, applied to classification networks (有限條件下最佳網路自動搜索應用於分類網路之研究) [zh_TW]
dc.title: The gradient-based optimal neural architecture search for classification under constraints [en]
dc.type: Thesis
dc.date.schoolyear: 111-2
dc.description.degree: Master (碩士)
dc.contributor.oralexamcommittee: 張恆華; 宋家驥; 謝傳璋 [zh_TW]
dc.contributor.oralexamcommittee: Herng-Hua Chang; Chia-Chi Sung; Chuan-Zhang Xie [en]
dc.subject.keyword: Deep learning, automated machine learning, neural architecture search, differentiable architecture search, convolutional neural network, image processing (深度學習, 自動化機器學習, 自動網路架構搜索, 可微分網路架構搜尋, 卷積神經網路, 影像處理) [zh_TW]
dc.subject.keyword: Deep learning, autoML, Neural architecture search, Differentiable architecture search, Convolutional Neural Network, Image processing [en]
dc.relation.page: 54
dc.identifier.doi: 10.6342/NTU202303238
dc.rights.note: Access authorized (restricted to campus) (同意授權(限校園內公開))
dc.date.accepted: 2023-08-11
dc.contributor.author-college: College of Engineering (工學院)
dc.contributor.author-dept: Department of Engineering Science and Ocean Engineering (工程科學及海洋工程學系)
dc.date.embargo-lift: 2028-08-07
Appears in collections: Department of Engineering Science and Ocean Engineering (工程科學及海洋工程學系)

Files in this item:
ntu-111-2.pdf: 3.62 MB, Adobe PDF (restricted; not authorized for public access)