Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89141
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 丁肇隆 | zh_TW |
dc.contributor.advisor | Chao-Lung Ting | en |
dc.contributor.author | 周宥辰 | zh_TW |
dc.contributor.author | Yu-Chen Chou | en |
dc.date.accessioned | 2023-08-16T17:18:06Z | - |
dc.date.available | 2023-11-09 | - |
dc.date.copyright | 2023-08-16 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-08-10 | - |
dc.identifier.citation | 1. Krizhevsky, A., Sutskever, I., and Hinton, G.E., ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 2012. 25.
2. Simonyan, K. and Zisserman, A., Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
3. Szegedy, C., et al., Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
4. Hubel, D.H. and Wiesel, T.N., Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology, 1962. 160.
5. Fukushima, K., Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 1980. 36: p. 193-202.
6. LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P., Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998. 86(11): p. 2278-2324.
7. He, K., Zhang, X., Ren, S., and Sun, J., Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
8. Zoph, B. and Le, Q.V., Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578, 2016.
9. Zoph, B., Vasudevan, V., Shlens, J., and Le, Q.V., Learning transferable architectures for scalable image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
10. Liu, H., et al., Hierarchical representations for efficient architecture search. arXiv preprint arXiv:1711.00436, 2017.
11. Andrychowicz, M., et al., Learning to learn by gradient descent by gradient descent. Advances in Neural Information Processing Systems, 2016. 29.
12. Wang, J., Wu, J., Bai, H., and Cheng, J., M-NAS: Meta neural architecture search. Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
13. Liu, H., Simonyan, K., and Yang, Y., DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055, 2018.
14. Chen, X., Xie, L., Wu, J., and Tian, Q., Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.
15. Sutskever, I., Martens, J., Dahl, G., and Hinton, G., On the importance of initialization and momentum in deep learning. International Conference on Machine Learning, 2013. PMLR.
16. Kingma, D.P. and Ba, J., Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
17. He, K., Zhang, X., Ren, S., and Sun, J., Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. Proceedings of the IEEE International Conference on Computer Vision, 2015.
18. 曾子芸, A study of optimal network architectures for classification under limited conditions, in Institute of Engineering Science and Ocean Engineering. 2021, National Taiwan University. p. 1-50. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89141 | - |
dc.description.abstract | In recent years, deep learning has achieved excellent results in many fields, such as image processing and natural language processing. However, designing a strong neural network architecture demands extensive domain knowledge and repeated experimentation, which deters many researchers from machine learning. To simplify this process, neural architecture search (NAS) has become popular. We first use the AlexNet architecture to study how different weight-initialization methods affect network performance, which informs the definition of our search space: uniform-distribution initialization with appropriate upper and lower bounds produces excellent results. We then examine the effect of convolutional kernel size on model performance; the 3x3 kernel outperforms larger kernels, so we adopt the 3x3 kernel as the basis for the next stage of experiments. Finally, we define a wide model and use NAS to find its optimal subnetwork, which is then retrained; the optimal subnetworks consistently achieve higher accuracy with fewer parameters. In summary, across multiple datasets and multiple random seeds, we show that the proposed algorithm can automatically discover, for a given dataset, an optimal architecture that is both lighter and more accurate than the baseline AlexNet. | zh_TW |
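The abstract describes defining a wide model and using NAS to extract its optimal subnetwork; the keywords indicate a differentiable (DARTS-style) search. As a minimal numerical sketch of that idea, the cell output below is a softmax-weighted mixture of candidate operations, and the operation with the largest learned architecture weight is the one kept in the discrete subnetwork. The toy operations and parameter values are illustrative assumptions, not the thesis's actual search space:

```python
import numpy as np

def mixed_op(x, alphas, ops):
    """DARTS-style mixed operation: the output is a softmax-weighted sum of
    candidate operations; after search, the op with the largest architecture
    weight is kept, yielding the discrete subnetwork."""
    w = np.exp(alphas - alphas.max())
    w /= w.sum()                           # softmax over architecture parameters
    return sum(wi * op(x) for wi, op in zip(w, ops))

# Toy candidates standing in for conv-3x3, zero, skip-connection, etc.
ops = [lambda x: x, lambda x: 0.0 * x, lambda x: 2.0 * x]
alphas = np.array([0.1, -1.0, 2.0])        # learned architecture parameters
y = mixed_op(np.ones(4), alphas, ops)      # relaxed (continuous) cell output
kept = int(np.argmax(alphas))              # op retained after discretization
```

During search, `alphas` would be updated by gradient descent alongside the network weights; only the discretization step at the end produces the lightweight subnetwork the abstract refers to.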
dc.description.abstract | In recent years, deep learning has achieved outstanding results in many fields, such as computer vision and natural language processing. However, designing an exceptional neural network architecture requires deep domain knowledge and repeated experiments, which keeps many researchers away from machine learning. To simplify this process, neural architecture search (NAS) has recently become popular. We first examine how initialization methods for the weights of AlexNet and the kernel size of its convolutional layers affect performance, which gives us further insight for defining our search space. Uniform-distribution initialization with properly chosen upper and lower bounds produces exceptional results. We then investigate the importance of kernel size; notably, adopting the 3×3 kernel as the sole operation within a cell proves highly effective. Last but not least, our proposed algorithm can automatically search for an optimal architecture that is more lightweight and accurate on specific datasets from ImageNet than the baseline AlexNet architecture. | en |
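The abstracts report that uniform-distribution initialization with well-chosen upper and lower bounds performed best. The exact bounds used in the thesis are not stated in this record; the sketch below assumes fan-in-scaled (Kaiming-uniform-style) bounds as one common way to pick them:

```python
import numpy as np

def uniform_init(shape, rng=None):
    """Initialize a conv kernel from a uniform distribution whose bounds
    scale with fan-in. The bound sqrt(6 / fan_in) is an assumed
    Kaiming-uniform-style choice, not the thesis's reported bounds."""
    rng = rng or np.random.default_rng(0)
    fan_in = int(np.prod(shape[1:]))       # in_channels * kH * kW
    bound = np.sqrt(6.0 / fan_in)
    return rng.uniform(-bound, bound, size=shape)

# A 3x3 kernel, the size the abstract reports as best-performing.
w = uniform_init((64, 3, 3, 3))            # 64 filters, 3 input channels
```

Scaling the bound by fan-in keeps the variance of layer outputs roughly constant with depth, which is the usual motivation for bounds of this form.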
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-16T17:18:06Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-08-16T17:18:06Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Acknowledgements i
Chinese Abstract ii
Abstract iii
List of Figures vi
List of Tables ix
1 Introduction 1
1.1 Motivation 2
1.2 Thesis overview 2
2 Related work 4
2.1 Convolutional neural network 4
2.2 Neural architecture search 6
3 Proposed methods 8
3.1 Environmental settings 8
3.1.1 Hardware and software 9
3.1.2 Datasets 9
3.1.3 Optimizers 12
3.1.4 Batch size and the number of epochs 12
3.1.5 Data preprocessing 12
3.1.6 Evaluation 13
3.2 Initialization methods 14
3.3 Operation choice 14
3.3.1 General architecture 15
3.3.2 Subgraph connection 16
3.3.3 Training protocol 17
3.3.4 Pick the model by its highest validation accuracy 19
3.4 Proposed NAS algorithm 20
3.4.1 NAS methods 21
3.4.2 Training algorithm 24
4 Evaluation 26
4.1 Different initialization methods 26
4.2 Operation choice 30
4.2.1 Architecture "cell0_4.cell4_5" 30
4.2.2 Architecture "cell0_1.cell1_2.cell2_5" 34
4.2.3 Pick the model by its maximum validation accuracy 36
4.3 NAS evaluation 43
5 Conclusion 52
References 53 | - |
dc.language.iso | en | - |
dc.title | A study of automatic optimal network search under limited conditions for classification networks | zh_TW |
dc.title | The gradient-based optimal neural architecture search for classification under constraints | en |
dc.type | Thesis | - |
dc.date.schoolyear | 111-2 | - |
dc.description.degree | Master's | - |
dc.contributor.oralexamcommittee | 張恆華;宋家驥;謝傳璋 | zh_TW |
dc.contributor.oralexamcommittee | Herng-Hua Chang;Chia-Chi Sung;Chuan-Zhang Xie | en |
dc.subject.keyword | Deep learning, AutoML, Neural architecture search, Differentiable architecture search, Convolutional neural network, Image processing | zh_TW |
dc.subject.keyword | Deep learning, AutoML, Neural architecture search, Differentiable architecture search, Convolutional Neural Network, Image processing | en |
dc.relation.page | 54 | - |
dc.identifier.doi | 10.6342/NTU202303238 | - |
dc.rights.note | Authorized (campus access only) | - |
dc.date.accepted | 2023-08-11 | - |
dc.contributor.author-college | College of Engineering | - |
dc.contributor.author-dept | Department of Engineering Science and Ocean Engineering | - |
Appears in Collections: | Department of Engineering Science and Ocean Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf (currently not authorized for public access) | 3.62 MB | Adobe PDF | View/Open |
Items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.