Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70449
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 廖世偉 | |
dc.contributor.author | Chang-Sin Dai | en |
dc.contributor.author | 戴長昕 | zh_TW |
dc.date.accessioned | 2021-06-17T04:28:26Z | - |
dc.date.available | 2028-08-10 | |
dc.date.copyright | 2018-08-18 | |
dc.date.issued | 2018 | |
dc.date.submitted | 2018-08-13 | |
dc.identifier.citation | [1] Marcin Andrychowicz, Misha Denil, Sergio Gomez Colmenarejo, Matthew W. Hoffman, David Pfau, Tom Schaul, and Nando de Freitas. Learning to learn by gradient descent by gradient descent. CoRR, abs/1606.04474, 2016.
[2] Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, and Philip H. S. Torr. Fully-convolutional siamese networks for object tracking. CoRR, abs/1606.09549, 2016.
[3] Léon Bottou. Large-scale machine learning with stochastic gradient descent. In Yves Lechevallier and Gilbert Saporta, editors, Proceedings of COMPSTAT'2010, pages 177–186, Heidelberg, 2010. Physica-Verlag HD.
[4] S. Chakraborty, V. Balasubramanian, Q. Sun, S. Panchanathan, and J. Ye. Active batch selection via convex relaxations with guaranteed solution bounds. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(10):1945–1958, Oct 2015.
[5] Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, and Yichen Wei. Deformable convolutional networks. CoRR, abs/1703.06211, 2017.
[6] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
[7] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. The MIT Press, 2016.
[8] Marti A. Hearst. Support vector machines. IEEE Intelligent Systems, 13(4):18–28, July 1998.
[9] Geoffrey E. Hinton, Oriol Vinyals, and Jeffrey Dean. Distilling the knowledge in a neural network. CoRR, abs/1503.02531, 2015.
[10] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, November 1997.
[11] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. CoRR, abs/1502.03167, 2015.
[12] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.
[13] Günter Klambauer, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter. Self-normalizing neural networks. CoRR, abs/1706.02515, 2017.
[14] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012.
[15] Brenden M. Lake, Ruslan Salakhutdinov, and Joshua B. Tenenbaum. Human-level concept learning through probabilistic program induction. Science, 350(6266):1332–1338, 2015.
[16] Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning, ICML'10, pages 807–814, USA, 2010. Omnipress.
[17] Sara Sabour, Nicholas Frosst, and Geoffrey E. Hinton. Dynamic routing between capsules. CoRR, abs/1710.09829, 2017.
[18] Burr Settles. Active Learning. Morgan & Claypool Publishers, 2012.
[19] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. ArXiv e-prints, September 2014.
[20] J. Snell, K. Swersky, and R. S. Zemel. Prototypical networks for few-shot learning. ArXiv e-prints, March 2017.
[21] Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip H. S. Torr, and Timothy M. Hospedales. Learning to compare: Relation network for few-shot learning. CoRR, abs/1711.06025, 2017.
[22] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott E. Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. CoRR, abs/1409.4842, 2014.
[23] Oriol Vinyals, Charles Blundell, Timothy P. Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. Matching networks for one shot learning. CoRR, abs/1606.04080, 2016.
[24] Matthew D. Zeiler and Rob Fergus. Visualizing and understanding convolutional networks. CoRR, abs/1311.2901, 2013.
[25] Z. Zhou, J. Shin, L. Zhang, S. Gurudu, M. Gotway, and J. Liang. Fine-tuning convolutional neural networks for biomedical image analysis: Actively and incrementally. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4761–4772, July 2017. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70449 | - |
dc.description.abstract | In recent years, deep learning has advanced at a remarkable pace, driven not only by the growth of computing power but also by the large number of researchers entering the field, enabling deep learning to excel in areas such as machine vision, natural language processing, and speech processing. Although meta learning and few-shot learning have not yet been effectively applied in industry, their ideas and creativity are at the forefront of deep learning, and their development continues to attract attention.
In this thesis, we observe that active learning and few-shot learning share closely related ideas: both train with a small number of samples, the former focusing on sample quality and the latter on sample quantity. Building on the idea of active learning, we design a selector that picks suitable training samples for few-shot learning tasks to improve model performance. Using a relation network model that imitates the human recognition process, we combine it with active learning to reduce training cost and let the model learn higher-quality meta knowledge for handling unseen tasks. Finally, we design experiments to examine the effect of each component of the method. By analyzing the experimental results, we hope to gradually open the black box of few-shot learning in deep models. | zh_TW |
dc.description.abstract | In recent years, deep learning has developed at tremendous speed. Thanks to the increase in computing power, a large number of researchers have invested in research, making deep learning shine in many fields such as machine vision, natural language processing, and speech processing. While meta learning and few-shot learning have not yet been effectively applied in industry, their ideas and creativity are at the forefront of deep learning, and their development is still attracting attention.
In this thesis, we find that active learning and few-shot learning share very similar ideas. Based on the concept of active learning, we design a training sample selector for few-shot learning tasks to optimize model performance. We use a relation network model that simulates the human recognition process, combined with an active-learning-like method, to reduce training costs and let the model learn better meta knowledge for dealing with unknown tasks. We then design experiments to explore the effects of each part of the method. By analyzing the experimental results, we hope to open the black box of few-shot learning in deep structured models. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T04:28:26Z (GMT). No. of bitstreams: 1 ntu-107-R05944041-1.pdf: 1650348 bytes, checksum: 8dd8df0394c1c82d43eeab026b7299c4 (MD5) Previous issue date: 2018 | en |
dc.description.tableofcontents | Acknowledgments i
Abstract (Chinese) ii
Abstract iii
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Background and Related Work 3
2.1 Active Learning 3
2.2 Meta Learning 4
2.3 Transfer Learning 5
2.3.1 Inductive Transfer Learning 5
2.4 Metric Learning 6
2.5 Convolutional Neural Network 6
2.5.1 Deformable Convolutional Networks 7
2.6 Self-Normalizing Neural Network 8
2.7 Few-shot Learning 9
2.7.1 Matching Network for One Shot Learning 10
2.7.2 Relation Network for Few-Shot Learning 10
2.7.3 Prototypical Networks for Few-shot Learning 11
Chapter 3 Methodology 12
3.1 Problem Description 12
3.2 Model Description 14
3.2.1 Embedding Module 14
3.2.2 Relation Module 15
3.2.3 Selector 16
Chapter 4 Experiment 19
4.1 Experiment Environment 19
4.1.1 Hardware 19
4.1.2 Software and Tools 20
4.2 Omniglot 20
4.2.1 Training on Omniglot 20
4.2.2 Testing on Omniglot 21
4.3 MiniImageNet 21
4.3.1 Training on MiniImageNet 22
4.3.2 Testing on MiniImageNet 22
Chapter 5 Discussion 24
5.1 Relationship to Existing Models 24
5.2 Active Learning 25
5.3 Performance 25
5.3.1 Selector 26
5.3.2 SELU and Batch Normalization 27
5.4 Future Work 27
5.4.1 Knowledge Distilling 27
5.4.2 Capsule 28
Chapter 6 Conclusion 29
Bibliography 30 | |
dc.language.iso | zh-TW | |
dc.title | 基於主動學習的高效率關聯式少樣本學習 | zh_TW |
dc.title | Cost-Effective Active Learning for Relational Few-shot Learning | en |
dc.type | Thesis | |
dc.date.schoolyear | 106-2 | |
dc.description.degree | Master | |
dc.contributor.oralexamcommittee | 蘇中才,蔡芸琤,張智威 | |
dc.subject.keyword | 機器學習,深度學習,少樣本學習,元學習,主動學習, | zh_TW |
dc.subject.keyword | Machine Learning,Deep Learning,Few-shot Learning,Meta Learning,Active Learning, | en |
dc.relation.page | 33 | |
dc.identifier.doi | 10.6342/NTU201803026 | |
dc.rights.note | Paid authorization | |
dc.date.accepted | 2018-08-13 | |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Networking and Multimedia | zh_TW |
Appears in Collections: | Graduate Institute of Networking and Multimedia |
Files in This Item:
File | Size | Format |
---|---|---|
ntu-107-1.pdf (currently not authorized for public access) | 1.61 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated by their specific copyright terms.