NTU Theses and Dissertations Repository

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74197
Full metadata record
(Each entry below is shown as: DC field [language]: value)
dc.contributor.advisor: 陳宏銘 (Homer H. Chen)
dc.contributor.author [en]: Shih-Chun Chiu
dc.contributor.author [zh_TW]: 邱世鈞
dc.date.accessioned: 2021-06-17T08:23:55Z
dc.date.available: 2019-08-18
dc.date.copyright: 2019-08-18
dc.date.issued: 2019
dc.date.submitted: 2019-08-13
dc.identifier.citation:
[1] O. Chapelle, B. Scholkopf, and A. Zien, Semi-Supervised Learning. Cambridge, U.K.: MIT Press, 2006.
[2] M. Sajjadi, M. Javanmardi, and T. Tasdizen, “Regularization with stochastic transformations and perturbations for deep semi-supervised learning,” in Advances in Neural Information Processing Systems, pp. 1163–1171, 2016.
[3] A. Rasmus, M. Berglund, M. Honkala, H. Valpola, and T. Raiko, “Semi-supervised learning with ladder networks,” in Advances in Neural Information Processing Systems, pp. 3546–3554, 2015.
[4] P. Bachman, O. Alsharif, and D. Precup, “Learning with pseudo-ensembles,” in Advances in Neural Information Processing Systems, pp. 3365–3373, 2014.
[5] S. Laine and T. Aila, “Temporal ensembling for semi-supervised learning,” in Fifth International Conference on Learning Representations, 2017.
[6] A. Tarvainen and H. Valpola, “Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results,” in Advances in Neural Information Processing Systems, pp. 1195–1204, 2017.
[7] T. Miyato, S. Maeda, S. Ishii, and M. Koyama, “Virtual adversarial training: A regularization method for supervised and semi-supervised learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, pp. 1979–1993, 2018.
[8] D.-H. Lee, “Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks,” in ICML Workshop on Challenges in Representation Learning, 2013.
[9] Y. Grandvalet and Y. Bengio, “Semi-supervised learning by entropy minimization,” in Advances in Neural Information Processing Systems, pp. 529–536, 2005.
[10] A. Oliver, A. Odena, C. Raffel, E. D. Cubuk, and I. J. Goodfellow, “Realistic evaluation of deep semi-supervised learning algorithms,” in Advances in Neural Information Processing Systems, pp. 3235–3246, 2018.
[11] M. Sajjadi, M. Javanmardi, and T. Tasdizen, “Mutual exclusivity loss for semi-supervised deep learning,” in IEEE International Conference on Image Processing, pp. 1908–1912, 2016.
[12] C. Dugas, Y. Bengio, F. Belisle, C. Nadeau, and R. Garcia, “Incorporating second-order functional knowledge for better option pricing,” in Advances in Neural Information Processing Systems, pp. 472–478, 2001.
[13] A. Hermans, L. Beyer, and B. Leibe, “In defense of the triplet loss for person re-identification,” arXiv preprint arXiv:1703.07737, 2017.
[14] F. Schroff, D. Kalenichenko, and J. Philbin, “Facenet: A unified embedding for face recognition and clustering,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823, 2015.
[15] A. Krizhevsky, “Learning multiple layers of features from tiny images,” Technical report, University of Toronto, 2009.
[16] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng, “Reading digits in natural images with unsupervised feature learning,” in NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
[17] S. Zagoruyko and N. Komodakis, “Wide residual networks,” in Proceedings of the British Machine Vision Conference, 2016.
[18] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in International Conference on Machine Learning, 2015.
[19] A. L. Maas, A. Y. Hannun, and A. Y. Ng, “Rectifier nonlinearities improve neural network acoustic models,” in International Conference on Machine Learning, 2013.
[20] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Third International Conference on Learning Representations, 2015.
[21] M. F. Balcan, A. Blum, P. P. Choi, J. Lafferty, B. Pantano, M. R. Rwebangira, and X. Zhu, “Person identification in webcam images: An application of semi-supervised learning,” in Proceedings of International Conference on Machine Learning, Workshop on Learning from Partially Classified Training Data, pp. 1–9, 2005.
[22] X. Liu, M. Song, D. Tao, X. Zhou, C. Chen, and J. Bu, “Semi-supervised coupled dictionary learning for person re-identification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3550–3557, 2014.
[23] M. Bauml, M. Tapaswi, and R. Stiefelhagen, “Semi-supervised learning with constraints for person identification in multimedia data,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3602–3609, 2013.
[24] R. Filipovych and C. Davatzikos, “Semi-supervised pattern classification of medical images: Application to mild cognitive impairment (MCI),” NeuroImage, vol. 55, pp. 1109–1119, 2011.
[25] X. You, Q. Peng, Y. Yuan, Y. M. Cheung, and J. Lei, “Segmentation of retinal blood vessels using the radial projection and semi-supervised approach,” Pattern Recognition, vol. 44, pp. 2314–2324, 2011.
[26] S. Kullback and R. A. Leibler, “On information and sufficiency,” The Annals of Mathematical Statistics, vol. 22, pp. 79–86, 1951.
[27] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
[28] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, “Improving neural networks by preventing co-adaptation of feature detectors,” arXiv preprint arXiv:1207.0580, 2012.
[29] S. Wager, S. Wang, and P. S. Liang, “Dropout training as adaptive regularization,” in Advances in Neural Information Processing Systems, pp. 351–359, 2013.
[30] J. Ba and R. Caruana, “Do deep nets really need to be deep?” in Advances in Neural Information Processing Systems, pp. 2654–2662, 2014.
[31] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015.
[32] E. Hoffer and N. Ailon, “Deep metric learning using triplet network,” in International Workshop on Similarity-Based Pattern Recognition, pp. 84–92, 2015.
[33] S. Ben-David, J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, and J. W. Vaughan, “A theory of learning from different domains,” Machine Learning, vol. 79, pp. 151–175, 2010.
[34] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, and V. Lempitsky, “Domain-adversarial training of neural networks,” The Journal of Machine Learning Research, vol. 17, pp. 2096–2030, 2016.
[35] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” in Third International Conference on Learning Representations, 2015.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74197
dc.description.abstract [zh_TW]: 在半監督式學習(Semi-supervised learning)中,一致性正規化(Consistency regularization)是一個有效且常見的方法。它的目的在於,讓一個機器學習模型,其輸出不會因為對模型進行微小擾動而大受影響。一致性正規化往往透過新增一項損失函數來實現,該項損失函數會試圖最小化兩個模型輸出之間的距離,其中一個輸出是模型未經過擾動的輸出,另一個輸出是模型經過擾動後的輸出。然而,在過去的損失函數中,並未考量到這兩個模型輸出的信心程度。在這篇論文裡,我們提出一個新的損失函數稱作四元損失函數(Quadruplet loss),此損失函數能同時考慮模型的一致性以及模型輸出的質量。具體而言,我們在損失函數中透過計算模型輸出的熵(Entropy),內建了一個考量兩個模型輸出質量的機制,該機制會動態調整損失函數使得其鼓勵模型做出低熵的輸出。因此,模型的參數可以被較可靠地更新。我們將此新的四元損失函數實作在Π-model和virtual adversarial training這兩個最先進的半監督式學習方法上,並使用兩個基準資料集CIFAR-10和SVHN作測試。結果顯示出四元損失函數確實在這兩個資料集上增進了這兩個半監督式學習的成效。除此之外,模型訓練的穩定度也有所提升,在驗證資料(Validation data)缺乏以致難以做模型選擇(Model selection)時,這項性質便相當重要。我們也發現使用四元損失函數能讓這兩個半監督式學習方法在訓練資料只有極少量有標籤的情況下,減緩成效下降的程度。
dc.description.abstract [en]: Consistency regularization is often employed to enforce that a small perturbation of a semi-supervised learning model does not seriously affect the output of the model. This is achieved by introducing an additional loss (or cost) function to minimize the distance between the outputs of the model before and after the perturbation. However, the confidence level of the two outputs is not considered for the minimization. In this paper, we propose a new loss function called quadruplet loss to take both model consistency and output quality into consideration at the same time. Specifically, an entropy measurement of the output quality is built into the proposed loss function, and an adjustment is made dynamically to encourage low-entropy outputs in each training iteration. As a result, the model parameters are reliably optimized. We test the quadruplet loss on two state-of-the-art semi-supervised learning methods, the Π-model and the virtual adversarial training, using CIFAR-10 and SVHN as benchmark datasets. The results show that the quadruplet loss indeed improves the performance of these two methods and the stability of model training, which is critical when model selection is infeasible because only a limited amount of validation data are available. Moreover, the quadruplet loss prevents models from dramatic performance degradation when the amount of labeled data is small.
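The record does not reproduce the loss formulation itself, but the abstract above describes its ingredients: a consistency distance between the clean and perturbed outputs, an entropy measurement that favors confident (low-entropy) predictions, and a softplus function used in shaping the loss (see Section 5.3 in the table of contents below). The following PyTorch sketch combines these ingredients under stated assumptions; the function names, the softplus-shaped weight, and the penalty coefficient beta are illustrative choices, not the thesis's actual quadruplet loss.

import torch
import torch.nn.functional as F

def shannon_entropy(probs, eps=1e-8):
    # Per-sample entropy of a (batch, classes) probability matrix.
    return -(probs * (probs + eps).log()).sum(dim=1)

def confidence_aware_consistency(logits_clean, logits_perturbed, beta=0.1):
    # Illustrative sketch, not the thesis's quadruplet loss: a consistency
    # distance between clean and perturbed predictions, down-weighted when
    # the predictions are uncertain, plus an entropy penalty that encourages
    # low-entropy outputs.
    p_clean = F.softmax(logits_clean, dim=1)
    p_perturbed = F.softmax(logits_perturbed, dim=1)

    consistency = ((p_clean - p_perturbed) ** 2).sum(dim=1)

    avg_entropy = 0.5 * (shannon_entropy(p_clean) + shannon_entropy(p_perturbed))
    # Softplus keeps the weight positive and smooth; the weight is larger
    # when the average entropy is small, i.e., when both outputs are confident.
    weight = F.softplus(-avg_entropy).detach()

    return (weight * consistency).mean() + beta * avg_entropy.mean()

# Hypothetical usage in a Π-model-style training step on unlabeled data:
#   loss = supervised_loss + alpha * confidence_aware_consistency(
#       model(x_unlabeled), model(augment(x_unlabeled)))

The stop-gradient (detach) on the weight prevents the model from inflating its output entropy merely to shrink the consistency term; the separate entropy penalty is what pushes the outputs toward low entropy.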
dc.description.provenance [en]: Made available in DSpace on 2021-06-17T08:23:55Z (GMT). No. of bitstreams: 1
ntu-108-R06942071-1.pdf: 1502787 bytes, checksum: a69e25ca9761e8ee5048866853f5c228 (MD5)
Previous issue date: 2019
dc.description.tableofcontents:
Oral examination committee certification ... i
Acknowledgements ... ii
Abstract (Chinese) ... iii
ABSTRACT ... iv
Table of contents ... v
List of figures ... vi
List of tables ... vii
Chapter 1 Introduction ... 1
Chapter 2 Related Work ... 3
  2.1 Consistency-Regularization-Based SSL ... 3
  2.2 Entropy-Minimization-Based SSL ... 4
Chapter 3 Conventional Consistency Loss ... 6
Chapter 4 Proposed Quadruplet Loss ... 7
Chapter 5 Gradient Analysis ... 9
  5.1 Gradients of Conventional Consistency Loss ... 9
  5.2 Gradients of Proposed Quadruplet Loss ... 11
  5.3 Effect of Softplus Function ... 13
Chapter 6 Experiments ... 15
  6.1 Efficacy of Quadruplet Loss on Benchmark Datasets ... 18
  6.2 Influence of the Number of Labeled Samples ... 21
  6.3 Effectiveness of α and Softplus Function ... 23
Chapter 7 Conclusions ... 25
REFERENCES ... 27
dc.language.iso: en
dc.subject [zh_TW]: 半監督式學習
dc.subject [zh_TW]: 損失函數
dc.subject [zh_TW]: 一致性正規化
dc.subject [en]: Consistency regularization
dc.subject [en]: loss function
dc.subject [en]: semi-supervised learning
dc.title [zh_TW]: 半監督式學習中利用四元損失函數進行一致性正規化
dc.title [en]: Quadruplet Loss for Consistency Regularization in Semi-Supervised Learning
dc.type: Thesis
dc.date.schoolyear: 107-2
dc.description.degree: 碩士
dc.contributor.oralexamcommittee: 楊奕軒 (Yi-Hsuan Yang), 蔡銘峰 (Ming-Feng Tsai), 施光祖 (Kuang-Tsu Shih)
dc.subject.keyword [zh_TW]: 一致性正規化,損失函數,半監督式學習
dc.subject.keyword [en]: Consistency regularization, loss function, semi-supervised learning
dc.relation.page: 30
dc.identifier.doi: 10.6342/NTU201903057
dc.rights.note: 有償授權
dc.date.accepted: 2019-08-13
dc.contributor.author-college [zh_TW]: 電機資訊學院
dc.contributor.author-dept [zh_TW]: 電信工程學研究所
Appears in Collections: 電信工程學研究所

Files in This Item:
  File: ntu-108-1.pdf (restricted; not authorized for public access)
  Size: 1.47 MB
  Format: Adobe PDF


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.
