Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52787
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 林軒田(Hsuan-Tien Lin) | |
dc.contributor.author | Yu-Ting Chou | en |
dc.contributor.author | 周侑廷 | zh_TW |
dc.date.accessioned | 2021-06-15T16:27:38Z | - |
dc.date.available | 2020-09-02 | |
dc.date.copyright | 2020-09-02 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-08-16 | |
dc.identifier.citation | [1] H. Bao, G. Niu, and M. Sugiyama. Classification from pairwise similarity and unlabeled data. In ICML, 2018. [2] Y. Cao and Y. Xu. Multi-complementary and unlabeled learning for arbitrary losses and models. CoRR, abs/2001.04243, 2020. [3] O. Chapelle, B. Schölkopf, and A. Zien, editors. Semi-Supervised Learning. MIT Press, 2006. Book review in IEEE Transactions on Neural Networks, 20(3):542–542, 2009. [4] T. Clanuwat, M. Bober-Irizar, A. Kitamoto, A. Lamb, K. Yamamoto, and D. Ha. Deep learning for classical Japanese literature. arXiv preprint arXiv:1812.01718, 2018. [5] M. C. du Plessis, G. Niu, and M. Sugiyama. Convex formulation for learning from positive and unlabeled data. In ICML, pages 1386–1394, 2015. [6] M. C. du Plessis, G. Niu, and M. Sugiyama. Analysis of learning from positive and unlabeled data. In NeurIPS, 2014. [7] C. Elkan and K. Noto. Learning classifiers from only positive and unlabeled data. In KDD, 2008. [8] L. Feng, T. Kaneko, B. Han, G. Niu, B. An, and M. Sugiyama. Learning with multiple complementary labels. In ICML, 2020. [9] B. Han, J. Yao, G. Niu, M. Zhou, I. Tsang, Y. Zhang, and M. Sugiyama. Masking: A new perspective of noisy supervision. In NeurIPS, 2018. [10] B. Han, Q. Yao, X. Yu, G. Niu, M. Xu, W. Hu, I. Tsang, and M. Sugiyama. Co-teaching: Robust training of deep neural networks with extremely noisy labels. In NeurIPS, 2018. [11] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016. [12] Y.-G. Hsieh, G. Niu, and M. Sugiyama. Classification from positive, unlabeled and biased negative data. In ICML, 2019. [13] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In CVPR, 2017. [14] T. Ishida, G. Niu, W. Hu, and M. Sugiyama. Learning from complementary labels. In NeurIPS, 2017. [15] T. Ishida, G. Niu, A. Menon, and M. Sugiyama. Complementary-label learning for arbitrary losses and models. In ICML, 2019. [16] T. Ishida, G. Niu, and M. Sugiyama. Binary classification from positive-confidence data. In NeurIPS, 2018. [17] R. Jin and Z. Ghahramani. Learning with multiple labels. In NeurIPS, 2002. [18] T. Kaneko, I. Sato, and M. Sugiyama. Online multiclass classification based on prediction margin for partial feedback. arXiv preprint arXiv:1902.01056, 2019. [19] Y. Kim, J. Yim, J. Yun, and J. Kim. NLNL: Negative learning for noisy labels. In ICCV, 2019. [20] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015. [21] R. Kiryo, G. Niu, M. C. du Plessis, and M. Sugiyama. Positive-unlabeled learning with non-negative risk estimator. In NeurIPS, 2017. [22] A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2012. [23] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998. [24] N. Lu, G. Niu, A. K. Menon, and M. Sugiyama. On the minimal supervision for training any binary classifier from only unlabeled data. In ICLR, 2019. [25] N. Lu, T. Zhang, G. Niu, and M. Sugiyama. Mitigating overfitting in supervised classification from two unlabeled datasets: A consistent risk correction approach. In AISTATS, 2020. [26] V. Nagarajan and J. Z. Kolter. Uniform convergence may be unable to explain generalization in deep learning. In NeurIPS, 2019. [27] N. Natarajan, I. S. Dhillon, P. K. Ravikumar, and A. Tewari. Learning with noisy labels. In NeurIPS, 2013. [28] G. Niu, M. C. du Plessis, T. Sakai, Y. Ma, and M. Sugiyama. Theoretical comparisons of positive-unlabeled learning against positive-negative learning. In NeurIPS, 2016. [29] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala. PyTorch: An imperative style, high-performance deep learning library. In NeurIPS, pages 8024–8035, 2019. [30] G. Patrini, A. Rozza, A. Krishna Menon, R. Nock, and L. Qu. Making deep neural networks robust to label noise: A loss correction approach. In CVPR, 2017. [31] T. Sakai, M. C. du Plessis, G. Niu, and M. Sugiyama. Semi-supervised classification based on classification from positive and unlabeled data. In ICML, 2017. [32] T. Sakai, G. Niu, and M. Sugiyama. Semi-supervised AUC optimization based on positive-unlabeled learning. Machine Learning, 107(4):767–794, 2018. [33] V. Vapnik. Principles of risk minimization for learning theory. In NeurIPS, 1992. [34] X. Xia, T. Liu, N. Wang, B. Han, C. Gong, G. Niu, and M. Sugiyama. Are anchor points really indispensable in label-noise learning? In NeurIPS, 2019. [35] H. Xiao, K. Rasul, and R. Vollgraf. Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747, 2017. [36] Y. Xu, M. Gong, J. Chen, T. Liu, K. Zhang, and K. Batmanghelich. Generative-discriminative complementary learning. In AAAI, 2020. [37] X. Yu, B. Han, J. Yao, G. Niu, I. W. Tsang, and M. Sugiyama. How does disagreement help generalization against label corruption? In ICML, 2019. [38] X. Yu, T. Liu, M. Gong, and D. Tao. Learning with biased complementary labels. In ECCV, 2018. [39] C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals. Understanding deep learning requires rethinking generalization. In ICLR, 2017. [40] Z.-H. Zhou. A brief introduction to weakly supervised learning. National Science Review, 5(1):44–53, 2017. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52787 | - |
dc.description.abstract | 在弱監督學習領域中,無偏風險估計量(unbiased risk estimator)在訓練資料與測試資料分佈不一致的情況下常被用於訓練分類器。然而,在複雜的模型中無偏風險估計量經常發生過擬合現象,例如深度學習的情境。本論文針對弱監督學習中的互補標籤學習問題(learning with complementary labels)進行探討,分析無偏風險估計量過擬合的成因。透過一系列實驗分析,我們發現即使無偏風險估計量能產生無偏梯度,但產生的梯度方向與目標梯度方向有顯著差距,且變異數值大。本論文提出一個新的代理損失架構(Surrogate Complementary Loss,簡稱SCL),針對互補標籤學習問題使用最小似然估計設計損失函數,有效降低梯度變異並使梯度方向更接近目標梯度方向。實驗數據顯示SCL方法有效降低過擬合現象,並在多個資料集達成更高的分類準確度。 | zh_TW |
dc.description.abstract | In weakly supervised learning, the unbiased risk estimator (URE) is a powerful tool for training classifiers when the training and test data are drawn from different distributions. Nevertheless, UREs lead to overfitting in many problem settings when the models are as complex as deep networks. In this thesis, we investigate the reasons for such overfitting by studying a weakly supervised problem called learning with complementary labels. We argue that the quality of gradient estimation matters more than that of risk estimation during risk minimization. Theoretically, we show that a URE yields an unbiased gradient estimator (UGE). Practically, however, UGEs may suffer from huge variance, which keeps the empirical gradients far from the true gradients throughout minimization. To address this, we propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance and aligns the empirical gradients more closely with the true gradients in direction. Thanks to this characteristic, SCL successfully mitigates the overfitting issue and improves upon URE-based methods. (An illustrative sketch of both loss types is given after this record.) | en
dc.description.provenance | Made available in DSpace on 2021-06-15T16:27:38Z (GMT). No. of bitstreams: 1 U0001-0608202006243200.pdf: 2156581 bytes, checksum: a2167447ffe44712a52a2832839db32f (MD5) Previous issue date: 2020 | en |
dc.description.tableofcontents | 口試委員會審定書 (Committee Approval) ii 誌謝 (Acknowledgements) iii 摘要 (Abstract in Chinese) iv Abstract v 1 Introduction 1 1.1 Complementary Labels 1 1.2 Empirical Risk Minimization 2 1.2.1 Learning with Ordinary Labels 2 1.2.2 Learning with Complementary Labels 3 1.3 Motivation 5 1.4 Contributions 6 1.5 Overview 7 1.6 Experiment Settings 8 2 Problems of Unbiased Risk Estimators 9 2.1 Overfitting and Negative Risk 9 2.2 Experiments 11 3 Proposed Method 13 3.1 Minimum Likelihood Estimator 13 3.2 Surrogate Complementary Loss (SCL) 15 3.3 Experiments 17 4 Discussion 19 4.1 Gradient Direction Analysis 19 4.2 Bias-Variance Tradeoff 21 5 Conclusion 28 Bibliography 29 | |
dc.language.iso | en | |
dc.title | 互補標籤學習之代理損失架構研究 | zh_TW |
dc.title | A New Surrogate Loss Framework for Complementary Learning | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | 碩士 (Master) | |
dc.contributor.oralexamcommittee | 陳縕儂(Yun-Nung Chen),陳尚澤(Shang-Tse Chen) | |
dc.subject.keyword | 互補標籤, 代理損失 | zh_TW
dc.subject.keyword | Complementary Label, Surrogate Loss | en
dc.relation.page | 32 | |
dc.identifier.doi | 10.6342/NTU202002508 | |
dc.rights.note | 有償授權 (paid authorization) | |
dc.date.accepted | 2020-08-17 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
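The English abstract contrasts URE-based training with the proposed SCL framework. The following is a minimal sketch of the two loss types it describes, not taken from the thesis itself: it assumes the uniform complementary-label generation and the general URE form of Ishida et al. [15], cross-entropy as the base loss, and a negative-log surrogate −log(1 − p_ȳ(x)) as one representative SCL instance. The names `ure_loss` and `scl_nl_loss` are illustrative only.

```python
import torch
import torch.nn.functional as F


def ure_loss(logits: torch.Tensor, comp_labels: torch.Tensor) -> torch.Tensor:
    """Unbiased risk estimator for complementary labels (uniform assumption,
    following Ishida et al. [15]):
        R_hat = mean( -(K - 1) * ell(f(x), ybar) + sum_j ell(f(x), j) ),
    instantiated here with cross-entropy as ell. Unbiased in expectation, but
    the negative term lets the empirical risk drop below zero, the overfitting
    symptom the abstract refers to."""
    k = logits.shape[1]
    log_probs = F.log_softmax(logits, dim=1)
    loss_bar = -log_probs.gather(1, comp_labels.unsqueeze(1)).squeeze(1)  # ell at the complementary label
    loss_all = -log_probs.sum(dim=1)                                      # sum of ell over all K labels
    return (-(k - 1) * loss_bar + loss_all).mean()


def scl_nl_loss(logits: torch.Tensor, comp_labels: torch.Tensor) -> torch.Tensor:
    """One surrogate complementary loss: ell_bar = -log(1 - p_ybar(x)).
    Biased as a risk estimate but non-negative and lower-variance; its
    gradient only pushes probability mass away from the complementary class."""
    probs = F.softmax(logits, dim=1)
    p_bar = probs.gather(1, comp_labels.unsqueeze(1)).squeeze(1)
    return -torch.log(1.0 - p_bar + 1e-12).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    logits = torch.randn(8, 5, requires_grad=True)   # batch of 8, K = 5 classes
    ybar = torch.randint(0, 5, (8,))                 # random complementary labels, for illustration
    for name, fn in (("URE", ure_loss), ("SCL-NL", scl_nl_loss)):
        loss = fn(logits, ybar)
        (grad,) = torch.autograd.grad(loss, logits)
        print(f"{name}: loss = {loss.item():+.4f}, gradient norm = {grad.norm().item():.4f}")
```

Under this sketch, the −(K − 1) term is what allows the URE to turn negative on finite samples, while the surrogate stays bounded below by zero; this mirrors the bias-for-variance trade the abstract attributes to SCL.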
Appears in Collections: | 資訊工程學系 (Department of Computer Science and Information Engineering)
Files in This Item:
File | Size | Format | |
---|---|---|---|
U0001-0608202006243200.pdf (currently not authorized for public access) | 2.11 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.