A Deep Learning Approach to Learning from Crowds (基於深度學習之群眾外包資料學習問題)

Wei-Li Kao; 高偉立

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73119

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	盧信銘(Hsin-Min Lu)
dc.contributor.author	Wei-Li Kao	en
dc.contributor.author	高偉立	zh_TW
dc.date.accessioned	2021-06-17T07:18:21Z	-
dc.date.available	2024-07-24
dc.date.copyright	2019-07-24
dc.date.issued	2019
dc.date.submitted	2019-07-10
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73119	-
dc.description.abstract	由於深度學習方法對於標記資料的需求較大，透過群眾外包獲取訓練數據成為了一種方便且具有經濟效益的方式。為了處理收集到的標籤中的雜訊，近年來出現了許多彙整標籤並學習分類器的演算法。這些演算法往往會受到專家偏見(expert bias)以及數據偏差(data bias)的影響。具體而言，每個專家可能具有不同的背景和能力，導致他們犯下特定類型的錯誤，另外資料本身的特徵也可能系統性地對全體專家產生影響。我們基於深度學習，提出了一個可以同時推斷專家偏見以及數據偏差並學習分類器的模型。此外，我們也透過實驗證明提出的方法在模擬的資料以及真實世界資料的不同情境中都有良好的效果。	zh_TW
dc.description.abstract	Because deep learning methods require large amounts of labeled data, crowdsourcing has emerged as a convenient and cost-efficient way of obtaining training data. To deal with the noise in the collected labels, many algorithms have been proposed to infer true labels and learn a classifier from the noisy labels. These algorithms are often challenged by expert bias and data bias. Specifically, experts differ in background and ability, which leads each of them to make specific types of mistakes, and an example in a dataset may have features that systematically shift the majority opinion of the experts. We propose a deep learning-based model that simultaneously infers expert bias and data bias while learning a classifier. We empirically show the effectiveness of our model on both simulated and real-world datasets under different scenarios.	en
dc.description.provenance	Made available in DSpace on 2021-06-17T07:18:21Z (GMT). No. of bitstreams: 1 ntu-108-R06725021-1.pdf: 2659314 bytes, checksum: 96d6a284c9ced426c0ecf0d3b6d0bf11 (MD5) Previous issue date: 2019	en
dc.description.tableofcontents	Contents: Acknowledgements (誌謝); Chinese Abstract (中文摘要); Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Literature Review; 2.1 Related studies; 2.1.1 Mechanism Design and Quality Control; 2.1.2 Worker Selection and Task Allocation; 2.2 Algorithms for Learning from Crowds; 2.2.1 Probability-Based Methods; 2.2.2 Deep Learning-Based Methods; Chapter 3 Methodology; 3.1 Intuition; 3.2 Model Design; 3.2.1 Model Structure; 3.2.2 Training; 3.3 Training Tips; 3.3.1 Initialization of parameters; 3.3.2 Feasibility of parameters; 3.3.3 Reducing model complexity; 3.3.4 Accounting for overfitting; Chapter 4 Experiments and Results; 4.1 Baseline Models; 4.2 Datasets and Experiment Design; 4.3 Results and Discussion; 4.4 Additional Experiments; Chapter 5 Conclusion and Future Work; References
dc.language.iso	en
dc.subject	群眾外包	zh_TW
dc.subject	專家偏見	zh_TW
dc.subject	深度學習	zh_TW
dc.subject	群眾學習	zh_TW
dc.subject	數據偏差	zh_TW
dc.subject	Crowdsourcing	en
dc.subject	Learning from Crowds	en
dc.subject	Deep Learning	en
dc.subject	Expert Bias	en
dc.subject	Data Bias	en
dc.title	基於深度學習之群眾外包資料學習問題	zh_TW
dc.title	A DEEP LEARNING APPROACH TO LEARNING FROM CROWDS	en
dc.type	Thesis
dc.date.schoolyear	107-2
dc.description.degree	Master's (碩士)
dc.contributor.oralexamcommittee	陳以錚(Yi-Cheng Chen),簡宇泰(Yu-Tai Chien)
dc.subject.keyword	群眾外包, 群眾學習, 深度學習, 專家偏見, 數據偏差	zh_TW
dc.subject.keyword	Crowdsourcing, Learning from Crowds, Deep Learning, Expert Bias, Data Bias	en
dc.relation.page	51
dc.identifier.doi	10.6342/NTU201901224
dc.rights.note	Paid license (有償授權)
dc.date.accepted	2019-07-10
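dc.contributor.author-college	College of Management (管理學院)	zh_TW
dc.contributor.author-dept	Graduate Institute of Information Management (資訊管理學研究所)	zh_TW
Appears in Collections:	Department of Information Management (資訊管理學系)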

Files in This Item:

File	Size	Format
ntu-108-1.pdf (not authorized for public access)	2.6 MB	Adobe PDF

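Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.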