Development of the peer-evaluation system based on pairwise comparison and belief propagation (基於成對比較和置信傳播的同儕互評系統)

Wei-Chih Chen; 陳唯之

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/17310

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	葉丙成(Ping-Cheng Yeh)
dc.contributor.author	Wei-Chih Chen	en
dc.contributor.author	陳唯之	zh_TW
dc.date.accessioned	2021-06-08T00:06:10Z	-
dc.date.copyright	2020-08-21
dc.date.issued	2020
dc.date.submitted	2020-08-07
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/17310	-
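dc.description.abstract	This thesis studies a peer evaluation system based on pairwise comparison and aims to reduce the number of observed pairwise comparison results needed to predict a ranking with the Bradley-Terry model. Unlike fixed or random sampling, we propose a sampling strategy that implements a Swiss-system tournament with maximum weighted matching, and we show experimentally that this method effectively reduces the amount of data that must be observed while improving ranking accuracy. The Bradley-Terry model is a probabilistic model for predicting the outcomes of pairwise comparisons, and it can also be used to predict the ranking of individuals. We also observe that, when some evaluators misjudge, the Bradley-Terry model takes the erroneous comparison results into account during its iterations and therefore infers a less accurate ranking. The results of pairwise comparisons imply the relative ordering of the individuals. Beyond the new pairing strategy, we propose a new idea: convert the win/loss outcomes of pairwise comparisons into 0s and 1s, borrow the decoding technique of Low-Density Parity-Check (LDPC) codes, and use belief propagation on a bipartite graph to correct potentially erroneous comparison results, so that the Bradley-Terry model predicts a better ranking. Compared with existing algorithms, the proposed method helps reduce the number of evaluations each person must perform and can dynamically generate future rankings from the existing data.	zh_TW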
dc.description.abstract	This thesis aims to increase the accuracy of rankings predicted from pairwise comparisons in a peer evaluation system using the Bradley-Terry model, a probabilistic model that can predict both rankings and the outcomes of pairwise comparisons. To reduce the number of observations the Bradley-Terry model requires, we propose a Swiss-system pairing strategy based on a maximum weighted matching algorithm, which effectively reduces the number of observations and yields a more accurate predicted ranking. Moreover, inspired by Low-Density Parity-Check (LDPC) codes, we propose an error-correcting algorithm that runs belief propagation on a bipartite graph and is specifically tailored to pairwise comparison. We observe that when students make incorrect evaluations, the Bradley-Terry model takes the misjudgments into account and therefore predicts an inaccurate ranking. Since the results of pairwise comparisons imply an ordering among the individuals, in addition to the Swiss system we transform the observed pairwise comparisons into binary received codes and apply the error-correcting algorithm. Through iterative belief propagation on the bipartite graph, information about the observations is passed to the corresponding check nodes, which filter out the inconsistent ordering states; the comparisons are thereby recovered, and the Bradley-Terry model produces a better predicted ranking.	en
dc.description.provenance	Made available in DSpace on 2021-06-08T00:06:10Z (GMT). No. of bitstreams: 1 U0001-0608202012222600.pdf: 2197679 bytes, checksum: 7e8eed4dae910fb14536daf75fb54338 (MD5) Previous issue date: 2020	en
dc.description.tableofcontents	Acknowledgements
    Abstract (Chinese)
    Abstract
    1 Introduction
        1.1 The booming of online learning
        1.2 The challenge in education platforms
        1.3 Peer evaluation system based on pairwise comparison
        1.4 Research purpose
        1.5 Framework
    2 Related Work
        2.1 Related work of peer assessment
        2.2 Advantage of pairwise comparison in peer assessment
        2.3 Bradley-Terry model
            2.3.1 Definition of Bradley-Terry model
            2.3.2 MM algorithm for Bradley-Terry model
        2.4 Peer-evaluation system based on pairwise comparison
            2.4.1 Bradley-Terry model on peer-evaluation system
        2.5 Brief introduction of LDPC code
            2.5.1 Error-correcting procedures
            2.5.2 Representation of Low Density Parity Check Code
            2.5.3 Belief propagation decoding
    3 Pairing Strategy
        3.1 Discussion on pairing strategies
        3.2 Round Robin tournament
        3.3 Swiss-system tournament
            3.3.1 Maximum weighted matching in general graph
            3.3.2 Apply maximum weighted matching in Swiss System tournament
        3.4 Comparison of Round Robin and Swiss system
            3.4.1 Summary
    4 Error-Correcting On Pairwise Comparison
        4.1 Transformation of observed comparison results
            4.1.1 Error states of observations
        4.2 Message passing algorithm
            4.2.1 Residual belief propagation
        4.3 Overview of the proposed bipartite graph H
            4.3.1 Arrangement of check nodes
        4.4 Initial estimate of variable nodes
            4.4.1 Derivation of initial estimate of observed pairwise comparison
            4.4.2 Derivation of initial estimate of observed triple-wise comparison
        4.5 Pairing strategy of comparison
            4.5.1 Design of the proposed bipartite graph H
        4.6 Simulation of the difference of adding belief decoder
            4.6.1 Result analysis
    5 Error Correcting On Pairwise Comparison: An ARQ Version
        5.1 The influence of inconsistent triads
        5.2 Detection approach of the possible errors in pairwise comparisons
        5.3 Design of the proposed bipartite graph H
        5.4 Simulation of the ARQ approach
            5.4.1 Experiment setting
            5.4.2 Experiment result and analysis
    6 Conclusion and Future Work
        6.1 Swiss system pairing strategy
        6.2 Error Correcting On Pairwise Comparisons
        6.3 Future Work
    Appendices
        A Proof of total wins and loses are sufficient statistics for Bradley-Terry model
    Bibliography
dc.language.iso	en
dc.title	基於成對比較和置信傳播的同儕互評系統	zh_TW
dc.title	Development of the peer-evaluation system based on pairwise comparison and belief propagation	en
dc.type	Thesis
dc.date.schoolyear	108-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	賴以威(I-Wei Lai),孔令傑(Ling-Chieh Kung)
dc.subject.keyword	peer evaluation system,Bradley-Terry model,pairwise comparison,Low-Density Parity-Check code,belief propagation,bipartite graph,maximum weighted matching,Swiss-system tournament	zh_TW
dc.subject.keyword	pairwise comparison,Bradley-Terry model,Swiss-system tournament,maximum weighted matching,peer evaluation system,Low-Density Parity-Check (LDPC) code,belief propagation,bipartite graph	en
dc.relation.page	94
dc.identifier.doi	10.6342/NTU202002522
dc.rights.note	Not authorized
dc.date.accepted	2020-08-07
dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Communication Engineering	zh_TW
Appears in Collections:	Graduate Institute of Communication Engineering
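
The abstract describes the Bradley-Terry model as a probabilistic model that predicts both pairwise outcomes and a ranking, and the table of contents lists an MM algorithm for fitting it. The following is a minimal sketch of that idea, not the thesis's own code: it assumes the standard MM update, in which each strength is set to an item's win count divided by the sum of n_ij / (p_i + p_j) over its opponents. The names `fit_bradley_terry` and `win_probability` and the toy comparison data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fit_bradley_terry(n_items, comparisons, n_iter=200, tol=1e-9):
    """Estimate Bradley-Terry strengths with an MM update (illustrative sketch).

    comparisons: list of (winner, loser) index pairs.
    Returns a list of non-negative strengths that sums to 1.
    """
    wins = [0.0] * n_items
    games = defaultdict(float)  # games[(i, j)], i < j: how often i and j met
    for winner, loser in comparisons:
        wins[winner] += 1.0
        games[(min(winner, loser), max(winner, loser))] += 1.0

    p = [1.0 / n_items] * n_items  # start from equal strengths
    for _ in range(n_iter):
        denom = [0.0] * n_items
        for (i, j), n_ij in games.items():
            share = n_ij / (p[i] + p[j])
            denom[i] += share
            denom[j] += share
        # Items that were never compared keep their previous strength.
        new_p = [wins[i] / denom[i] if denom[i] > 0 else p[i]
                 for i in range(n_items)]
        total = sum(new_p)
        new_p = [x / total for x in new_p]  # normalize so strengths sum to 1
        converged = max(abs(a - b) for a, b in zip(new_p, p)) < tol
        p = new_p
        if converged:
            break
    return p


def win_probability(p, i, j):
    """Bradley-Terry probability that item i is preferred over item j."""
    return p[i] / (p[i] + p[j])


# Toy data (assumed): each pair is compared four times, and the stronger
# submission wins three of the four comparisons.
comparisons = ([(0, 1)] * 3 + [(1, 0)] +
               [(1, 2)] * 3 + [(2, 1)] +
               [(0, 2)] * 3 + [(2, 0)])
strengths = fit_bradley_terry(3, comparisons)
ranking = sorted(range(3), key=lambda i: strengths[i], reverse=True)
print(ranking, [round(s, 3) for s in strengths])
```

Sorting by the fitted strengths gives the predicted ranking. In this sketch every item wins at least once and every pair is compared, which keeps the estimates strictly positive; sparser data, such as the reduced observation sets the abstract targets, may need extra care.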

Files in This Item:

File	Size	Format
U0001-0608202012222600.pdf (currently not authorized for public access)	2.15 MB	Adobe PDF


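Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.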