Study on Inter-raters Reliability under the Latent Trait Model through Derangement (潛在特質模型錯排分派量測下評分者間信度之探討)

Shan-Pang Liu; 劉繕榜

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43752

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳宏(Hung Chen)
dc.contributor.author	Shan-Pang Liu	en
dc.contributor.author	劉繕榜	zh_TW
dc.date.accessioned	2021-06-15T02:27:40Z	-
dc.date.available	2009-08-19
dc.date.copyright	2009-08-19
dc.date.issued	2009
dc.date.submitted	2009-08-17
dc.identifier.citation	Agresti, A. (2002), Categorical Data Analysis, New Jersey: John Wiley & Sons, Inc., 2nd edition. Bickel, P.J. and Doksum, K.A. (2001), Mathematical Statistics: Basic Ideas and Selected Topics, New Jersey: Prentice-Hall, Inc., 2nd edition. DeCarlo, L.T. (2005), 'A model of rater behavior in essay grading based on signal detection theory', Journal of Educational Measurement, 42(1), 53-76. Engelhard, G., Jr., Gordon, B., and Curtin, D. (1994), 'Constructing rater and writing task banks for the assessment of written composition', earlier version of a paper presented at the Annual Meeting of the American Educational Research Association. Fleiss, J.L., Levin, B., and Paik, M.C. (2003), Statistical Methods for Rates and Proportions, New Jersey: John Wiley & Sons, Inc., 3rd edition. Flury, B. (1997), A First Course in Multivariate Statistics, New York: Springer-Verlag. Hambleton, R.K. and Swaminathan, H. (1985), Item Response Theory: Principles and Applications, Boston: Kluwer-Nijhoff Publishing. Johnson, V.E. and Albert, J.H. (1999), Ordinal Data Modeling, New York: Springer-Verlag. Kraemer, H.C., Periyakoil, V.S., and Noda, A. (2002), 'Tutorial in biostatistics: kappa coefficients in medical research', Statistics in Medicine, 21, 2109-2129. McDonald, R.P. (1999), Test Theory: A Unified Treatment, Mahwah, New Jersey: Lawrence Erlbaum Associates. Olsson, U. (1979), 'Maximum likelihood estimation of the polychoric correlation coefficient', Psychometrika, 44(4), 443-460. Olsson, U., Drasgow, F., and Dorans, N.J. (1982), 'The polychoric correlation coefficient', Psychometrika, 47(3), 337-347. Reeve, B.B. (2002), An Introduction to Modern Measurement Theory, National Cancer Institute. Rosen, K.H. (1995), Discrete Mathematics and Its Applications, New York: McGraw-Hill, Inc., 3rd edition. Serfling, R.J. (2002), Approximation Theorems of Mathematical Statistics, New York: John Wiley & Sons, Inc. Tallis, G.M. (1962), 'The maximum likelihood estimation of correlation from contingency tables', Biometrics, 18(3), 342-353. Uebersax, J.S. (2000a), 'Latent class models for analyzing agreement', URL: http://www.john-uebersax.com/stat/lcm.htm. Uebersax, J.S. (2000b), 'Latent trait models for rater agreement', URL: http://www.john-uebersax.com/stat/ltrait.htm. Uebersax, J.S. (2006), 'The tetrachoric and polychoric correlation coefficients', Statistical Methods for Rater Agreement web site, URL: http://www.john-uebersax.com/stat/tetra.htm. Uebersax, J.S. and Grove, W.M. (1993), 'A latent trait finite mixture model for the analysis of rating agreement', Biometrics, 49(3), 823-835.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43752	-
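dc.description.abstract	The purpose of this study is to examine inter-rater reliability in the rating of examinees' latent traits when a large group of examinees is assigned to raters by derangement. Under the latent trait model (LTM), the polychoric correlation coefficient is used as the inter-rater reliability. We argue that assigning about three hundred thousand examinees to a few hundred raters by derangement ensures that the examinees graded in common by two raters number at least one hundred. Under this setting, we find that all raters are divided into several cycle groups. Analytic arguments and the results of 100 simulations show that the number of cycle groups formed mostly does not exceed ten, that the proportion of runs with at least one 2-cycle or 3-cycle is 0.59, and that cycle groups containing more than 100 raters arise frequently. Using the Kolmogorov-Smirnov test, we find that the distribution of the latent traits of the examinees assigned to each rater mostly comes from the standard normal distribution; only a few groups of examinees differ from the others, and the difference lies in the mean. Under the LTM, we regard the discrimination parameter as an indicator of a rater's rating accuracy. We also show that the correlation between a rater's grades and the examinees' latent traits depends on the thresholds and the discrimination parameter. The correlation coefficient between two raters' perceived latent variables is the product of their discrimination parameters and is estimated in two stages via the polychoric correlation coefficient. The thresholds are obtained from the proportions of grades given by each rater, while the discrimination parameters are derived from the polychoric correlation coefficients and an appropriate derangement assignment. Finally, we summarize the results of this study and offer suggestions.	zh_TW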
dc.description.abstract	We investigate inter-rater reliability when the abilities of a large number of examinees are classified into ordinal grades by raters who are assigned examinees through derangement. Under the latent trait model (LTM), the polychoric correlation coefficient is used as the measure of inter-rater reliability. To ensure that at least hundreds of examinees are graded in common by two raters when there are a few hundred raters and about three hundred thousand examinees, we assign examinees to raters through derangement. Under this setting, all raters are grouped into several cycles. Analytic arguments and simulation show that the number of cycles is usually no more than ten, that the probability of obtaining at least one cycle of size 2 or 3 is close to 0.59, and that the largest cycle often contains more than one hundred raters. We also find that the distributions of the latent trait of the examinees assigned to different raters are close to one another up to a location shift. Under the LTM, the discrimination parameter can be regarded as a measure of rating accuracy. The correlation between the grades given by a rater and the latent trait of the examinees depends on the interaction between the thresholds and the discrimination parameter. The correlation coefficient between the perceived latent variables of two raters is the product of their discrimination parameters, and the polychoric correlation coefficient can be estimated by a two-stage method. Each rater's thresholds are estimated from the proportions of grades that rater assigns, while the discrimination parameters can be estimated through an appropriate derangement. Finally, we summarize the results and offer some suggestions.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T02:27:40Z (GMT). No. of bitstreams: 1 ntu-98-R92221028-1.pdf: 1984056 bytes, checksum: 640f5008fbad2aa74977af5f1f2b2db2 (MD5) Previous issue date: 2009	en
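dc.description.tableofcontents	Chinese Abstract; English Abstract; Chapter 1 Introduction: 1.1 Research motivation and writing measurement models; 1.2 Item response theory and rater rating models; 1.3 Latent trait theory, latent class models, and assumptions on raters' ratings; 1.4 Rating agreement among raters; 1.5 Research questions; Chapter 2 Assignment schemes and tests of normality of examinees' latent traits: 2.1 Assessment network; 2.2 Assignment schemes and the number of examinees graded in common by raters; 2.3 Normality test under random equal assignment; 2.4 Normality test under a single random assignment with group exchange; Chapter 3 Measurement model assumptions for raters and estimation of thresholds: 3.1 Definition and appropriateness of the model assumptions; 3.2 Correlation between ratings and the latent trait variable; 3.3 Inter-rater reliability and assumptions of the measurement model; 3.4 Step 1: Estimation of raters' thresholds; Chapter 4 Estimation of inter-rater reliability: 4.1 Estimation of the correlation coefficient rho between two raters; 4.2 Consistency of rho; 4.3 Estimation and testing; 4.4 Derangement schemes and estimation of the rater discrimination parameter beta; Chapter 5 Conclusions and discussion: 5.1 Conclusions; 5.2 Limitations of the study; 5.3 Discussion and suggestions; References; Appendix: A Recurrence relation for the number of possible cycle groups formed by a derangement of n raters; B R programs for exchanging examinee groups and testing the normality of latent traits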
dc.language.iso	zh-TW
dc.subject	錯排	zh_TW
dc.subject	信度	zh_TW
dc.subject	polychoric 相關係數	zh_TW
dc.subject	潛在特質模型	zh_TW
dc.subject	等級門檻	zh_TW
dc.subject	reliability	en
dc.subject	derangement	en
dc.subject	thresholds	en
dc.subject	latent trait model	en
dc.subject	polychoric correlation coefficient	en
dc.title	潛在特質模型錯排分派量測下評分者間信度之探討	zh_TW
dc.title	Study on Inter-raters Reliability under the Latent Trait Model through Derangement	en
dc.type	Thesis
dc.date.schoolyear	97-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	江金倉,蕭朱杏,譚克平
dc.subject.keyword	信度,polychoric 相關係數,潛在特質模型,等級門檻,錯排,	zh_TW
dc.subject.keyword	reliability,polychoric correlation coefficient,latent trait model,thresholds,derangement,	en
dc.relation.page	67
dc.rights.note	Paid authorization
dc.date.accepted	2009-08-17
dc.contributor.author-college	College of Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Mathematics	zh_TW
Appears in Collections:	Department of Mathematics
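
The cycle structure described in the abstract can be explored with a small simulation. The following is a minimal sketch, not the thesis's own code (its appendix uses R): it assumes the derangement of raters is drawn uniformly at random, samples it by rejection, and counts cycle lengths. The function names and the defaults (300 raters, 100 trials) are illustrative assumptions.

```python
import random


def random_derangement(n, rng):
    """Draw a uniformly random derangement of range(n) by rejection sampling."""
    if n < 2:
        raise ValueError("a derangement needs at least 2 elements")
    perm = list(range(n))
    while True:
        rng.shuffle(perm)
        # Accept only permutations with no fixed point.
        if all(perm[i] != i for i in range(n)):
            return list(perm)


def cycle_lengths(perm):
    """Return the lengths of the cycles of a permutation, in order of discovery."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths


def simulate(n_raters=300, n_trials=100, seed=0):
    """Summarize the cycle structure over repeated random derangements (assumed setup)."""
    rng = random.Random(seed)
    total_cycles = 0
    small = 0  # trials with at least one cycle of size 2 or 3
    large = 0  # trials whose largest cycle has more than 100 raters
    for _ in range(n_trials):
        lengths = cycle_lengths(random_derangement(n_raters, rng))
        total_cycles += len(lengths)
        small += any(k in (2, 3) for k in lengths)
        large += max(lengths) > 100
    return {
        "mean_cycles": total_cycles / n_trials,
        "p_small_cycle": small / n_trials,
        "p_large_cycle": large / n_trials,
    }


if __name__ == "__main__":
    print(simulate())
```

Under the uniform-derangement assumption, a Poisson approximation to the cycle counts suggests that the chance of at least one cycle of size 2 or 3 is roughly 1 - exp(-5/6), or about 0.57, which is in line with the value of 0.59 reported from 100 simulations.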

Files in this item:

File	Size	Format
ntu-98-1.pdf (not authorized for public access)	1.94 MB	Adobe PDF

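Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.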