Enforcing Open Science: A Time-saving Test for Computationally Intensive Reproductions

翁維謙; Wei-Chien Weng

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92682

Title:	Enforcing Open Science: A Time-saving Test for Computationally Intensive Reproductions (開放科學的實踐：檢驗計算密集論文結果可否重現的省時方法)
Author:	翁維謙 Wei-Chien Weng
Advisor:	王道一 Joseph Tao-yi Wang
Keywords:	Decision Making, 10-fold Cross-validation, Model Ensembles, Computational Reproduction, Resampling
Year of publication:	2024
Degree:	Master's
Abstract:	Fišar et al. (2024) find that the reproducibility rate of published results rose after journals made data and code disclosure mandatory. Complementing that finding, this comment demonstrates a time-saving workaround for conducting computationally intensive reproductions. We take He et al. (2022) as an illustrative example; that study estimates 58 prominent models of risky choice at the subject level and shows that model crowds outperform the aggregate best individual model. Using the raw data and code from their replication package, we cannot reproduce the main result, because a hidden coding error causes the model crowds to generate much larger standard errors. Our robustness replication, which runs the same analysis on only a small fraction of the data, not only catches this coding error quickly but also successfully replicates the original finding with high power. Journals can therefore adopt a similar paradigm as a guideline for examining computational reproducibility at a manageable cost.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92682
DOI:	10.6342/NTU202400992
Full-text authorization:	Authorized (open access worldwide)
Appears in collections:	Department of Economics

Files in this item:

File	Size	Format
ntu-112-2.pdf	10.7 MB	Adobe PDF

Items in this repository are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
