A Study on High-dimensional Equivalence Test of Means (高維度平均值對等性檢定之研究)

Chen-Hao Chiu; 邱振豪

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50850

Title:	A Study on High-dimensional Equivalence Test of Means (高維度平均值對等性檢定之研究)
Author:	Chen-Hao Chiu 邱振豪
Advisor:	劉仁沛
Keywords:	Equivalence test, Extreme value distribution, Two-sample test, High-dimensional data, Two one-sided test
Year of publication:	2016
Degree:	Master's
Abstract:	Hotelling's T² test is the canonical multivariate two-sample test. In high-dimensional data, however, the number of variables (the dimension) is typically much larger than the sample size, so the inverse of the sample covariance matrix required by the test statistic cannot be computed and Hotelling's T² test becomes inapplicable. This also complicates related statistical inference, and several improved methods have been proposed. For example, Cai, et al. (2014) proposed a high-dimensional test for the equality of two mean vectors under the assumption of a common covariance matrix, and proved that, as the number of variables tends to infinity, the asymptotic distribution of the test statistic is a Type I extreme-value distribution. Equivalence testing evaluates whether the difference between the mean vectors of two populations falls within equivalence limits specified by the researcher. In applications such as the evaluation of biosimilar products and GMOs, the number of variables may be large while the number of observations is relatively small, yet statistical methods for high-dimensional equivalence testing of means are lacking. In this study, we propose two equivalence tests of means. The first is based on the intersection-union principle and performs an equivalence t-test on every variable. The second, which we call the maximal Z² high-dimensional equivalence test, applies the work of Cai, et al. (2014) and considers only the maximum over the variables. It consists of two procedures, one for the case where the common covariance matrix is known and one for the case where it is unknown, whose derivations and test statistics differ slightly. We present the results of large-scale simulation studies and demonstrate the proposed methods on real high-dimensional data. The simulations show that the intersection-union method, because it requires equivalence to be established for every variable, is so conservative that it almost never rejects the null hypothesis to claim overall equivalence. The maximal Z² test, by contrast, is less stringent because it considers only the maximum. Both of its procedures keep the empirical size close to the nominal significance level and provide sufficient empirical power, indicating that the maximal Z² test is a viable approach that both controls the Type I error rate and detects equivalence when it truly holds.
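The first method described in the abstract (an equivalence t-test on every variable, combined by the intersection-union principle) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `iut_tost`, the symmetric margin `delta`, and the pooled-variance two one-sided t-tests (TOST) are assumptions made for illustration only.

```python
import numpy as np
from scipy import stats


def iut_tost(x, y, delta, alpha=0.05):
    """Intersection-union equivalence test built from per-variable TOST.

    Hypothetical helper, not taken from the thesis.
    x, y  : arrays of shape (n1, p) and (n2, p), one column per variable
    delta : assumed symmetric equivalence margin (scalar or length-p array)
    Returns (overall_equivalent, per_variable_tost_p_values).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]
    df = n1 + n2 - 2

    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled variance for each variable (equal-variance assumption).
    sp2 = ((n1 - 1) * x.var(axis=0, ddof=1)
           + (n2 - 1) * y.var(axis=0, ddof=1)) / df
    se = np.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))

    # Two one-sided tests per variable:
    #   H01: diff <= -delta  (rejected for large t_lower)
    #   H02: diff >= +delta  (rejected for small t_upper)
    t_lower = (diff + delta) / se
    t_upper = (diff - delta) / se
    p_lower = stats.t.sf(t_lower, df)
    p_upper = stats.t.cdf(t_upper, df)
    p_tost = np.maximum(p_lower, p_upper)

    # Intersection-union rule: claim overall equivalence only if
    # every single variable passes its own TOST at level alpha.
    overall = bool(np.all(p_tost < alpha))
    return overall, p_tost
```

Because overall equivalence is claimed only when every per-variable p-value falls below `alpha`, a single failing variable blocks the claim, which is consistent with the conservativeness the abstract reports when the number of variables is large.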
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50850
DOI:	10.6342/NTU201600679
Full-text license:	Paid license
Appears in department:	Department of Agronomy

Files in this item:

File	Size	Format
ntu-105-1.pdf (not currently authorized for public access)	2.05 MB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated.
