On Multiple Hypotheses Testing for Genomic Studies (多重檢定校正方法於處理基因體資料之剖析)

Wan-Yu Lin; 林菀俞

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/22588

Title:	On multiple hypotheses testing for genomic studies (多重檢定校正方法於處理基因體資料之剖析)
Author:	Wan-Yu Lin (林菀俞)
Advisor:	Wen-Chung Lee (李文宗)
Keywords:	false discovery rate, gene expression, microarray, multiple comparisons, multiple hypothesis testing, simultaneous inference
Year of publication:	2010
Degree:	Doctoral
Abstract:	Using genome-wide association data and gene expression data, this thesis examines the setting in which researchers have prior knowledge available: all single-nucleotide polymorphisms or gene expression measurements are roughly divided into two groups according to that prior knowledge, and a false discovery rate (FDR) control procedure is then applied to each group separately. Because the two kinds of data differ in nature, the methods suited to each also differ. In addition, we use an empirical Bayes approach to construct interval estimates under multiple testing. As the number of tests increases, the average coverage probability of the empirical Bayes confidence intervals approaches the nominal level, and the intervals are shorter than traditional confidence intervals, indicating improved precision. Beyond raising relative efficiency and shortening the intervals, a greater benefit is that the approach simultaneously adjusts for multiple testing, which traditional confidence intervals cannot do. We further apply the method to genetic association data on age-related macular degeneration (AMD). This empirical Bayes interval estimation for multiple testing can be applied in epidemiology and genetic epidemiology, giving researchers a method that is easy to implement, precise, and adjusted for multiple testing.
In genomic studies, we are often confronted with a large number of genes or markers such as single-nucleotide polymorphisms (SNPs). Examples include genome-wide association studies (GWASs) and gene expression data analyses. The substantial amount of data produced by current high-throughput technologies has brought both opportunities and difficulties for statisticians. As the number of SNPs runs into the millions, multiple-testing adjustment becomes a challenge. To counteract such a harsh multiple-testing penalty, we incorporate prior knowledge to facilitate discoveries in a GWAS on age-related macular degeneration. We also propose a floating prioritized subset analysis, which serves as a powerful method for detecting differentially expressed genes. Furthermore, we develop a simple method to improve the efficiency of an epidemiological study. The effects in a study are assumed to arise from an unspecified prior distribution with an unknown mean and an unknown variance. We use the observed log odds ratios (logORs) and their variances to estimate the prior mean and the prior variance. The posterior distribution is asymptotically normal regardless of the prior distribution of these effects. On this basis, we propose a simple formula to tighten the confidence limits of the many logORs reported in a study. We evaluate the performance of our method by simulation studies and find that the proposed method can indeed boost efficiency while maintaining the average coverage probability at the desired level. The larger the ratio of the prior variance to the average variance of the logORs, the fewer logORs are needed to guarantee the coverage probability. For example, when the ratio of the prior variance to the average variance of the logORs is 2, the efficiency of our method relative to the traditional confidence interval method is 1.5, and roughly 50 logORs can guarantee the coverage probability. We further apply our method to a genetic epidemiological study on age-related macular degeneration to demonstrate the tightening of the confidence limits. Our method is easy to implement and is recommended for improving the efficiency of a study.
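The interval-tightening idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the thesis's formula, so the code below assumes the standard normal-normal empirical Bayes shrinkage with method-of-moments estimates of the prior mean and variance; the function names `eb_intervals` and `relative_efficiency` are hypothetical, and the sketch ignores the uncertainty in the estimated prior parameters. Under that assumption the relative efficiency equals 1 + 1/ratio, which reproduces the abstract's figure of 1.5 when the ratio is 2.

```python
import math
from statistics import NormalDist


def eb_intervals(log_ors, variances, level=0.95):
    """Illustrative normal-normal empirical Bayes intervals for many logORs.

    Assumption (not taken from the thesis): the prior mean and variance are
    estimated by the method of moments and then treated as known.
    """
    k = len(log_ors)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two logORs and one variance per logOR")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Prior mean: simple average of the observed logORs.
    mu = sum(log_ors) / k
    # Marginal variance of an observed logOR is (prior variance + sampling
    # variance), so subtract the average sampling variance; truncate at zero.
    sample_var = sum((y - mu) ** 2 for y in log_ors) / (k - 1)
    tau2 = max(sample_var - sum(variances) / k, 0.0)

    z = NormalDist().inv_cdf(0.5 + level / 2)
    results = []
    for y, v in zip(log_ors, variances):
        b = tau2 / (tau2 + v)            # weight kept on the observed logOR
        post_mean = mu + b * (y - mu)    # estimate shrunk toward the prior mean
        post_sd = math.sqrt(b * v)       # never larger than sqrt(v)
        results.append((post_mean, post_mean - z * post_sd, post_mean + z * post_sd))
    return mu, tau2, results


def relative_efficiency(ratio):
    """Traditional variance divided by posterior variance when tau2 / v = ratio."""
    return 1.0 + 1.0 / ratio
```

Calling `eb_intervals(log_ors, variances)` returns the estimated prior mean, the estimated prior variance, and one `(estimate, lower, upper)` tuple per logOR. When the estimated prior variance is truncated to zero, every interval collapses to the prior mean, which is one reason a real analysis would need the more careful treatment described in the thesis.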
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/22588
Full-text authorization:	Not authorized
Appears in department/unit:	Graduate Institute of Epidemiology and Preventive Medicine

Files in this item:

File	Size	Format
ntu-99-1.pdf (not authorized for public access)	763.9 kB	Adobe PDF


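Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.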
