Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3772
Title: | Pinpointing Rare Causal Variants with the Adaptive Combination of Q-values Method (以Q值適性結合法來指出罕見致病變異) |
Author: | Jen-Yi Li 李貞儀 |
Advisor: | 林菀俞 |
Keywords: | neutral variants, rare variants, causal variants, non-synonymous variants, next-generation sequencing |
Publication year: | 2016 |
Degree: | Master's |
Abstract: | Over the past decade, genome-wide association studies (GWAS) have identified thousands of single-nucleotide polymorphisms (SNPs) associated with complex diseases. Advances in next-generation sequencing (NGS) technology have allowed geneticists to observe more genetic information on human chromosomes, making the search for rare causal variants (minor allele frequency, MAF, below 1%) increasingly feasible.
To pinpoint rare causal variants among a large number of variants, statistical approaches such as the backward elimination (BE) procedure and the adaptive combination of P-values (ADA) method have been developed. Previous work has shown that the signal-to-noise ratio of variants identified by ADA is higher than that of variants identified by BE. In this study, we propose the adaptive combination of Q-values (ADAQ) method to further increase the probability that a finding is genuine. When synonymous / non-synonymous annotations are available for the variants, we first allocate all variants into a synonymous group and a non-synonymous group, and then transform the per-site P-values within each group into Benjamini-Hochberg Q-values (B-H Q-values). We then remove the variants with larger Q-values, which are more likely to be neutral variants. Comprehensive simulations show that ADAQ has a higher positive predictive value than ADA.
We also applied ADAQ to the Genetic Analysis Workshop 17 (GAW 17) data sets, where it controlled the number of false positives more effectively and yielded a higher positive predictive value than ADA. We therefore recommend ADAQ for pinpointing individual rare causal variants when synonymous / non-synonymous annotations are available. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3772 |
DOI: | 10.6342/NTU201602253 |
Full-text license: | Authorized (publicly available worldwide) |
Appears in collections: | Institute of Epidemiology and Preventive Medicine |
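The grouping and Q-value step described in the abstract can be illustrated with a minimal Python sketch. It covers only the group-wise Benjamini-Hochberg transformation followed by a single fixed cutoff; it does not reproduce the adaptive combination step of ADAQ or the thesis's actual implementation. The function names, the annotation labels `"syn"` / `"nonsyn"`, and the threshold `0.05` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def bh_q_values(p_values):
    """Return Benjamini-Hochberg Q-values in the same order as the input."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that Q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q


def groupwise_q_values(p_values, annotations):
    """Apply the B-H transformation separately within each annotation group."""
    groups = defaultdict(list)
    for i, label in enumerate(annotations):
        groups[label].append(i)
    q = [None] * len(p_values)
    for indices in groups.values():
        group_q = bh_q_values([p_values[i] for i in indices])
        for i, value in zip(indices, group_q):
            q[i] = value
    return q


def keep_small_q(q_values, threshold):
    """Indices of variants whose Q-value is at most the (hypothetical) cutoff."""
    return [i for i, q in enumerate(q_values) if q <= threshold]


if __name__ == "__main__":
    # Made-up per-site P-values and annotations, for illustration only.
    p = [0.01, 0.04, 0.03, 0.20]
    labels = ["nonsyn", "nonsyn", "syn", "syn"]
    q = groupwise_q_values(p, labels)
    print(q)
    print(keep_small_q(q, 0.05))
```

Because each group is adjusted on its own, a variant's Q-value depends on the number and strength of signals among variants with the same annotation, not on the full set of variants.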
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-105-1.pdf | 1.85 MB | Adobe PDF | View/Open |
Unless their copyright terms are specifically stated, all items in this system are protected by copyright, with all rights reserved.