Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/4996
Title: | 以RNA序列資料偵測差異表現基因的統計方法比較 Comparisons of Statistical Methods for Detecting Differentially Expressed Genes with RNA-seq Data |
Author: | Hung-Ting Lu 呂泓廷 |
Advisor: | 林菀俞 |
Keywords: | Count data, Differentially expressed genes, Monte-Carlo simulation, Next-generation sequencing, RNA Sequencing |
Year of publication: | 2014 |
Degree: | Master's |
Abstract: | With the development of next-generation sequencing (NGS) technologies, RNA sequencing (RNA-seq) experiments are revolutionizing genomic studies. Compared with existing microarray technology, RNA-seq usually provides more precise signals, and it is expected to replace microarrays as sequencing costs decrease. Several statistical methods for RNA-seq data analysis have been developed, including edgeR, DESeq2, baySeq, TSPM, NOISeq, SAMseq, Limma, EBSeq, and PoissonSeq. These nine methods are widely used in current RNA-seq data analyses. However, they differ in their normalization strategies, their distributional assumptions for read counts, their statistical tests for detecting differentially expressed genes, and their procedures for controlling the false discovery rate. How to choose a more powerful method remains an open question. In this thesis, we provide a systematic summary and comparison of these nine methods. In addition to Monte-Carlo simulation studies, we present two real-data analyses. In our simulations, Limma and baySeq generally performed best among the nine methods. Limma also required less computation time than baySeq, making it the more promising of the nine methods for analyzing RNA-seq data. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/4996 |
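Full-text license: | Authorized (open to the public worldwide) |
Appears in collections: | Institute of Epidemiology and Preventive Medicine |
Files in this item:
File | Size | Format | |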
---|---|---|---|
ntu-103-1.pdf | 1.85 MB | Adobe PDF | View/Open |
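Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.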