  1. NTU Theses and Dissertations Repository
  2. College of Bioresources and Agriculture
  3. Department of Agronomy
Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50612
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 蔡政安 (Chen-An Tsai)
dc.contributor.author: Pei-Hsun Lien
dc.contributor.author: 李沛洵 [zh_TW]
dc.date.accessioned: 2021-06-15T12:48:44Z
dc.date.available: 2019-07-26
dc.date.copyright: 2016-07-26
dc.date.issued: 2016
dc.date.submitted: 2016-07-21
dc.identifier.citation: References
1. Tian, L., et al., Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci U S A, 2005. 102(38): p. 13544-9.
2. Subramanian, A., et al., Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A, 2005. 102(43): p. 15545-50.
3. Wang, Z., M. Gerstein, and M. Snyder, RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet, 2009. 10(1): p. 57-63.
4. Nam, D., De-correlating expression in gene-set analysis. Bioinformatics, 2010. 26(18): p. i511-6.
5. Robinson, M.D. and A. Oshlack, A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol, 2010. 11(3): p. R25.
6. Rahmatallah, Y., F. Emmert-Streib, and G. Glazko, Gene set analysis for self-contained tests: complex null and specific alternative hypotheses. Bioinformatics, 2012. 28(23): p. 3073-80.
7. Szekely, G.J. and M.L. Rizzo, Testing for equal distributions in high dimension. InterStat, 2004: p. 2004.
8. Yaari, G., et al., Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations. Nucleic Acids Res, 2013. 41(18): p. e170.
9. Mortazavi, A., et al., Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods, 2008. 5(7): p. 621-8.
10. Bullard, J.H., et al., Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics, 2010. 11: p. 94.
11. Law, C.W., et al., voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol, 2014. 15(2): p. R29.
12. Friedman, J.H. and L.C. Rafsky, Multivariate generalizations of the Wald-Wolfowitz and Smirnov two sample tests. The Annals of Statistics, 1979. 7: p. 697-717.
13. Schafer, J. and K. Strimmer, A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics. Statistical Applications in Genetics and Molecular Biology, 2005. 4(1): p. 32.
14. Rahmatallah, Y., F. Emmert-Streib, and G. Glazko, Comparative evaluation of gene set analysis approaches for RNA-Seq data. BMC Bioinformatics, 2014. 15: p. 397.
15. Pickrell, J.K., et al., Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature, 2010. 464(7289): p. 768-72.
16. Yahav, I., An Elegant Method for Generating Multivariate Poisson Random Variables. arXiv:0710.5670v2 [stat.CO], 2008.
17. Zhang, W., et al., Gene set enrichment analyses revealed differences in gene expression patterns between males and females. In Silico Biol, 2009. 9(3): p. 55-63.
18. Brannon, A.R., et al., Meta-analysis of clear cell renal cell carcinoma gene expression defines a variant subgroup and identifies gender influences on tumor biology. Eur Urol, 2012. 61(2): p. 258-68.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50612
dc.description.abstract: In recent years, RNA-Seq technology has been widely employed for studying the transcriptome because it has clear advantages over other transcriptomic technologies. The most popular application of RNA-Seq is the identification of differentially expressed genes. Gene set analysis (GSA), in turn, aims to determine whether a predefined gene set, whose genes share a common biological function, is correlated with the phenotype. To date, many GSA approaches have been developed for identifying differentially expressed gene sets using microarray data. However, these methods are not directly applicable to RNA-Seq data because of intrinsic differences between the two data structures. When testing the differential expression of gene sets, most GSA methods make the critical assumption that the members of each gene set are sampled independently. This assumption implies that the genes within a gene set do not share a common biological function, which conflicts with how gene sets are defined. To resolve this issue, we propose a GSA method based on the De-correlation (DECO) algorithm of Dougu Nam (2010) that removes the correlation bias in the expression of each gene set. We compare the performance of the proposed method with that of other GSA methods through simulation studies under various scenarios combined with four different normalization methods. We found that the proposed method outperforms the others in terms of Type I error rate and empirical power. [en]
dc.description.provenance: Made available in DSpace on 2021-06-15T12:48:44Z (GMT). No. of bitstreams: 1
ntu-105-R03621207-1.pdf: 4583901 bytes, checksum: fe45314cbbfbc8546ad7cef6a895d0b1 (MD5)
Previous issue date: 2016 [en]
dc.description.tableofcontents: Contents
Abstract (in Chinese) ii
Abstract (in English) iii
1. Introduction 1
1.1 Brief Overview of RNA-Seq technology 2
1.2 RNA-Seq data 3
1.3 Gene Set Analysis in RNA-Seq 4
2. Background 6
2.1 Methods for Gene Set Analysis 7
2.2 Normalization methods 11
3. The Proposed Method 16
3.1 DECO: an algorithm to remove correlation from data 17
3.2 Estimating a covariance matrix with a small sample size 19
3.3 Gene Set Analysis with DECO 20
4. Numerical Studies 22
4.1 Simulation of RNA-Seq count data 22
4.2 Biological data set 23
4.3 Simulation study 24
4.4 Type I error rate 26
4.5 Empirical power 28
4.6 Real Data Analysis 29
5. Discussion and Conclusions 32
Tables 36
Figures 41
References 78
dc.language.iso: en
dc.title: 次世代定序資料之基因富集分析 [zh_TW]
dc.title: Gene Set Enrichment Analysis of RNA-Seq data [en]
dc.type: Thesis
dc.date.schoolyear: 104-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 劉仁沛, 蔡欣甫
dc.subject.keyword: RNA-Seq, 基因集分析, 差異表現, DECO理論, 相關偏差 [zh_TW]
dc.subject.keyword: RNA-Seq, gene set analysis, differentially expressed, DECO, correlation bias [en]
dc.relation.page: 80
dc.identifier.doi: 10.6342/NTU201601170
dc.rights.note: Paid authorization (有償授權)
dc.date.accepted: 2016-07-22
dc.contributor.author-college: 生物資源暨農學院 (College of Bioresources and Agriculture) [zh_TW]
dc.contributor.author-dept: 農藝學研究所 (Graduate Institute of Agronomy) [zh_TW]
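
Illustrative note on the abstract: the idea of removing correlation among the genes of a gene set before testing can be sketched as below. This is a generic whitening sketch and not the thesis's implementation or Nam's DECO algorithm as published. The function name `decorrelate`, the fixed `shrinkage` weight, and the choice to shrink the sample correlation matrix toward the identity are all assumptions made for illustration. The shrinkage keeps the matrix invertible when there are fewer samples than genes.

```python
import numpy as np


def decorrelate(X, shrinkage=0.2):
    """Reduce between-gene correlation in a samples x genes matrix.

    Hypothetical helper: whitens standardized data with the inverse
    square root of a shrunken correlation matrix. Per-gene means are
    preserved; per-gene variances are only approximately preserved.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std  # standardize each gene (column)

    # Sample correlation between genes, shrunk toward the identity so
    # that every eigenvalue is at least `shrinkage` (assumed fixed weight).
    R = np.corrcoef(X, rowvar=False)
    R_shrunk = (1.0 - shrinkage) * R + shrinkage * np.eye(R.shape[0])

    # Symmetric inverse square root via eigendecomposition.
    eigval, eigvec = np.linalg.eigh(R_shrunk)
    R_inv_sqrt = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

    # Whiten, then return to the original location and scale of each gene.
    return (Z @ R_inv_sqrt) * std + mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shared = rng.normal(size=(200, 1))           # common factor
    X = 2.0 * shared + rng.normal(size=(200, 5))  # strongly correlated genes
    Y = decorrelate(X, shrinkage=0.1)
    off_diag = ~np.eye(5, dtype=bool)
    print("max |corr| before:", np.abs(np.corrcoef(X, rowvar=False)[off_diag]).max())
    print("max |corr| after: ", np.abs(np.corrcoef(Y, rowvar=False)[off_diag]).max())
```

With `shrinkage` greater than zero the output is not exactly uncorrelated; the residual correlation shrinks as the weight approaches zero, at the cost of numerical stability when samples are few.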
Appears in collections: Department of Agronomy

Files in this item:
File: ntu-105-1.pdf (not currently authorized for public access); Size: 4.48 MB; Format: Adobe PDF
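Unless their copyright terms are specifically indicated otherwise, all items in the system are protected by copyright, with all rights reserved.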