Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52139
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 蔡政安 | |
| dc.contributor.author | Yu-Shiang Zeng | en |
| dc.contributor.author | 曾禹翔 | zh_TW |
| dc.date.accessioned | 2021-06-15T16:08:32Z | - |
| dc.date.available | 2015-08-25 | |
| dc.date.copyright | 2015-08-25 | |
| dc.date.issued | 2015 | |
| dc.date.submitted | 2015-08-19 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52139 | - |
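| dc.description.abstract | As next-generation sequencing technology develops ever faster and matures, it is now widely used in fields such as medicine, agriculture, and biotechnology. Next-generation sequencing can be used for whole-genome sequencing, for resequencing known species, and for investigating biological theory; one of its important applications is transcriptome sequencing (RNA-seq) data. RNA-seq data are often used to test gene expression levels, and in recent years they have gradually replaced microarray data as an indicator for studying gene expression. However, RNA-seq data are discrete, and their variance tends to exceed the mean, a phenomenon called over-dispersion. The negative binomial model is commonly used to handle over-dispersion, but estimating the parameters of the model involves many statistical methods. DESeq, edgeR, and DSS are commonly used methods for this analysis. However, all of these estimate the parameters with point estimates and do not take uncertainty into account. In this thesis we build two models, a log-linear model and a Bayesian hierarchical model, and use Markov chain Monte Carlo (MCMC) methods to obtain the parameters of interest, which in turn lets us identify differentially expressed genes. Finally, we use simulated data and real data to evaluate DESeq, edgeR, DSS, and our methods. We find that when the numbers of replicates in the groups are close or equal, our log-linear model performs better than the other methods; when the numbers of replicates are extremely unbalanced, we recommend testing with the median estimator. | zh_TW |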
| dc.description.abstract | With the rapid development of next-generation sequencing (NGS) technology, fields such as medical science, agriculture, and biotechnology have advanced considerably. NGS makes whole-genome sequencing and de novo sequencing possible and supports the exploration of biological theory; RNA-seq is one of its core applications. RNA-seq data are used to measure gene expression levels and to test whether a specific gene is differentially expressed. In recent years, RNA-seq has gradually replaced microarray technology as the standard for gene expression testing. However, RNA-seq read counts are discrete and typically over-dispersed (the variance of the data is larger than the mean). The negative binomial model is commonly applied to handle over-dispersion, but parameter estimation remains an issue. Current RNA-seq analysis software such as DESeq, edgeR, and DSS uses only point estimates of the parameters and does not account for the uncertainty in RNA-seq data. Here, we use Markov chain Monte Carlo (MCMC) methods to estimate the parameters relevant to detecting differentially expressed genes. At the end of the thesis, we compare the performance of DESeq, edgeR, DSS, and our method on both simulated and real RNA-seq data. Our log-linear model outperforms DESeq, edgeR, and DSS when the numbers of replicates in the groups are close or equal. When the numbers of replicates are extremely unbalanced, we suggest the median estimator as the appropriate method for detecting differentially expressed genes. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T16:08:32Z (GMT). No. of bitstreams: 1 ntu-104-R02621208-1.pdf: 4843119 bytes, checksum: dd113e6a98dcda218806bc2966988583 (MD5) Previous issue date: 2015 | en |
| dc.description.tableofcontents | Abstract (Chinese); Abstract; 1 Introduction; 1.1 Brief Overview of RNA-seq Studies; 1.2 Challenges in Analysis Methods for RNA-seq Data; 1.3 Contributions of Our Proposed Method; 2 Review of Current Methods; 2.1 The Dispersion Shrinkage for Sequencing (DSS); 2.2 Moderated Statistical Tests for Assessing Differences in Tag Abundance (edgeR); 2.3 Differential Expression Analysis for Sequence Count Data (DESeq); 3 The Proposed Methods; 3.1 Gamma-Poisson Model; 3.2 Log-Linear Model; 4 Numerical Studies; 4.1 Simulation Studies; 4.2 Control of Type I Error Rate; 4.3 Estimation of ϕ; 4.4 Accuracy of DE Detection; 4.5 FDR; 4.6 Test Performance; 4.7 Real Data Analysis; 5 Discussion and Conclusions; Bibliography | |
| dc.language.iso | zh-TW | |
| dc.subject | RNA-seq data | zh_TW |
| dc.subject | Gene expression level | zh_TW |
| dc.subject | Bayesian analysis | zh_TW |
| dc.subject | Log-linear model | zh_TW |
| dc.subject | Gene expression | en |
| dc.subject | Log-linear model | en |
| dc.subject | Bayesian inference | en |
| dc.subject | RNA-seq | en |
| dc.title | Detecting Significant Genes in RNA-seq Data Using Bayesian Analysis Methods | zh_TW |
| dc.title | Identification of Differentially Expressed Genes of RNA-Seq Data based on Bayesian Approaches | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 103-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 劉仁沛,劉力瑜,謝叔蓉 | |
| dc.subject.keyword | RNA-seq data, Gene expression level, Bayesian analysis, Log-linear model | zh_TW |
| dc.subject.keyword | RNA-seq, Gene expression, Bayesian inference, Log-linear model | en |
| dc.relation.page | 84 | |
| dc.rights.note | Paid authorization | |
| dc.date.accepted | 2015-08-19 | |
| dc.contributor.author-college | College of Bioresources and Agriculture | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Agronomy | zh_TW |
| Appears in Collections: | Department of Agronomy | |
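
The over-dispersion described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the thesis: it assumes the standard Gamma-Poisson construction of the negative binomial, in which a count with mean mu and dispersion phi has variance mu + phi * mu^2. The function names, the parameter values (mu = 20, phi = 0.5), the sample size, and the seed are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Suitable for moderate rates only; very large lam would underflow exp(-lam).
    """
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_gamma_poisson(rng, mu, phi):
    """Draw one count whose Poisson rate is Gamma-distributed.

    The Gamma rate has mean mu and variance phi * mu**2, so the resulting
    count is negative binomial with variance mu + phi * mu**2.
    """
    shape = 1.0 / phi   # Gamma shape parameter
    scale = mu * phi    # Gamma scale parameter, so shape * scale == mu
    rate = rng.gammavariate(shape, scale)
    return sample_poisson(rng, rate)


def simulate(mu=20.0, phi=0.5, n=20000, seed=1):
    """Return (sample mean, sample variance) of n simulated counts."""
    rng = random.Random(seed)
    counts = [sample_gamma_poisson(rng, mu, phi) for _ in range(n)]
    return statistics.mean(counts), statistics.variance(counts)


if __name__ == "__main__":
    mean, var = simulate()
    print(f"mean = {mean:.2f}, variance = {var:.2f}")
```

Calling `simulate()` returns the sample mean and sample variance of the simulated counts. With phi > 0 the variance should sit well above the mean, which is the pattern a plain Poisson model (variance equal to mean) cannot reproduce.
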
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-104-1.pdf Restricted Access | 4.73 MB | Adobe PDF | |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
