Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15348
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳倩瑜(Chien-Yu Chen) | |
dc.contributor.author | Yi-An Tung | en |
dc.contributor.author | 童翊安 | zh_TW |
dc.date.accessioned | 2021-06-07T17:33:07Z | - |
dc.date.copyright | 2020-07-07 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-07-01 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15348 | - |
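dc.description.abstract | Enhancers are an important class of regulatory elements, and many previous studies have shown that they play a key role in assisting promoters to regulate gene expression in cells. At present, the number of enhancers in the human genome and their activities in different cells remain largely unknown. Earlier studies found that enhancer activity is associated with certain functional data, such as histone modifications, sequence features, and chromatin structure and accessibility. In this thesis, we built a deep learning model mainly from DNase and other histone modification data, using H3K27ac peaks as the enhancer locations in the selected cell types for training and prediction. In addition, this study designed joint training that combines multiple cell types to improve the model's predictive performance. With the proposed deep learning model, accuEnhancer, we demonstrate the feasibility of using a complete feature dataset and deep learning to predict enhancer activity within a single cell line, where the accuracy and F1 reach 0.97 and 0.9. To further predict enhancer activity across cell lines, this thesis proposes integrating data from different cell types to improve cross-cell type prediction performance. As more training data from different cell types were combined, the F1 for predicting an independent cell type rose from 0.3 to 0.80. The results show that, by merging more cross-cell type datasets, the deep learning model can capture complex regulatory patterns and deliver better performance. Finally, this study tested the effectiveness of accuEnhancer in predicting the experimentally validated enhancers in the VISTA database. The results show that accuEnhancer outperforms earlier methods in predicting experimentally validated enhancers; the thesis therefore explores the feasibility of cross-cell line, and even cross-species, prediction. | zh_TW |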
dc.description.abstract | Enhancers are a class of regulatory elements that act as key components in assisting promoters to modulate gene expression in living cells. At present, the number of enhancers in the human genome and their activities in different cell types remain largely unknown. Previous studies have shown that enhancer activity is associated with functional data such as histone modifications, sequence motifs, and chromatin accessibility. This study used DNase data to build a deep learning model that predicts H3K27ac peaks as active enhancers. Moreover, this thesis proposes joint training on multiple cell types to boost model performance. The analyses first demonstrate the general feasibility of using accuEnhancer to predict within-cell type enhancer activities, where the accuracy and the F1 score reach 0.97 and 0.9, respectively. To further predict cell type-specific enhancers by cross-cell type modeling, we integrated training data from different cell types. The F1 score increased from 0.3 to 0.80 as the model combined more training data from different cell types. These results show that, by incorporating more datasets across cell types, deep neural networks can capture complex regulatory patterns and deliver better performance. Lastly, this study tested the effectiveness of the model in predicting experimentally validated enhancers in the VISTA database. The results indicate that accuEnhancer outperforms previous work in predicting cell type-specific enhancer activities by cross-cell type modeling. | en |
dc.description.provenance | Made available in DSpace on 2021-06-07T17:33:07Z (GMT). No. of bitstreams: 1 U0001-2906202023270600.pdf: 12683118 bytes, checksum: aace81bae36059071749f1ff82f5258a (MD5) Previous issue date: 2020 | en |
dc.description.tableofcontents | Chinese Abstract 1; Abstract 2; Table of Contents 4; List of Figures 6; List of Tables 7; 1. Introduction 8; 2. Background 10; 2.1. The genome 10; 2.1.1. Transcription and translation 11; 2.1.2. Transcription factors 12; 2.1.3. Enhancers 12; 2.2. The epigenome 13; 2.2.1. Histone modifications 13; 2.2.2. Chromatin structure 14; 3. Materials and Methods 14; 3.1. Machine learning concepts and bioinformatics applications 14; 3.1.1. Feature selection 15; 3.2. Deep learning on biological data 15; 3.3. Genomic sequences 16; 3.4. ENCODE database 16; 3.5. Construction of the positive and negative sets 17; 3.6. DeepHaem model backbones 19; 3.7. The architecture of accuEnhancer 20; 3.8. Predicting within cell type active enhancers 21; 3.9. Predicting cross-cell type active enhancers 21; 3.10. VISTA enhancer browser 22; 4. Results 22; 4.1. Predicting within cell type active enhancers 22; 4.2. Impact of utilizing pretrained weights 25; 4.3. Evaluation of feature importance 26; 4.4. Predicting the cross-cell type active enhancers 28; 4.5. Integration data from multiple cell types helps to identify novel active enhancers 31; 5. Conclusion 36; 6. References 38; 7. Appendices 46 | |
dc.language.iso | en | |
dc.title | 利用深度學習建構跨細胞株模型預測增強子之細胞株特異性活性 | zh_TW |
dc.title | Predicting cell type-specific enhancer activities by cross-cell type modeling with deep learning | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | Doctoral | |
dc.contributor.coadvisor | 歐陽彥正(Yen-Jen Oyang) | |
dc.contributor.oralexamcommittee | 阮雪芬(Hsueh-Fen Juan),黃宣誠(Hsuan-Cheng Huang),蔡懷寬(Huai-Kuang Tsai) | |
dc.subject.keyword | machine learning, deep learning, enhancer, gene regulatory network, cross-cell line prediction, enhancer prediction model | zh_TW |
dc.subject.keyword | accuEnhancer, machine learning, deep learning, enhancer, gene regulatory network, cross-cell type prediction | en |
dc.relation.page | 52 | |
dc.identifier.doi | 10.6342/NTU202001200 | |
dc.rights.note | Not authorized | |
dc.date.accepted | 2020-07-01 | |
dc.contributor.author-college | College of Life Science | zh_TW |
dc.contributor.author-dept | Genome and Systems Biology Degree Program | zh_TW |
Appears in Collections: | Genome and Systems Biology Degree Program |
Files in This Item:
File | Size | Format | |
---|---|---|---|
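U0001-2906202023270600.pdf (currently not authorized for public access) | 12.39 MB | Adobe PDF |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.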