Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15348
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 陳倩瑜 (Chien-Yu Chen) |
dc.contributor.author | Yi-An Tung | en
dc.contributor.author | 童翊安 | zh_TW
dc.date.accessioned | 2021-06-07T17:33:07Z | -
dc.date.available | 2030-01-01 | -
dc.date.copyright | 2020-07-07 |
dc.date.issued | 2020 |
dc.date.submitted | 2020-07-01 |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15348 | -
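dc.description.abstract | Enhancers are an important class of regulatory elements, and many studies have shown that they play a key role in assisting promoters in regulating gene expression in cells. At present, the number of enhancers in the human genome and their activities in different cells remain largely unknown. Previous studies found that enhancer activity is associated with several kinds of functional data, such as histone modifications, sequence features, and chromatin structure and accessibility. In this thesis, we built a deep learning model mainly from DNase and other histone modification data, and used H3K27ac peaks as the enhancer positions in the selected cell types for training and prediction. In addition, this study designed joint training that combines multiple cell types to improve predictive performance. With the proposed deep learning model, accuEnhancer, we demonstrate the feasibility of using a complete feature set and deep learning to predict enhancer activity within a single cell line, where the accuracy and F1 reach 0.97 and 0.9. To go further and predict enhancer activity across cell lines, this thesis proposes integrating data from different cell types to improve cross-cell-type prediction. As more training data from different cell types were combined, the F1 for predicting an independent cell type rose from 0.3 to 0.80. The results indicate that, by merging more cross-cell-type datasets, a deep learning model can capture complex regulatory patterns and deliver better performance. Finally, this study tested the effectiveness of accuEnhancer in predicting the experimentally validated enhancers of the VISTA database. The results show that accuEnhancer outperforms previous methods in predicting experimentally validated enhancers; this thesis therefore explores the feasibility of cross-cell-line, and even cross-species, prediction. | zh_TW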
dc.description.abstract | Enhancers are a class of regulatory elements that assist promoters in modulating gene expression in living cells. At present, the number of enhancers in the human genome and their activities in different cell types are still largely unknown. Previous studies have shown that enhancer activity is associated with functional data such as histone modifications, sequence motifs, and chromatin accessibility. This study used DNase data to build a deep learning model that predicts H3K27ac peaks as active enhancers. Moreover, this thesis proposes joint training on multiple cell types to boost model performance. The analyses first demonstrate the general feasibility of using accuEnhancer to predict within-cell type enhancer activities, where the accuracy and the F1 score reach 0.97 and 0.9, respectively. To further predict cell type-specific enhancers by cross-cell type modeling, we integrated training data from different cell types. The F1 score increased from 0.3 to 0.80 as the model combined more training data from different cell types. These results show that, by incorporating more datasets across cell types, deep neural networks can capture complex regulatory patterns and deliver better performance. Lastly, this study tested the effectiveness of the model in predicting experimentally validated enhancers in the VISTA database. The results indicate that accuEnhancer outperforms previous methods in predicting cell type-specific enhancer activities by cross-cell type modeling. | en
dc.description.provenance | Made available in DSpace on 2021-06-07T17:33:07Z (GMT). No. of bitstreams: 1. U0001-2906202023270600.pdf: 12683118 bytes, checksum: aace81bae36059071749f1ff82f5258a (MD5). Previous issue date: 2020 | en
dc.description.tableofcontents |
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
1. Introduction
2. Background
2.1. The genome
2.1.1. Transcription and translation
2.1.2. Transcription factors
2.1.3. Enhancers
2.2. The epigenome
2.2.1. Histone modifications
2.2.2. Chromatin structure
3. Materials and Methods
3.1. Machine learning concepts and bioinformatics applications
3.1.1. Feature selection
3.2. Deep learning on biological data
3.3. Genomic sequences
3.4. ENCODE database
3.5. Construction of the positive and negative sets
3.6. DeepHaem model backbones
3.7. The architecture of accuEnhancer
3.8. Predicting within cell type active enhancers
3.9. Predicting cross-cell type active enhancers
3.10. VISTA enhancer browser
4. Results
4.1. Predicting within cell type active enhancers
4.2. Impact of utilizing pretrained weights
4.3. Evaluation of feature importance
4.4. Predicting the cross-cell type active enhancers
4.5. Integrating data from multiple cell types helps to identify novel active enhancers
5. Conclusion
6. References
7. Appendices
dc.language.iso | en |
dc.title | 利用深度學習建構跨細胞株模型預測增強子之細胞株特異性活性 | zh_TW
dc.title | Predicting cell type-specific enhancer activities by cross-cell type modeling with deep learning | en
dc.type | Thesis |
dc.date.schoolyear | 108-2 |
dc.description.degree | Doctoral (博士) |
dc.contributor.coadvisor | 歐陽彥正 (Yen-Jen Oyang) |
dc.contributor.oralexamcommittee | 阮雪芬 (Hsueh-Fen Juan), 黃宣誠 (Hsuan-Cheng Huang), 蔡懷寬 (Huai-Kuang Tsai) |
dc.subject.keyword | machine learning, deep learning, enhancer, gene regulatory network, cross-cell-line prediction, enhancer prediction model | zh_TW
dc.subject.keyword | accuEnhancer, machine learning, deep learning, enhancer, gene regulatory network, cross-cell type prediction | en
dc.relation.page | 52 |
dc.identifier.doi | 10.6342/NTU202001200 |
dc.rights.note | Not authorized (未授權) |
dc.date.accepted | 2020-07-01 |
dc.contributor.author-college | College of Life Science (生命科學院) | zh_TW
dc.contributor.author-dept | Genome and Systems Biology Degree Program (基因體與系統生物學學位學程) | zh_TW
dc.date.embargo-terms | 2030-01-01 |
Appears in Collections: Genome and Systems Biology Degree Program (基因體與系統生物學學位學程)

Files in This Item:
File | Size | Format
U0001-2906202023270600.pdf (Restricted Access) | 12.39 MB | Adobe PDF
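
The abstract reports both accuracy and F1 score, and the two can diverge when active enhancers are a small minority of the candidate windows. The sketch below is only an illustration of how the two metrics are computed from binary labels; it is not the thesis's evaluation code, the names `Confusion`, `confusion`, `accuracy` and `f1_score` are hypothetical, and the toy labels are invented for the example rather than taken from the thesis.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary confusion matrix (1 = active enhancer, 0 = not)."""
    tp: int
    fp: int
    fn: int
    tn: int


def confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Confusion:
    """Tally true/false positives and negatives for 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return Confusion(tp, fp, fn, tn)


def accuracy(c: Confusion) -> float:
    """Fraction of all windows that were labelled correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def f1_score(c: Confusion) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# Invented toy data: 100 candidate windows, 10 of them active enhancers.
y_true = [1] * 10 + [0] * 90
# A hypothetical predictor finds 8 of the 10 and raises 2 false alarms.
y_pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 88

c = confusion(y_true, y_pred)
print(f"accuracy={accuracy(c):.2f} f1={f1_score(c):.2f}")
```

Because true negatives dominate the toy set, accuracy stays high even though two of the ten positives are missed; F1 ignores true negatives and is therefore the stricter figure to read when comparing the within-cell type and cross-cell type results.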