NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19819
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 歐陽彥正
dc.contributor.author: Po-Chun Wu [en]
dc.contributor.author: 吳柏均 [zh_TW]
dc.date.accessioned: 2021-06-08T02:21:06Z
dc.date.copyright: 2015-08-21
dc.date.issued: 2015
dc.date.submitted: 2015-08-20
dc.identifier.citation:
[1] S. Ahmad, M. M. Gromiha, and A. Sarai. Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information. Bioinformatics, 20(4):477–486, 2004.
[2] T. L. Bailey, N. Williams, C. Misleh, and W. W. Li. MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic acids research, 34(suppl 2):W369–W373, 2006.
[3] A. Barski, S. Cuddapah, K. Cui, T.-Y. Roh, D. E. Schones, Z. Wang, G. Wei, I. Chepelev, and K. Zhao. High-resolution profiling of histone methylations in the human genome. Cell, 129(4):823–837, 2007.
[4] X. Chen, H. Xu, P. Yuan, F. Fang, M. Huss, V. B. Vega, E. Wong, Y. L. Orlov, W. Zhang, J. Jiang, et al. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell, 133(6):1106–1117, 2008.
[5] 1000 Genomes Project Consortium et al. An integrated map of genetic variation from 1,092 human genomes. Nature, 491(7422):56–65, 2012.
[6] F. Crick et al. Central dogma of molecular biology. Nature, 227(5258):561–563, 1970.
[7] F. H. Crick, J. S. Griffith, and L. E. Orgel. Codes without commas. Proceedings of the National Academy of Sciences of the United States of America, 43(5):416, 1957.
[8] M. Gao and J. Skolnick. DBD-Hunter: a knowledge-based method for the prediction of DNA–protein interactions. Nucleic acids research, 36(12):3978–3992, 2008.
[9] N. Guillon, F. Tirode, V. Boeva, A. Zynovyev, E. Barillot, and O. Delattre. The oncogenic EWS-FLI1 protein binds in vivo GGAA microsatellite sequences with potential transcriptional activation function. PLoS One, 4(3), 2009.
[10] T. Li, Q.-Z. Li, S. Liu, G.-L. Fan, Y.-C. Zuo, and Y. Peng. PreDNA: accurate prediction of DNA-binding sites in proteins by integrating sequence and geometric structure information. Bioinformatics, 29(6):678–685, 2013.
[11] C.-K. Lin and C.-Y. Chen. PiDNA: predicting protein–DNA interactions with structural models. Nucleic acids research, 41(W1):W523–W530, 2013.
[12] A. Mathelier, X. Zhao, A. W. Zhang, F. Parcy, R. Worsley-Hunt, D. J. Arenillas, S. Buchman, C.-y. Chen, A. Chou, H. Ienasescu, et al. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. Nucleic acids research, page gkt997, 2013.
[13] E. Portales-Casamar, S. Thongjuea, A. T. Kwon, D. Arenillas, X. Zhao, E. Valen, D. Yusuf, B. Lenhard, W. W. Wasserman, and A. Sandelin. JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles. Nucleic acids research, page gkp950, 2009.
[14] N. Siva. 1000 Genomes project. Nature biotechnology, 26(3):256–256, 2008.
[15] G. Tuteja, P. White, J. Schug, and K. H. Kaestner. Extracting transcription factor targets from ChIP-Seq data. Nucleic acids research, 37(17):e113–e113, 2009.
[16] A. Valouev, D. S. Johnson, A. Sundquist, C. Medina, E. Anton, S. Batzoglou, R. M. Myers, and A. Sidow. Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. Nature methods, 5(9):829–834, 2008.
[17] D. Vlieghe, A. Sandelin, P. J. De Bleser, K. Vleminckx, W. W. Wasserman, F. Van Roy, and B. Lenhard. A new generation of JASPAR, the open-access repository for transcription factor binding site profiles. Nucleic acids research, 34(suppl 1):D95–D97, 2006.
[18] W.-J. Welboren, M. A. van Driel, E. M. Janssen-Megens, S. J. van Heeringen, F. C. Sweep, P. N. Span, and H. G. Stunnenberg. ChIP-Seq of ERα and RNA polymerase II defines genes differentially responding to ligands. The EMBO journal, 28(10):1418–1428, 2009.
[19] Y. Zhang, T. Liu, C. A. Meyer, J. Eeckhoute, D. S. Johnson, B. E. Bernstein, C. Nusbaum, R. M. Myers, M. Brown, W. Li, et al. Model-based analysis of ChIP-Seq (MACS). Genome biology, 9(9):R137, 2008.
[20] H. Zhao, Y. Yang, and Y. Zhou. Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function. Bioinformatics, 26(15):1857–1863, 2010.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19819
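dc.description.abstract [zh_TW]: Gene regulation is an indispensable mechanism for maintaining cellular function, so how biological systems regulate gene expression has long been a research topic of great interest to scientists. Gene regulation operates at many levels, including control of gene expression, mRNA transcription and splicing, and post-translational modification of proteins. This thesis focuses on how the interaction between transcription factors and double-stranded DNA activates or represses gene expression. Of the 3 billion bases in the human genome, the segments currently known to be biologically meaningful, such as genes and transcription factor binding sites (TFBSs), make up only a small fraction of the DNA. How a transcription factor recognizes its binding sites and thereby regulates genes is therefore a very important research question. The DNA segment bound by a transcription factor is roughly 5 to 15 nucleotides long, and the strength of binding between a transcription factor and its binding site may also affect the expression of the target gene it regulates.
The Human Genome Project was launched in the 1990s. Constrained by the technology of the time, and after a large investment of funding and manpower, it completed a draft sequence of the 3 billion bases across the 23 pairs of human chromosomes in 2001, a major milestone in human genomics. With advances in biotechnology and falling computing costs, sequencing technology has progressed rapidly. In 2008 the 1000 Genomes Project was launched with the aim of sequencing the genomes of more than a thousand people within three years using faster and more convenient sequencing technologies. In 2012, sequencing results for 1,092 genomes were published; to date, the project's latest public dataset contains genome sequencing data for 2,504 individuals. The completion of the human genome draft allowed researchers to begin high-throughput screening of transcription factor binding sites, and the growing body of personal genome data provides rich material for examining individual differences in those sites. The purpose of this thesis is to use 1000 Genomes data to investigate individual variation in transcription factor binding sites and to explore its potential for future genetic testing.
This thesis collects binding site data for 34 human transcription factors from the public JASPAR database and combines them with sequence variation data from the 1000 Genomes Project to study individual variation in transcription factor binding sites. The analysis shows that only about 3% of the base positions in JASPAR-annotated binding sites exhibit individual variation, and that the positions that do vary are not fully consistent with the established motif of the corresponding transcription factor: some variations occur at positions where the motif implies that variation is not tolerated. To explain this inconsistency, the thesis uses the online tool PiDNA, which predicts binding motifs from protein-DNA complexes, to look for minor forms of binding site sequence patterns that may have been overlooked. Finally, the thesis discusses how, in future personal genetic diagnosis, existing bioinformatics tools and public databases can be used to assess the importance of sequence variants in transcription factor binding sites, in the hope of providing a reference for future applications in personalized medicine.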
dc.description.abstract [en]: Gene regulation is essential for maintaining cellular functions, so how biological systems regulate gene expression is an important research topic. Gene regulation operates at many levels, including control of gene expression, mRNA transcription and splicing, and post-translational modification. This study explores how the interaction between transcription factors and double-stranded DNA activates or represses gene expression. Among the three billion base pairs of the human genome, biologically significant fragments such as genes and transcription factor binding sites account for only a small portion of the DNA. Transcription factor binding motifs are about 5 to 15 nucleotides long. How transcription factors recognize their binding sites and thereby regulate genes is therefore a very important research question. The binding strength between a transcription factor and its binding site may also affect the expression of the regulated gene.
The Human Genome Project was launched in the 1990s. Limited by the technology of the time, the project consumed a great deal of money and manpower. In 2001 it completed a draft sequence of the 23 pairs of human chromosomes, three billion bases in total, a considerable milestone in human genome research. With the development of biotechnology and the falling cost of computation, genome sequencing technology began to advance rapidly. The 1000 Genomes Project started in 2008, planning to use faster and more convenient sequencing technology to sequence more than a thousand human genomes within three years. In 2012, a total of 1,092 human genomes were published. The latest dataset released by the project contains genome data for 2,504 individuals. The completion of the human genome draft allows researchers to perform high-throughput screening of transcription factor binding sites, and the growing number of individual genome datasets provides a wealth of material for examining how transcription factor binding sites differ between individuals. The objective of this study is to use data from the 1000 Genomes Project to explore individual variations in transcription factor binding sites and their possible applications in genetic testing.
This study collected the binding site data of 34 human transcription factors from the JASPAR database and combined this information with the variant data of the 1000 Genomes Project to explore individual variations in transcription factor binding sites. The analysis shows that only about 3% of the positions in JASPAR-annotated transcription factor binding sites exhibit individual variation. Furthermore, the positions with individual variations are not fully consistent with the original motifs of the transcription factor binding sites: some variations occur at positions where the corresponding motif implies that variation is not tolerated.
To investigate the rationale behind this inconsistency, this study used an online tool named PiDNA, which predicts the binding motif of a DNA-binding protein from protein-DNA complex structures. These predicted motifs were used to explore potential minor forms of binding site patterns that might previously have been overlooked. Finally, the study discusses future applications in personal genetic diagnosis, and how existing bioinformatics tools and public databases can be used to assess the importance of variants observed in transcription factor binding sites. It is expected that this study can provide useful insights for individual genetic testing in personalized medicine.
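The overlap analysis summarized in the abstract (what share of annotated binding-site positions carry at least one observed variant) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's actual pipeline: the function name `variant_fraction`, the tuple layouts, and the toy coordinates are all hypothetical, and real inputs would come from JASPAR site annotations and 1000 Genomes variant calls.

```python
def variant_fraction(sites, variant_positions):
    """Fraction of unique binding-site base positions that carry a variant.

    sites: iterable of (chrom, start, end), 1-based inclusive (assumed layout).
    variant_positions: iterable of (chrom, pos), 1-based (assumed layout).
    """
    # Index variant positions by chromosome for fast membership tests.
    variants_by_chrom = {}
    for chrom, pos in variant_positions:
        variants_by_chrom.setdefault(chrom, set()).add(pos)

    # Collect every base position covered by at least one site;
    # overlapping sites are counted once.
    covered = set()
    for chrom, start, end in sites:
        for pos in range(start, end + 1):
            covered.add((chrom, pos))

    if not covered:
        return 0.0

    variable = sum(
        1 for chrom, pos in covered
        if pos in variants_by_chrom.get(chrom, set())
    )
    return variable / len(covered)


# Toy data (hypothetical coordinates, for illustration only).
toy_sites = [("chr1", 100, 109), ("chr1", 105, 114), ("chr2", 50, 59)]
toy_variants = [("chr1", 103), ("chr1", 200), ("chr2", 59)]
toy_fraction = variant_fraction(toy_sites, toy_variants)
```

With the toy data, the sites cover 25 unique positions and 2 of them carry a variant. Counting unique positions rather than per-site positions is a design choice made here for simplicity; the thesis may count differently.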
dc.description.provenance [en]: Made available in DSpace on 2021-06-08T02:21:06Z (GMT). No. of bitstreams: 1
ntu-104-R00945020-1.pdf: 5059317 bytes, checksum: 864c360ee586ee15d56bd35863bf271c (MD5)
Previous issue date: 2015
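dc.description.tableofcontents:
Oral Examination Committee Certification
Abstract (Chinese)
Abstract
Chapter 1. Introduction
Chapter 2. Literature Review
2.1 The Central Dogma of Molecular Biology
2.1.1 Transcription Factors
2.1.2 Promoters
2.2 Chromatin Immunoprecipitation Sequencing (ChIP-seq)
2.3 The 1000 Genomes Project
2.4 The JASPAR Database
2.5 PiDNA: A Tool for Predicting Protein Binding Site Motifs
Chapter 3. Methods and Data
3.1 JASPAR Human Transcription Factor Data
3.2 1000 Genomes Data
3.3 Reference Genome Data
3.4 Information Entropy
3.5 Relative Entropy of Position Frequency Matrices
Chapter 4. Results and Discussion
4.1 Basic Statistics
4.1.1 Statistics of the 1000 Genomes Project Data
4.1.2 Transcription Factor Binding Site Data Provided by JASPAR
4.1.3 Variation in JASPAR Transcription Factor Binding Sites
4.2 Summation of Information Entropy for Transcription Factor Binding Sites
4.3 Comparison of Transcription Factor Binding Motifs Across Samples
4.4 Cross-Sample Variation of Binding Motifs at Specific Base Positions
4.5 Further Investigation of Binding Sites with Conflicting Information Entropy
4.5.1 Binding Sites of Transcription Factor ELK4 with Conflicting Information Entropy
Chapter 5. Conclusion
References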
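The table of contents names information entropy and the relative entropy of position frequency matrices as methods, but this record does not give the formulas. The sketch below uses the standard textbook definitions (Shannon entropy in bits and Kullback-Leibler divergence, computed per motif column); whether the thesis uses exactly these forms is an assumption, and all function names and the pseudocount parameter are hypothetical.

```python
import math

BASES = "ACGT"


def column_probs(counts, pseudocount=0.0):
    """Turn one position-frequency-matrix column (base -> count) into probabilities."""
    total = sum(counts[b] + pseudocount for b in BASES)
    return {b: (counts[b] + pseudocount) / total for b in BASES}


def entropy(probs):
    """Shannon entropy of a column in bits; 0 for a fully conserved column."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def information_content(probs):
    """Information content in bits relative to a uniform background (maximum 2)."""
    return 2.0 - entropy(probs)


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits for two columns.

    Requires q[b] > 0 wherever p[b] > 0; a pseudocount on q is one way to ensure it.
    """
    return sum(p[b] * math.log2(p[b] / q[b]) for b in BASES if p[b] > 0)


# Toy columns (hypothetical counts, for illustration only).
conserved = column_probs({"A": 20, "C": 0, "G": 0, "T": 0})
uniform = column_probs({"A": 5, "C": 5, "G": 5, "T": 5})
```

A column with high information content is one where the motif implies little tolerance for variation, so a variant observed at such a position is the kind of conflict the abstract describes.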
dc.language.iso: zh-TW
dc.title: 利用千人基因組資料探討轉錄因子結合位個體化差異 [zh_TW]
dc.title: Investigating Variations of Transcription Factor Binding Sites by 1000 Genomes Data [en]
dc.type: Thesis
dc.date.schoolyear: 103-2
dc.description.degree: Master's
dc.contributor.coadvisor: 陳倩瑜
dc.contributor.oralexamcommittee: 陳沛隆, 蘇中才
dc.subject.keyword: transcription factor, transcription factor binding site, transcription factor binding motif, individual variation, 1000 Genomes Project [zh_TW]
dc.subject.keyword: transcription factor, transcription factor binding site, transcription factor binding motif, 1000 Genomes Project [en]
dc.relation.page: 38
dc.rights.note: Not authorized
dc.date.accepted: 2015-08-20
dc.contributor.author-college: College of Electrical Engineering and Computer Science [zh_TW]
dc.contributor.author-dept: Graduate Institute of Biomedical Electronics and Bioinformatics [zh_TW]
Appears in collections: Graduate Institute of Biomedical Electronics and Bioinformatics

Files in this item:
File: ntu-104-1.pdf (not currently authorized for public access)
Size: 4.94 MB
Format: Adobe PDF


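Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.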
