Predicting protein-DNA binding trajectories using the relative spatial geometry of protein side chains and DNA bases

Chien-Chih Wang; 王建智

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43490

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳倩瑜(Chien-Yu Chen)
dc.contributor.author	Chien-Chih Wang	en
dc.contributor.author	王建智	zh_TW
dc.date.accessioned	2021-06-15T02:22:21Z	-
dc.date.available	2014-08-20
dc.date.copyright	2009-08-20
dc.date.issued	2009
dc.date.submitted	2009-08-18
dc.identifier.citation	Breslauer, K., R. Frank, H. Blocker, and L. Marky, 1986. Predicting DNA Duplex Stability from the Base Sequence, Proceedings of the National Academy of Sciences. 83(11): p. 3746-3750. Bruschweiler, R., 2003. Efficient RMSD measures for the comparison of two molecular ensembles, PROTEINS-NEW YORK-. 50(1): p. 26-34. Diekmann, S., 1989. Definitions and nomenclature of nucleic acid structure parameters, EMBO journal(Print). 8(1): p. 1-4. Gao, M. and J. Skolnick, 2008. DBD-Hunter: a knowledge-based method for the prediction of DNA-protein interactions, Nucleic Acids Research. 36(12): p. 3978. Gorin, A., V. Zhurkin, and K. Wilma, 1995. B-DNA Twisting Correlates with Base-pair Morphology, Journal of Molecular Biology. 247(1): p. 34-48. Hotelling, H., 1933. Analysis of a complex of statistical variables into principal components, Journal of Educational Psychology. 24(6): p. 417-441. Hsu, C., C. Chen, and B. Liu, 2006. MAGIIC-PRO: detecting functional signatures by efficient discovery of long patterns in protein sequences, Nucleic Acids Research. 34(Web Server issue): p. W356. Johnson, S., 1967. Hierarchical clustering schemes, Psychometrika. 32(3): p. 241-254. Kabsch, W., 1976. A solution for the best rotation to relate two sets of vectors, Crystal Physics, Diffraction, Theoretical and General Crystallography. 32(5): p. 7394. Kabsch, W. and C. Sander, 1983. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features, Biopolymers. 22(12). Li, Z. and H. Scheraga, 1987. Monte Carlo-minimization approach to the multiple-minima problem in protein folding, Proceedings of the National Academy of Sciences. 84(19): p. 6611-6615. Luscombe, N., R. Laskowski, and J. Thornton, 2001. Amino acid-base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level, Nucleic Acids Research. 29(13): p. 2860. Miller, J., 1999. Vector geometry for computer graphics, Computer Graphics and Applications, IEEE. 19(3): p. 66-73. Olson, W., A. Gorin, X. Lu, L. Hock, and V. Zhurkin, DNA sequence-dependent deformability deduced from protein-DNA crystal complexes. 1998, National Acad Sciences. p. 11163-11168. Orr, M., 1996. Introduction to radial basis function networks, Center for Cognitive Science, Scotland, UK. Pearson, K., 1901. On lines and planes of closest fit to systems of points in space, Philosophical Magazine. 2(6): p. 559-572. Sayle, R. and E. Milner-White, 1995. RASMOL: biomolecular graphics for all, Trends in Biochemical Sciences. 20(9): p. 374-376. Sokal, R. and P. Sneath, Principles of numerical taxonomy. 1963: WH Freeman San Francisco. Tompa, M., N. Li, T. Bailey, G. Church, B. De Moor, E. Eskin, A. Favorov, M. Frith, Y. Fu, and W. Kent, 2005. Assessing computational tools for the discovery of transcription factor binding sites, Nature Biotechnology. 23: p. 137-144. van Dijk, M., A. van Dijk, V. Hsu, R. Boelens, and A. Bonvin, 2006. Information-driven protein-DNA docking using HADDOCK: it is a matter of flexibility, Nucleic acids research. 34(11): p. 3317.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43490	-
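dc.description.abstract	Many proteins carry out their functions by recognizing specific or non-specific DNA sequence segments. Because of this importance, computational methods are often used to find amino acids that may bind DNA. Today, many protein-DNA co-crystal complexes can be obtained from the Protein Data Bank (PDB), a protein structure database, to help us further understand how these proteins recognize specific nucleotide sequences. This information is of great help to biologists in predicting where particular proteins bind in DNA sequences, for example in finding transcription factor binding sites. In the future, these data will help us understand gene regulation and gene regulatory networks more accurately. Although the many protein-DNA complexes in current structure databases give a clear view of the geometric relationship between a protein and DNA upon binding, directly predicting the binding mechanism of a protein and DNA without data on the complex is a very difficult task. This thesis aims to propose an algorithm that predicts the DNA binding position and orientation given a known protein structure. First, we use a sequence pattern mining tool (MAGIIC-PRO) to find conserved regions among sequences related to the given protein sequence, and thereby explore the functional regions of the protein. After the functional regions are found, we further select the surface amino acids, and then use a clustering algorithm to pick from this subset the group of amino acids most likely to bind DNA. Principal component analysis (PCA) is then applied to the coordinates of these atoms to predict the direction of the DNA grooves. Experimental results show that the proposed method can successfully predict the DNA groove direction near the selected functional regions; moreover, with a scoring function built around a radial basis function, we can successfully predict the positions in space where bases are most likely to appear. We believe the output of the proposed method will effectively support further studies of protein-DNA interaction, such as protein-DNA docking simulation and the prediction of transcription factor binding sites.	zh_TW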
dc.description.abstract	DNA-binding proteins carry out their functions through specific or non-specific protein-DNA recognition. Identifying DNA-binding residues with computational tools makes it possible to predict or validate protein functions at high throughput. The protein-DNA complexes available in the Protein Data Bank (PDB) further reveal how a DNA-binding protein recognizes its partners. Such information greatly helps biologists determine or predict binding elements in DNA sequences, such as transcription factor binding sites (TFBSs), so that accurate genome-scale regulatory networks can be constructed more efficiently in the near future. Because understanding the mechanism of protein-DNA interaction without a crystal structure of the complex remains challenging, this thesis proposes an algorithm that predicts the binding position and direction of DNA given a known protein structure. First, potential DNA-binding regions of a query protein are predicted by a sequential pattern mining tool, MAGIIC-PRO, which identifies functional regions of a protein by discovering concurrent conserved regions among its related protein sequences. After the functional regions are predicted, we extract the residues on the protein surface and use a hierarchical clustering algorithm to derive potential DNA-binding units, that is, compact conserved regions with high DNA-binding propensity. Principal component analysis (PCA) is then applied to the collected atoms to predict the orientation of the DNA grooves. To derive the positions where DNA bases are likely to be present, we propose a knowledge-based learning procedure that constructs a predictive model of the geometric propensity between protein side chains and DNA bases. The experiments conducted in this thesis show that the orientation of the DNA grooves around the selected conserved regions can be predicted with acceptable error. Furthermore, with a scoring function that uses a radial basis function (RBF) as its kernel, we build spatial distributions of the positions where DNA bases are likely to be present. The computational outputs are expected to provide useful information for subsequent analyses such as protein-DNA docking and TFBS prediction.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T02:22:21Z (GMT). No. of bitstreams: 1 ntu-98-R96631012-1.pdf: 5197046 bytes, checksum: 2b03d0b6c4f5f45a15f4516017300b54 (MD5) Previous issue date: 2009	en
dc.description.tableofcontents	Abstract i; Chinese Abstract iii; Table of Contents v; Table of Figures vii; Table of Tables x; 1. Introduction 1; 2. Literature Review 5; 2.1. Protein-DNA Interactions 5; 2.2. Principal Component Analysis 6; 2.3. Vector Geometry 7; 2.4. Radius Basis Function 8; 2.5. Hierarchical clustering 8; 3. Materials and Methods 10; 3.1. Datasets 11; 3.2. Learning geometric binding propensity of bases for each amino acid 13; 3.3. Extracting potential DNA-binding residues 16; 3.4. Predicting DNA route on proteins 18; 3.6. Scoring Function for base selection 21; 3.7. Removing similar bases from sampling space 22; 4. Results and Discussion 25; 5. Conclusion 49; Reference 51; Appendix A 53; Appendix B 64
dc.language.iso	en
dc.subject	protein-DNA interactions	zh_TW
dc.subject	DNA-binding sites	zh_TW
dc.subject	binding orientation	zh_TW
dc.subject	structure-based prediction	zh_TW
dc.subject	DNA-binding sites	en
dc.subject	protein-DNA interactions	en
dc.subject	structure-based prediction	en
dc.subject	binding orientation	en
dc.title	Predicting protein-DNA binding trajectories using the relative spatial geometry of protein side chains and DNA bases	zh_TW
dc.title	Tracking DNA route on protein structure by knowledge-based learning considering geometric propensity between side chains and bases	en
dc.type	Thesis
dc.date.schoolyear	97-2
dc.description.degree	Master
dc.contributor.oralexamcommittee	蘇中才,張天豪
dc.subject.keyword	DNA-binding sites,binding orientation,structure-based prediction,protein-DNA interactions	zh_TW
dc.subject.keyword	DNA-binding sites,binding orientation,structure-based prediction,protein-DNA interactions	en
dc.relation.page	68
dc.rights.note	Paid license
dc.date.accepted	2009-08-19
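dc.contributor.author-college	College of Bioresources and Agriculture	zh_TW
dc.contributor.author-dept	Graduate Institute of Bio-Industrial Mechatronics Engineering	zh_TW
Appears in collections:	Department of Biomechatronics Engineering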
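
The abstract above states that principal component analysis (PCA) is applied to the collected atoms to predict the orientation of the DNA grooves. The following is a minimal sketch of that one step only, assuming the input is an (N, 3) array of atom coordinates and that the first principal axis serves as the direction estimate. The function name `principal_axis` and the sample coordinates are hypothetical, and the record does not say how the thesis selects atoms or maps the axes onto a groove direction.

```python
import numpy as np

def principal_axis(coords):
    """Return the unit vector of greatest variance for an (N, 3) coordinate array."""
    coords = np.asarray(coords, dtype=float)
    # Center the points so the decomposition describes spread around the centroid.
    centered = coords - coords.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

# Hypothetical atoms lying roughly along the x axis with small y/z scatter.
atoms = np.array([
    [0.0, 0.10, 0.00],
    [1.0, -0.10, 0.05],
    [2.0, 0.05, -0.05],
    [3.0, -0.05, 0.00],
])
axis = principal_axis(atoms)
print(axis)
```

The sign of the returned vector is arbitrary, so any comparison against a reference direction should use the absolute value of the dot product.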

Files in this item:

File	Size	Format
ntu-98-1.pdf (not authorized for public access)	5.08 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.