Inferring Protein-Nucleic Acid Binding Types on Protein Sequences Based on Predicted Specific and Non-specific DNA-Binding Residues

Chun-Chin Huang; 黃俊欽

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23072

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	黃乾綱
dc.contributor.author	Chun-Chin Huang	en
dc.contributor.author	黃俊欽	zh_TW
dc.date.accessioned	2021-06-08T04:40:32Z	-
dc.date.copyright	2009-08-14
dc.date.issued	2009
dc.date.submitted	2009-08-13
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23072	-
dc.description.abstract	Protein-DNA interactions are typically involved in important biochemical processes such as DNA transcription, replication, transmission of genetic information, and genetic recombination. Protein-DNA binding can be divided into sequence-specific binding and non-specific binding. Sequence-specific binding recognizes particular DNA base pairs; non-specific binding, on the other hand, mainly interacts with the sugar-phosphate portion of DNA. The first stage of this thesis addresses binding-residue prediction. For sequence-specific binding residues, the classifier achieves 96.45% accuracy, 50.14% sensitivity, 99.31% specificity, 81.70% precision, and an F-measure of 62.15%; the non-specific binding predictor achieves 89.14% accuracy, 53.06% sensitivity, 95.25% specificity, 65.47% precision, and an F-measure of 58.62%. In addition, combining the two prediction results with an OR operation yields 89.26% accuracy, 56.86% sensitivity, 95.63% specificity, 71.92% precision, and an F-measure of 63.51%. The second stage of the thesis investigates prediction of the protein-DNA binding mode; the multi-class support vector machine designed for this purpose achieves 75.83% accuracy. This thesis presents a sequence-based predictor that, for transcription factors, predicts sequence-specific and non-specific binding residues with the DNA-binding mechanism taken into account. The protein-DNA binding mode predictor was developed to give biochemists additional structural hints and to further improve residue prediction performance. In addition, the experience gained in this study will be applied to binding-residue prediction for protein types other than transcription factors.	zh_TW
dc.description.abstract	Protein-DNA interactions are essential for fundamental biochemical activities, including DNA transcription, replication, packaging, repair, and rearrangement. Proteins that interact with DNA can be classified into two binding modes: sequence-specific and non-specific. Sequence-specific binding provides a mechanism for recognizing the correct nucleotide base pairs. Non-specific binding, on the other hand, shows relatively little base-sequence preference and interacts with the DNA backbone. In this thesis, we present a two-stage protein-DNA binding prediction. In the first stage, DNA-binding residue prediction, the predictor for sequence-specific binding residues achieves 96.45% accuracy with 50.14% sensitivity, 99.31% specificity, 81.70% precision, and 62.15% F-measure. The predictor for non-specific binding residues achieves 89.14% accuracy with 53.06% sensitivity, 95.25% specificity, 65.47% precision, and 58.62% F-measure. In addition, when the sequence-specific and non-specific predictions from this stage are combined with an OR operation, the combined predictor achieves 89.26% accuracy with 56.86% sensitivity, 95.63% specificity, 71.92% precision, and 63.51% F-measure. In the second stage, we propose a protein-DNA interaction mode predictor, which achieves 75.83% accuracy using a multi-class support vector machine. This thesis presents the design of a sequence-based predictor that identifies the sequence-specific and non-specific DNA-binding residues in a transcription factor while taking the DNA-binding mechanism into account. The protein-DNA interaction mode prediction was introduced to give biochemists more structural hints and to help improve the preceding DNA-binding residue prediction. In addition, we will draw on the experience gained in this study to design binding-mechanism-aware predictors for other types of DNA-contacting proteins.	en
dc.description.provenance	Made available in DSpace on 2021-06-08T04:40:32Z (GMT). No. of bitstreams: 1 ntu-98-R96525072-1.pdf: 1856387 bytes, checksum: 46e0cf1c704a501b5a8828df17b9519e (MD5) Previous issue date: 2009	en
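dc.description.tableofcontents	Acknowledgements i; Abstract (Chinese) ii; ABSTRACT iii; Glossary of Terms v; Contents vi; List of Figures viii; List of Tables x; Chapter 1 Introduction 1; Chapter 2 Related Work 5; 2.1 Specific and Non-specific Binding 5; 2.2 Literature Review of Prediction Methods 8; 2.3 Obtaining the Dataset 10; 2.4 Defining Binding Residues 12; 2.5 Defining Protein-DNA Binding Modes 13; 2.6 Classifier Package: LIBSVM 18; 2.7 Other Tools and Terminology 21; Chapter 3 Methods 25; 3.1 Experimental Framework and Classifiers 25; 3.2 Feature Selection and Vector Encoding 27; 3.3 Data Normalization 30; 3.4 Validation Method 30; 3.5 Independent Test 31; Chapter 4 Results and Discussion 34; 4.1 Optimal Parameter Selection 34; 4.2 Performance Evaluation 34; Chapter 5 Conclusion 49; References 53; Appendix 58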
dc.language.iso	zh-TW
dc.subject	transcription factor	zh_TW
dc.subject	binding residue prediction	zh_TW
dc.subject	sequence-specific binding	zh_TW
dc.subject	non-specific binding	zh_TW
dc.subject	support vector machine	zh_TW
dc.subject	sequence-specific binding	en
dc.subject	transcription factor	en
dc.subject	support vector machine	en
dc.subject	non-specific binding	en
dc.subject	DNA-binding residues prediction	en
dc.title	Inferring Protein-Nucleic Acid Binding Types on Protein Sequences Based on Predicted Specific and Non-specific DNA-Binding Residues	zh_TW
dc.title	Prediction of Transcription Factor Domain based on Analysis of Specific and non-Specific DNA-Binding Residues on the Protein Sequence	en
dc.type	Thesis
dc.date.schoolyear	97-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	歐陽彥正,張天豪,張瑞益
dc.subject.keyword	binding residue prediction,sequence-specific binding,non-specific binding,support vector machine,transcription factor	zh_TW
dc.subject.keyword	DNA-binding residues prediction,sequence-specific binding,non-specific binding,support vector machine,transcription factor	en
dc.relation.page	60
dc.rights.note	Not authorized
dc.date.accepted	2009-08-13
dc.contributor.author-college	College of Engineering	zh_TW
dc.contributor.author-dept	Graduate Institute of Engineering Science and Ocean Engineering	zh_TW
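
The F-measure values quoted in the abstract are consistent with the harmonic mean of the reported precision and sensitivity. The sketch below is not taken from the thesis: it only checks that arithmetic and illustrates one plausible reading of the per-residue OR combination. The function names and the toy label vectors are hypothetical.

```python
def f_measure(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def or_combine(specific, non_specific):
    """Flag a residue as binding if either predictor flags it (assumed reading of 'OR')."""
    return [int(a or b) for a, b in zip(specific, non_specific)]


def confusion_metrics(predicted, actual):
    """Accuracy, sensitivity, specificity and precision for 0/1 residue labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# (precision %, sensitivity %, reported F-measure %) as quoted in the abstract.
REPORTED = {
    "sequence-specific": (81.70, 50.14, 62.15),
    "non-specific": (65.47, 53.06, 58.62),
    "OR-combined": (71.92, 56.86, 63.51),
}

if __name__ == "__main__":
    for name, (precision, sensitivity, reported_f) in REPORTED.items():
        recomputed = f_measure(precision, sensitivity)
        print(f"{name}: recomputed F = {recomputed:.2f}%, reported F = {reported_f:.2f}%")

    # Toy per-residue labels, purely illustrative (not thesis data).
    specific_pred = [1, 0, 0, 1, 0, 0]
    non_specific_pred = [0, 1, 0, 1, 0, 0]
    true_labels = [1, 1, 1, 0, 0, 0]
    combined = or_combine(specific_pred, non_specific_pred)
    print(combined, confusion_metrics(combined, true_labels))
```

Because the OR rule can only add positive predictions, the combined predictor's sensitivity is never lower than that of either input predictor, which matches the direction of the figures in the abstract.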
Appears in Collections:	Department of Engineering Science and Ocean Engineering

Files in This Item:

File	Size	Format
ntu-98-1.pdf (not authorized for public access)	1.81 MB	Adobe PDF


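Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.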