Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41987
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 歐陽彥正 (Yen-Jen Oyang) | -
dc.contributor.author | Ting-Ying Chien | en
dc.contributor.author | 簡廷因 | zh_TW
dc.date.accessioned | 2021-06-15T00:40:40Z | -
dc.date.available | 2008-09-02 | -
dc.date.copyright | 2008-09-02 | -
dc.date.issued | 2008 | -
dc.date.submitted | 2008-08-26 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41987 | -
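dc.description.abstract | Automatically annotating protein functions or sequence signatures on a large scale remains a major challenge in the post-genomic era. In this thesis, we use protein sequence signatures to design a method for predicting the catalytic sites of enzyme sequences. Our method generates protein sequence signatures by motif mining. Each signature contains several important residue blocks, also called conserved segments. These conserved segments often occur together in homologous sequences and have been carefully preserved during evolution, which indicates their importance. Biological experiments show that the catalytic residues of an enzyme are usually scattered across different regions of the protein sequence, so to predict catalytic residues comprehensively, the generated signatures must also span different regions of the sequence. We collect catalytic residue information from the Catalytic Site Atlas (CSA) database to evaluate the performance of the proposed method. The test results show that our method identifies catalytic sites and catalytic residues better than the patterns in the PROSITE database. The method has been implemented as the E1DS website (http://e1ds.csbb.ntu.edu.tw/). E1DS currently holds 5421 sequence signatures, which together cover 932 four-digit EC numbers. On average, for predicting catalytic sites, E1DS achieves a 'correct' rate of 35.5% and a 'success' rate of 49.6%, whereas PROSITE achieves 18.9% and 33.7% respectively, so E1DS performs better on both measures. For predicting catalytic residues, the sensitivity of E1DS is 30.0%, better than that of PROSITE (16.2%), but the specificity of E1DS (96.7%) is lower than that of PROSITE (98.6%). | zh_TW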
dc.description.abstract | Large-scale automatic annotation of protein sequences remains challenging in the post-genomic era. This thesis aims to predict the catalytic sites of enzyme sequences from a repository of protein signatures. The sequence signatures are derived with a motif-based method. The blocks of a signature, also called conserved regions, are composed of the key residues found among homologues. These blocks are conserved during evolution because of their importance to protein function. Biological experiments reveal that an enzyme catalytic site is usually made up of residues that are widely separated in the sequence. To predict catalytic sites comprehensively, the signatures must therefore contain residues that are widely scattered along the sequence. For this reason, we employ a recently developed pattern mining algorithm, WildSpan, to generate enzyme sequence signatures. WildSpan is designed to discover sequence motifs that span a large number of unimportant positions. To measure the performance of our method, we collect the annotated catalytic sites of 831 enzymes from the Catalytic Site Atlas (CSA). The results show that our method identifies catalytic sites and catalytic residues more effectively than the patterns derived from the PROSITE database. The proposed method has been implemented as a web server named E1DS (http://e1ds.csbb.ntu.edu.tw/). E1DS currently contains 5421 sequence signatures that together cover 932 four-digit EC numbers. On average, when predicting catalytic sites, E1DS achieves a 'correct' rate of 35.5% and a 'success' rate of 49.6%, while the 'correct' and 'success' rates of the PROSITE patterns are 18.9% and 33.7% respectively. When predicting catalytic residues, the sensitivity of E1DS is 30.0%, better than that of PROSITE (16.2%), though the specificity of E1DS (96.7%) is slightly lower than that of PROSITE (98.6%). | en
dc.description.provenance | Made available in DSpace on 2021-06-15T00:40:40Z (GMT). No. of bitstreams: 1. ntu-97-R95922108-1.pdf: 1043043 bytes, checksum: 089ea5f2e9d9f16bcf2d78cdd3ee36be (MD5). Previous issue date: 2008 | en
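dc.description.tableofcontents | Acknowledgements i
Chinese Abstract ii
English Abstract iii
Table of Contents v
List of Figures vii
List of Tables viii
Chapter 1 Introduction 1
Chapter 2 Related Work 5
2.1 Predicting Functional Residues 5
2.2 Sequence Alignment Algorithms 13
Chapter 3 Methods 16
3.0 Overview 16
3.1 Data Collection 17
3.2 Signature Construction 18
3.3 Signature Evaluation 19
3.4 Prediction Method 21
Chapter 4 Experiments 24
4.1 Catalytic Residue Dataset 24
4.2 Performance Evaluation 26
Chapter 5 Web Server 29
5.1 Home Page 29
5.2 Result Page 30
5.3 Error Messages 34
Chapter 6 Conclusion 36
References 37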
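dc.language.iso | zh-TW | -
dc.subject | Enzyme function | zh_TW
dc.subject | Protein sequence mining | zh_TW
dc.subject | Catalytic site | zh_TW
dc.subject | Sequence signature | zh_TW
dc.subject | EC number | zh_TW
dc.subject | Catalytic site | en
dc.subject | Enzyme function | en
dc.subject | EC number | en
dc.subject | Signature | en
dc.subject | Sequential pattern mining | en
dc.title | Predicting enzyme catalytic sites by sequence signature mining (利用序列特徵探勘預測酵素催化部位) | zh_TW
dc.title | Prediction of enzyme catalytic sites by sequential pattern mining | en
dc.type | Thesis | -
dc.date.schoolyear | 97-2 | -
dc.description.degree | Master's | -
dc.contributor.oralexamcommittee | 陳倩瑜 (Chien-Yu Chen), 張天豪 (Tien-Hao Chang) | -
dc.subject.keyword | Protein sequence mining, Catalytic site, Sequence signature, EC number, Enzyme function | zh_TW
dc.subject.keyword | Sequential pattern mining, Catalytic site, Signature, EC number, Enzyme function | en
dc.relation.page | 39 | -
dc.rights.note | Paid license | -
dc.date.accepted | 2008-08-27 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW
dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW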
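
The abstract describes signatures built from conserved blocks separated by long runs of unimportant positions, and it reports residue-level sensitivity and specificity. The Python sketch below illustrates both ideas under stated assumptions. The signature format (regex-like blocks plus minimum and maximum gap lengths), the toy sequence, the annotated positions, and the rule that every residue inside a matched block counts as a predicted catalytic residue are all hypothetical. None of this is the WildSpan algorithm or the E1DS implementation, and the 'correct' and 'success' rates are not reproduced because the record does not define them.

```python
import re


def signature_to_regex(blocks, gaps):
    """Compile a gapped signature into a regex with one group per block.

    blocks: regex-like fragments for the conserved blocks (hypothetical format).
    gaps:   (min_len, max_len) wildcard range between adjacent blocks.
    """
    if len(gaps) != len(blocks) - 1:
        raise ValueError("need exactly one gap range between adjacent blocks")
    parts = [f"({blocks[0]})"]
    for (lo, hi), block in zip(gaps, blocks[1:]):
        # Any residues may fill the gap, as long as its length is in range.
        parts.append(f".{{{lo},{hi}}}({block})")
    return re.compile("".join(parts))


def predict_positions(sequence, pattern):
    """Return the 0-based positions covered by the blocks of the first match."""
    match = pattern.search(sequence)
    if match is None:
        return set()
    positions = set()
    for group in range(1, pattern.groups + 1):
        positions.update(range(match.start(group), match.end(group)))
    return positions


def sensitivity_specificity(predicted, annotated, length):
    """Residue-level sensitivity TP/(TP+FN) and specificity TN/(TN+FP)."""
    tp = len(predicted & annotated)
    fn = len(annotated - predicted)
    fp = len(predicted - annotated)
    tn = length - tp - fn - fp
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Toy data, invented for illustration only.
seq = "MKTA" + "GDSGG" + "PL" + "A" * 8 + "HWTC" + "EKL"
pattern = signature_to_regex(["GDSGG", "H..C"], [(5, 15)])
predicted = predict_positions(seq, pattern)
annotated = {6, 19, 24}  # hypothetical catalytic residue positions
sens, spec = sensitivity_specificity(predicted, annotated, len(seq))
print(sorted(predicted), round(sens, 3), round(spec, 3))
```

Predicting whole blocks makes the trade-off in the abstract visible: covering more positions can raise sensitivity, but each extra non-catalytic position counts as a false positive and lowers specificity.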
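Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File | Size | Format
ntu-97-1.pdf (not authorized for public access) | 1.02 MB | Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.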
