Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/26617

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 翁昭旼 (Jau-Min Wong) | |
| dc.contributor.author | Jing-Xu Zheng | en |
| dc.contributor.author | 鄭景旭 | zh_TW |
| dc.date.accessioned | 2021-06-08T07:17:40Z | - |
| dc.date.copyright | 2008-08-05 | |
| dc.date.issued | 2008 | |
| dc.date.submitted | 2008-07-25 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/26617 | - |
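| dc.description.abstract | With the rapid development of biomedicine and of many analysis methods, using text-mining tools to find protein-protein interactions has become increasingly important. Researchers today obtain key information by reading the biomedical literature, but the volume of that literature is growing at a staggering rate. Extracting the information manually would cost a great deal of labor and time, so demand for automatically extracting important information from documents has increased.
Using a shallow parser and taking sentence structure into account, we developed an information system that automatically extracts protein-protein interactions from the literature. The way our system matches the grammatical patterns of sentences differs from traditional approaches. We designed an efficient algorithm and, considering the semantics of each sentence, defined a set of rules for extracting protein interaction relations; each relation distinguishes the acting protein from the protein acted upon. The system consists of the following steps: preprocessing of the medical literature, sentence splitting, tokenization, part-of-speech tagging, protein name recognition, tagging of interaction keywords, prepositions, and conjunctions, and extraction of protein-protein interactions. Finally, we evaluated the system on two test sets: the LLL05 challenge and BioCreAtIvE-PPI. | zh_TW |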
| dc.description.abstract | With the rapid progress of biomedical science and the growing number of analysis methods, many researchers now obtain knowledge about protein-protein interactions from PubMed abstracts. The biomedical literature, however, is already enormous and continues to grow at an exponential rate. Demand for automatic extraction of information from text has therefore been increasing, and text-mining tools that find knowledge such as protein-protein interactions, which is useful for specific analysis tasks, have become critical.
We develop a system that automatically extracts protein-protein interactions from free text using a shallow parser and sentence-structure analysis. The system matches sentences against syntax patterns that typically describe protein-protein interactions. We design an efficient algorithm and a set of rules that extract protein-protein interactions from the syntactic roles of the proteins. Each interaction has an ACTOR (the donor of the action) and an OBJECT (the receiver of the action). The system consists of the following steps: preprocessing, sentence splitting, tokenization, part-of-speech tagging, protein name recognition, tagging of interaction keywords, prepositions, and conjunctions, and protein-protein interaction extraction. Finally, we evaluate the system on two data sets, one derived from the LLL05 challenge and the other from BioCreAtIvE-PPI. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-08T07:17:40Z (GMT). No. of bitstreams: 1 ntu-97-R95548053-1.pdf: 750538 bytes, checksum: 5cb46e0b7543f71ae34c0e32b98e0c44 (MD5) Previous issue date: 2008 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; English Abstract; Table of Contents; List of Figures; List of Tables
Chapter 1 Introduction: 1.1 Research Background and Motivation; 1.2 Research Objectives; 1.3 Thesis Organization
Chapter 2 Related Work: 2.1 Pattern-Matching Research; 2.2 Protein Interaction Methodologies; 2.3 Natural Language Processing
Chapter 3 Materials and Methods: 3.1 Materials; 3.2 System Design; 3.3 System Architecture and Workflow; 3.4 Extraction of Protein-Protein Interactions
Chapter 4 Visual Presentation of Protein Interactions
Chapter 5 Results and Discussion: 5.1 LLL05 Challenge; 5.2 BioCreAtIvE-PPI; 5.3 Discussion
Chapter 6 Conclusion: 6.1 Conclusion; 6.2 Limitations; 6.3 Future Work
References | |
| dc.language.iso | zh-TW | |
| dc.subject | biomedical term recognition | zh_TW |
| dc.subject | protein-protein interaction | zh_TW |
| dc.subject | text mining | zh_TW |
| dc.subject | semantic patterns | zh_TW |
| dc.subject | sentence splitting | zh_TW |
| dc.subject | tokenization | zh_TW |
| dc.subject | part-of-speech tagging | zh_TW |
| dc.subject | part-of-speech | en |
| dc.subject | protein names recognition | en |
| dc.subject | protein-protein interaction | en |
| dc.subject | text mining | en |
| dc.subject | syntax patterns | en |
| dc.subject | sentence splitting | en |
| dc.subject | tokenization | en |
| dc.title | 從醫學文獻摘要擷取蛋白質之間的交互作用 | zh_TW |
| dc.title | Extracting Interactions between Proteins from PubMed Abstract | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 96-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 蔣以仁 (I-Jen Chiang), 陳中明 (Chung-Ming Chen) | |
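| dc.subject.keyword | protein-protein interaction, text mining, semantic patterns, sentence splitting, tokenization, part-of-speech tagging, biomedical term recognition | zh_TW |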
| dc.subject.keyword | protein-protein interaction, text mining, syntax patterns, sentence splitting, tokenization, part-of-speech, protein names recognition | en |
| dc.relation.page | 54 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2008-07-28 | |
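| dc.contributor.author-college | College of Engineering | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Biomedical Engineering | zh_TW |
| Appears in Collections: | Graduate Institute of Biomedical Engineering | |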
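
The processing steps listed in the abstract (sentence splitting, tokenization, tagging of protein names and interaction keywords, then rule-based extraction of ACTOR and OBJECT) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the tiny lexicons, the regular expressions, the two patterns, and every function name below are hypothetical stand-ins chosen only to show the general shape of such a pipeline.

```python
import re
from typing import List, Tuple

# Hypothetical toy lexicons (assumptions for illustration only); a real system
# would rely on a protein-name recognizer and a curated keyword list.
PROTEINS = {"GerE", "sigK", "cotD"}
ACTIVE_VERBS = {"activates", "inhibits", "binds", "represses"}
PASSIVE_VERBS = {"activated", "inhibited", "bound", "repressed"}


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitting: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence: str) -> List[str]:
    """Naive tokenization: alphanumeric runs, allowing internal hyphens."""
    return re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", sentence)


def tag(tokens: List[str]) -> List[Tuple[str, str]]:
    """Label each token as PROTEIN, ACTIVE, PASSIVE, BY, or O (other)."""
    tagged = []
    for tok in tokens:
        low = tok.lower()
        if tok in PROTEINS:
            label = "PROTEIN"
        elif low in ACTIVE_VERBS:
            label = "ACTIVE"
        elif low in PASSIVE_VERBS:
            label = "PASSIVE"
        elif low == "by":
            label = "BY"
        else:
            label = "O"
        tagged.append((tok, label))
    return tagged


def extract_interactions(text: str) -> List[Tuple[str, str, str]]:
    """Return (actor, keyword, object) triples found by two toy patterns."""
    results = []
    for sentence in split_sentences(text):
        # Keep only tagged tokens, so the patterns ignore filler words.
        salient = [(tok, lab) for tok, lab in tag(tokenize(sentence)) if lab != "O"]
        # Pattern 1: PROTEIN ACTIVE PROTEIN, where the first protein is the actor.
        for i in range(len(salient) - 2):
            (a, la), (k, lk), (b, lb) = salient[i:i + 3]
            if (la, lk, lb) == ("PROTEIN", "ACTIVE", "PROTEIN"):
                results.append((a, k, b))
        # Pattern 2: PROTEIN PASSIVE by PROTEIN, where the actor follows "by".
        for i in range(len(salient) - 3):
            (a, la), (k, lk), (_, lby), (b, lb) = salient[i:i + 4]
            if (la, lk, lby, lb) == ("PROTEIN", "PASSIVE", "BY", "PROTEIN"):
                results.append((b, k, a))
    return results
```

For example, calling `extract_interactions("GerE inhibits sigK. cotD is activated by GerE.")` yields one triple per sentence, with `GerE` as the actor in both. Dropping untagged tokens keeps the patterns short but over-matches on long sentences, which is one reason a real system would add part-of-speech information and a shallow parse.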
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-97-1.pdf (not authorized for public access) | 732.95 kB | Adobe PDF | |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.
