Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/35957
Title: Mining Maximal Sequential Patterns in Protein Databases (蛋白質最大序列樣式探勘演算法)
Authors: Yu Ling (凌宇)
Advisor: 李瑞庭
Keyword: protein, maximal sequential pattern, suffix tree
Publication Year: 2005
Degree: Master's
Abstract: Sequential patterns in protein sequences are closely related to the functions those proteins perform, so systematically mining significant sequential patterns from protein sequence databases has become an important research topic.
In this thesis, we propose a suffix-tree-based algorithm to discover patterns in protein databases. Every stage of the algorithm uses the occurrence information maintained in the suffix tree: mining closed frequent substrings, combining them into maximal frequent sequential patterns, and adjusting the gaps between the substrings within a pattern. To keep the pattern set compact, the algorithm prunes unnecessary patterns and retains only maximal ones. The experimental results show that the algorithm finds not only the patterns recorded in the PROSITE database but also other patterns worth further biological study, such as longer patterns and classifier pattern sets. In the experiments, it also produces better results than Chang and Halgamuge's method.
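The distinction between frequent, closed, and maximal substrings that the abstract relies on can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: it swaps the suffix tree for brute-force substring enumeration, handles only contiguous substrings (no gaps), and assumes that support means the number of sequences containing a substring. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def frequent_substrings(sequences, min_support):
    """Return {substring: set of sequence indices containing it} for
    substrings found in at least `min_support` sequences.

    Assumption: support = number of distinct sequences containing the
    substring. Brute-force enumeration stands in for the suffix tree.
    """
    occurrences = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for start in range(len(seq)):
            for end in range(start + 1, len(seq) + 1):
                occurrences[seq[start:end]].add(idx)
    return {s: ids for s, ids in occurrences.items() if len(ids) >= min_support}


def closed_substrings(frequent):
    """Keep substrings that have no proper superstring with equal support."""
    closed = {}
    for s, ids in frequent.items():
        absorbed = any(
            s != t and s in t and len(ids) == len(ids_t)
            for t, ids_t in frequent.items()
        )
        if not absorbed:
            closed[s] = len(ids)
    return closed


def maximal_substrings(frequent):
    """Keep substrings that have no proper frequent superstring at all."""
    return sorted(
        s for s in frequent
        if not any(s != t and s in t for t in frequent)
    )


# Toy input, invented for illustration (not taken from PROSITE).
sequences = ["MKVLAG", "AKVLAT", "GKVLSS"]
frequent = frequent_substrings(sequences, min_support=2)
closed = closed_substrings(frequent)
maximal = maximal_substrings(frequent)
print(closed)
print(maximal)
```

On this toy input, KVL (support 3), KVLA (support 2), and G (support 2) are closed, while only KVLA and G are maximal: KVL is dropped because its frequent superstring KVLA exists, even though the two differ in support. Retaining only maximal patterns therefore gives a smaller result set than retaining closed ones, which matches the compactness goal the abstract describes.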
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/35957
Fulltext Rights: Paid license (有償授權)
Appears in Collections: Department of Information Management (資訊管理學系)

Files in This Item:
File: ntu-94-1.pdf (Restricted Access)
Size: 673.8 kB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
