Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/718
Title: Sequence Classification Based on k-mer Frequencies (以核苷酸k聚體頻度分類序列)
Authors: Hung-Yu Chen (陳泓宇)
Advisor: Kun-Mao Chao (趙坤茂)
Keywords: sequence classification, metagenomics, genomics, k-mer, alignment-free, sequence signature, algorithm
Publication Year: 2019
Degree: Master's (碩士)
Abstract: Sequence classification is a preliminary step in many computational biology studies, and a variety of methods have been proposed for it. However, with the development of high-throughput sequencing technologies, sequencing datasets have grown much larger. As a result, many existing methods can no longer complete the task within the available computational resources and an acceptable time. k-mer based algorithms are one such class of methods: many of them classify sequences quickly and accurately, but they require a large amount of memory, which is not available on common personal computers.
In this thesis, we propose a k-mer based algorithm. Its running time is comparable to that of existing methods, while its space usage is improved by avoiding the redundancy with which existing methods store k-mers. To further reduce memory usage, we propose a partitioning strategy. Besides reducing memory usage, the algorithm under this partitioning scheme can be highly parallelized to shorten the computation time.
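The abstract does not describe the thesis's data structure or its partitioning scheme, so the following is only a minimal sketch of the general idea, not the author's method. It assumes that sequences are compared by the cosine similarity of their k-mer count vectors and that k-mers are split into partitions by their base-4 code modulo the number of partitions. Both choices, and all function names (`encode`, `partial_counts`, `similarity`, `classify`), are hypothetical.

```python
from collections import Counter
from math import sqrt


def encode(kmer):
    """Return the base-4 integer code of a k-mer, or None if it has a non-ACGT symbol."""
    code = 0
    for base in kmer:
        idx = "ACGT".find(base)
        if idx < 0:
            return None
        code = code * 4 + idx
    return code


def partial_counts(seq, k, part, num_parts):
    """Count only the k-mers of seq whose code falls into partition `part`.

    Assumption: partition = code % num_parts (a stand-in for the thesis's scheme).
    """
    counts = Counter()
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        code = encode(seq[i:i + k])
        if code is not None and code % num_parts == part:
            counts[code] += 1
    return counts


def similarity(query, ref, k, num_parts=4):
    """Cosine similarity of k-mer count vectors, accumulated one partition at a time."""
    dot = qq = rr = 0
    # Each partition is independent of the others, so only one partition's
    # counts are held in memory at a time, and this loop could run in parallel.
    for part in range(num_parts):
        q = partial_counts(query, k, part, num_parts)
        r = partial_counts(ref, k, part, num_parts)
        dot += sum(c * r[code] for code, c in q.items())
        qq += sum(c * c for c in q.values())
        rr += sum(c * c for c in r.values())
    if qq == 0 or rr == 0:
        return 0.0
    return dot / sqrt(qq * rr)


def classify(query, references, k, num_parts=4):
    """Return the label of the reference sequence most similar to the query."""
    return max(
        references,
        key=lambda label: similarity(query, references[label], k, num_parts),
    )
```

Calling `classify(read, {"label_a": genome_a, "label_b": genome_b}, k)` returns the label with the highest similarity. The result does not depend on `num_parts`, because the partial sums add up to the same totals. This sketch rescans the sequences once per partition, trading time for memory. A practical tool would more likely write each partition to disk once or process the partitions concurrently.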
URI: http://tdr.lib.ntu.edu.tw/handle/123456789/718
DOI: 10.6342/NTU201902038
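Fulltext Rights: Authorization granted (open access worldwide)
Appears in Collections: Graduate Institute of Biomedical Electronics and Bioinformatics (生醫電子與資訊學研究所)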

Files in This Item:
File: ntu-108-1.pdf
Size: 1.99 MB
Format: Adobe PDF