Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99076
| Title: | 成對短基因序列回貼於全功能次世代定序資料分析加速系統之設計與實現 Design and Implementation of Paired-End Short-Read Mapping in End-to-End Acceleration System for Next-Generation Sequencing Data Analysis |
| Author: | 楊仲萱 Chung-Hsuan Yang |
| Advisor: | 楊家驤 Chia-Hsiang Yang |
| Keywords: | Next-generation sequencing (NGS), Paired-end short-read mapping, Application-specific integrated circuit (ASIC), Field-programmable gate array (FPGA) |
| Publication Year: | 2025 |
| Degree: | Doctoral |
| Abstract: | Next-generation sequencing (NGS) is now widely used as a fast and efficient sequencing method and continues to develop rapidly. This dissertation proposes the first end-to-end hardware-accelerated system for NGS data analysis, covering paired-end short-read mapping, haplotype calling, variant calling, and genotype calling. Through algorithm-architecture co-optimization, the corresponding architecture is implemented on both application-specific integrated circuit (ASIC) and field-programmable gate array (FPGA) platforms. For paired-end short-read mapping, the most time-consuming step, four optimization strategies implemented on an FPGA platform reduce total latency by up to 92.6% and achieve a throughput of 75.6M base pairs per second with a mapping accuracy of 99.3%, a 1.7-to-18.6× throughput improvement. With two FPGA boards, the complete NGS data analysis flow can be demonstrated on real hardware, and a customized, user-friendly graphical user interface (GUI) makes it easy for researchers to operate the system and to read and assess the results. The dissertation further proposes an additional method and corresponding architecture to speed up paired-end short-read mapping. The accelerator with the complete analysis pipeline is fabricated in TSMC 28 nm CMOS technology with an area of 16.14 mm². It consumes 2.73 W at 400 MHz from a 0.9 V supply. The PrecisionFDA human whole-genome dataset (50× NA12878), a benchmark recognized by the U.S. Food and Drug Administration, can be analyzed within 28.2 minutes with a precision of 99.79% and a sensitivity of 99.03%. With accuracy comparable to commercial software and existing hardware accelerators, the chip provides a 3-to-59× throughput improvement and a 935-to-4910× improvement in energy efficiency. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99076 |
| DOI: | 10.6342/NTU202304338 |
| Full-Text Authorization: | Not authorized |
| Electronic Full-Text Release Date: | N/A |
| Appears in Collections: | Graduate Institute of Electronics Engineering |
Files in this item:
| File | Size | Format |
|---|---|---|
| ntu-113-2.pdf (not authorized for public access) | 6.14 MB | Adobe PDF |
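Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.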
