Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99076

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 楊家驤 | zh_TW |
| dc.contributor.advisor | Chia-Hsiang Yang | en |
| dc.contributor.author | 楊仲萱 | zh_TW |
| dc.contributor.author | Chung-Hsuan Yang | en |
| dc.date.accessioned | 2025-08-21T16:17:39Z | - |
| dc.date.available | 2025-08-22 | - |
| dc.date.copyright | 2025-08-21 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-08-04 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99076 | - |
| dc.description.abstract | 現今,次世代基因定序技術已被廣泛運用並持續蓬勃發展中,是一快速且有效率的定序方法。於本論文中,設計並提出第一個全功能的次世代基因定序資料分析之硬體加速系統,包含成對短序列回貼、半倍體搜尋、變體識別以及基因分型識別,在演算法和硬體共同優化下,提出相對應架構。於特定應用積體電路 (ASIC) 與現場可程式化邏輯閘陣列 (FPGA) 平台皆可以成功實現。針對最耗時之成對短序列回貼步驟,在 FPGA 平台上實現,提出四個優化方法下,減少總運算時間高達 92.6%,達每秒 75.6M-bp 的通量和最高的 99.3% 回貼正確率,達到 1.7 到 18.6 倍的速度提升。若以兩張 FPGA 板實現全功能次世代基因定序資料分析,則可實機展示成果,亦有客製化之使用者友善 GUI,方便操作以及研究人員閱讀、評估。本論文亦進一步提出方法和架構對成對短序列回貼優化速度,並將完整分析流程,以台積電 28 奈米製程下線,面積為 16.14 平方毫米。消耗功率為 2.73 瓦,在電壓 0.9V 下且運算時脈為 400MHz。此晶片可以在 28.2 分鐘內就完成美國食品藥物管理局所公認標準的 PrecisionFDA 人類全基因測試資料分析(50× NA12878),精確度高達 99.79%,靈敏度高達 99.03%。達到和商用軟體與現今硬體方法相匹配的準確度下,提供 3 到 59 倍的通量提升。在能量效率下,也有 935 至 4910 倍的進步。 | zh_TW |
| dc.description.abstract | Next-generation sequencing (NGS) is now widely used as a fast and efficient sequencing method, and it continues to develop rapidly. This dissertation presents the first end-to-end hardware-accelerated system for NGS data analysis, covering paired-end short-read mapping, haplotype calling, variant calling, and genotype calling. Through algorithm-architecture co-optimization, the corresponding hardware is implemented on both ASIC and FPGA platforms. For paired-end short-read mapping, the most time-consuming step, four optimization strategies are implemented on an FPGA platform. Together they reduce total latency by 92.6% and achieve a throughput of 75.6M base pairs per second with a mapping accuracy of 99.3%, a 1.7-to-18.6× speedup. With two FPGA boards, the entire NGS data analysis flow can be demonstrated with real-time results. A customized, user-friendly graphical user interface (GUI) makes the system easy for researchers to operate and assess. Furthermore, an additional method and its corresponding hardware are proposed to further speed up paired-end short-read mapping. The NGS data analysis accelerator, which integrates this method with the complete analysis pipeline, is fabricated in TSMC 28-nm CMOS technology and occupies 16.14 mm². It consumes 2.73 W at 400 MHz from a 0.9-V supply. The chip analyzes the PrecisionFDA human whole-genome benchmark dataset (50× NA12878) in 28.2 minutes, with a precision of 99.79% and a sensitivity of 99.03%. Compared with commercial software and existing hardware accelerators, it achieves comparable accuracy while providing a 3-to-59× throughput improvement and 935-to-4910× higher energy efficiency. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-21T16:17:38Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-21T16:17:39Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements; 摘要 (Chinese Abstract); Abstract; Contents; List of Figures; List of Tables; Denotation; Chapter 1 Introduction; 1.1 Next-Generation Sequencing (NGS); 1.2 NGS Data Analysis; 1.3 Contribution of This Work; 1.4 Dissertation Structure; Chapter 2 Background and Algorithms of Short-Read Mapping; 2.1 Memory-Efficient FM-Index; 2.2 Workflow of Short-Read Mapping; 2.2.1 Exact Matching with FM-index; 2.2.2 Inexact Matching for Similarity Assessment; 2.3 Paired-End Sequencing and Mapping; Chapter 3 Algorithm-architecture Optimization; 3.1 OCC-BWT Interleaved Data; 3.2 LF-mapping Enhanced with LUT; 3.3 Repetitive Candidates Disposal; 3.4 Midway Process Termination; 3.5 Quick Similarity Score Calculation; Chapter 4 System Architecture of Accelerators; 4.1 FPGA Implementation; 4.1.1 Read Splitter and LF-Mapper; 4.1.2 OCC Restorer; 4.1.3 SA Retriever and Candidates Filter; 4.1.4 Dynamic Programming (DP) Array; 4.2 ASIC Realization; 4.2.1 Quick Similarity Score Calculator; 4.2.2 Candidates Pairer and Re-Aligner; Chapter 5 Experimental Verification; 5.1 Performance Comparison; 5.2 System Demonstration; 5.3 Chip Implementation; Chapter 6 Conclusion; References | - |
| dc.language.iso | en | - |
| dc.subject | 成對短序列回貼 | zh_TW |
| dc.subject | 現場可程式化邏輯閘陣列 | zh_TW |
| dc.subject | 特定應用積體電路 | zh_TW |
| dc.subject | 次世代基因定序 | zh_TW |
| dc.subject | Paired-end short-read mapping | en |
| dc.subject | Next-generation sequencing (NGS) | en |
| dc.subject | Field programmable gate array (FPGA) | en |
| dc.subject | Application specific integrated circuit (ASIC) | en |
| dc.title | 成對短基因序列回貼於全功能次世代定序資料分析加速系統之設計與實現 | zh_TW |
| dc.title | Design and Implementation of Paired-End Short-Read Mapping in End-to-End Acceleration System for Next-Generation Sequencing Data Analysis | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | Doctoral | - |
| dc.contributor.oralexamcommittee | 洪瑞鴻;張錫嘉;闕志達;蔡佩芸 | zh_TW |
| dc.contributor.oralexamcommittee | Jui-Hung Hung;Hsie-Chia Chang;Tzi-Dar Chiueh;Pei-Yun Tsai | en |
| dc.subject.keyword | 次世代基因定序,成對短序列回貼,特定應用積體電路,現場可程式化邏輯閘陣列 | zh_TW |
| dc.subject.keyword | Next-generation sequencing (NGS), Paired-end short-read mapping, Application specific integrated circuit (ASIC), Field programmable gate array (FPGA) | en |
| dc.relation.page | 54 | - |
| dc.identifier.doi | 10.6342/NTU202304338 | - |
| dc.rights.note | Not authorized | - |
| dc.date.accepted | 2025-08-06 | - |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
| dc.contributor.author-dept | Graduate Institute of Electronics Engineering | - |
| dc.date.embargo-lift | N/A | - |
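| Appears in Collections: | Graduate Institute of Electronics Engineering | |

Files in This Item:

| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf (not authorized for public access) | 6.14 MB | Adobe PDF | |

Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.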
