請用此 Handle URI 來引用此文件:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78792
完整後設資料紀錄
DC 欄位 | 值 | 語言 |
---|---|---|
dc.contributor.advisor | 盧奕璋(Yi-Chang Lu) | |
dc.contributor.author | Mao-Jan Lin | en |
dc.contributor.author | 林茂然 | zh_TW |
dc.date.accessioned | 2021-07-11T15:19:41Z | - |
dc.date.available | 2021-06-24 | |
dc.date.copyright | 2019-06-24 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-06-19 | |
dc.identifier.citation | [1] Saul B Needleman and Christian D Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of molecular
biology, 48(3):443–453, 1970. [2] Temple F Smith and Michael S Waterman. Identification of common molecular subsequences. Journal of molecular biology, 147(1):195–197, 1981. [3] Richard Bellman. The theory of dynamic programming. Bulletin of the American Mathematical Society, 60(6):503–515, 1954. [4] Stephen F Altschul, Thomas L Madden, Alejandro A Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J Lipman. Gapped blast and psi-blast: a new generation of protein database search programs. Nucleic acids research, 25(17):3389–3402, 1997. [5] William R Pearson. [5] rapid and sensitive sequence comparison with fastp and fasta. 1990. [6] Heng Li and Richard Durbin. Fast and accurate short read alignment with burrows–wheeler transform. bioinformatics, 25(14):1754–1760, 2009. [7] Adam R Galper and Douglas L Brutlag. Parallel similarity search and alignment with the dynamic programming method. Knowledge Systems Laboratory, Medical Computer Science, Stanford University, 1990. [8] Mengyao Zhao, Wan-Ping Lee, Erik P Garrison, and Gabor T Marth. Ssw library: an simd smith-waterman c/c++ library for use in genomic applications. PloS one, 8(12):e82138, 2013. [9] Richard J Lipton and Daniel Lopresti. A systolic array for rapid string comparison. In Proceedings of the Chapel Hill Conference on VLSI, pages 363–376. Chapel Hill NC, 1985. [10] Elizabeth W Edmiston, Nolan G Core, Joel H Saltz, and Roger M Smith. Parallel processing of biological sequence comparison algorithms. International Journal of Parallel Programming, 17(3):259–275, 1988. [11] Manjit Borah, Raminder Singh Bajwa, Sridhar Hannenhalli, and Mary Jane Irwin. A simd solution to the sequence comparison problem on the mgap. In Proceedings of IEEE International Conference on Application Specific Array Processors (ASSAP’94), pages 336–345. IEEE, 1994. [12] E Chow, T Hunkapiller, J Peterson, and MS Waterman. Biological information signal processor. In Proceedings of the International Conference on Application Specific Array Processors, pages 144–160. IEEE, 1991. [13] Zubair Nawaz, Todor Stefanov, and Koen Bertels. Efficient hardware generation for dynamic programming problems. In 2009 International Conference on Field-Programmable Technology, pages 348–352. IEEE, 2009. [14] Daniel P Lopresti. Rapid implementation of a genetic sequence comparator using field-programmable logic arrays. Advanced Research in VLSI 1991: Santa Cruz, pages 139–152, 1991. [15] Dzung T Hoang. Searching genetic databases on splash 2. In [1993] Proceedings IEEE Workshop on FPGAs for Custom Computing Machines, pages 185–191. IEEE, 1993. [16] Steven A Guccione and Eric Keller. Gene matching using jbits. In International Conference on Field Programmable Logic and Applications, pages 1168–1171. Springer, 2002. [17] Chi Wai Yu, KH Kwong, Kin-Hong Lee, and Philip Heng Wai Leong. A smithwaterman systolic cell. In International Conference on Field Programmable Logic and Applications, pages 375–384. Springer, 2003. [18] H-M Bluthgen and Tobias G Noll. A programmable processor for approximate string matching with high throughput rate. In Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors, pages 309–316. IEEE, 2000. [19] Tom Van Court and Martin C Herbordt. Families of fpga-based accelerators for approximate string matching. Microprocessors and Microsystems, 31(2):135–145, 2007. [20] Reed A Cartwright. Logarithmic gap costs decrease alignment accuracy. BMC bioinformatics, 7(1):527, 2006. [21] Zubair Nawaz, Muhammad Nadeem, Hans van Someren, and Koen Bertels. A parallel fpga design of the smith-waterman traceback. In 2010 International Conference on Field-Programmable Technology, pages 454–459. IEEE, 2010. [22] Nuno Sebastiao, Tiago Dias, Nuno Roma, and Paulo Flores. Integrated accelerator architecture for dna sequences alignment with enhanced traceback phase. In 2010 International Conference on High Performance Computing & Simulation, pages 16–23. IEEE, 2010. [23] Nuno Sebastião, Nuno Roma, and Paulo Flores. Configurable and scalable class of high performance hardware accelerators for simultaneous dna sequence alignment. Concurrency and Computation: Practice and Experience, 25(10):1319–1339, 2013. [24] Dzung T Hoang and Daniel P Lopresti. Fpga implementation of systolic sequence alignment. In International Workshop on Field Programmable Logic and Applications, pages 183–191. Springer, 1992. [25] Guilherme L Moritz, Cristiano Jory, Heitor S Lopes, and Carlos R Erig Lima. Implementation of a parallel algorithm for protein pairwise alignment using reconfigurable computing. In 2006 IEEE International Conference on Reconfigurable Computing and FPGA’s (ReConFig 2006), pages 1–7. IEEE, 2006. [26] Yoshiki Yamaguchi, Yosuke Miyajima, Tsutomu Maruyama, and Akihiko Konagaya. High speed homology search using run-time reconfiguration. In International Conference on Field Programmable Logic and Applications, pages 281–291. Springer, 2002. [27] Juan M Marmolejo-Tejada, Vladimir Trujillo-Olaya, Claudia Patricia Rentería-Mejía, and Jaime Velasco-Medina. Hardware implementation of the smith-waterman algorithm using a systolic architecture. In 2014 IEEE 5th Latin American Symposium on Circuits and Systems, pages 1–4. IEEE, 2014. [28] Scott Lloyd and Quinn O Snell. Hardware accelerated sequence alignment with traceback. International Journal of Reconfigurable Computing, 2009:9, 2009. [29] MO Dayhoff, RM Schwartz, and BC Orcutt. 22 a model of evolutionary change in proteins. In Atlas of protein sequence and structure, volume 5, pages 345–352. National Biomedical Research Foundation Silver Spring, 1978. [30] Steven Henikoff and Jorja G Henikoff. Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences, 89(22):10915–10919, 1992. [31] Shannon Steinfadt. Fine-grained parallel implementations for swamp+ smith–waterman alignment. Parallel Computing, 39(12):819–833, 2013. [32] Enzo Rucci, Carlos Garcia, Guillermo Botella, Armando De Giusti, Marcelo Naiouf, and Manuel Prieto-Matias. Swifold: Smith-waterman implementation on fpga with opencl for long dna sequences. BMC systems biology, 12(5):96, 2018. [33] Kazutaka Katoh and Daron M Standley. Mafft multiple sequence alignment software version 7: improvements in performance and usability. Molecular biology and evolution, 30(4):772–780, 2013. [34] 1000 Genome Project Data Processing Subgroup, Alec Wysoker, Bob Handsaker, Gabor Marth, Goncalo Abecasis, Heng Li, Jue Ruan, Nils Homer, Richard Durbin, and Tim Fennell. The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16):2078–2079, 06 2009. [35] Sitao Huang, Gowthami Jayashri Manikandan, Anand Ramachandran, Kyle Rupnow, Wen-mei W Hwu, and Deming Chen. Hardware acceleration of the pair-hmm algorithm for dna variant calling. In Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pages 275–284. ACM, 2017. [36] Julie D Thompson, Patrice Koehl, Raymond Ripp, and Olivier Poch. Balibase 3.0: latest developments of the multiple sequence alignment benchmark. Proteins: Structure, Function, and Bioinformatics, 61(1):127–136, 2005. [37] Chuong B Do, Mahathi SP Mahabhashyam, Michael Brudno, and Serafim Batzoglou. Probcons: Probabilistic consistency-based multiple sequence alignment. Genome research, 15(2):330–340, 2005. [38] Jimin Pei and Nick V Grishin. Mummals: multiple sequence alignment improved by using hidden markov models with local structural information. Nucleic acids research, 34(16):4364–4374, 2006. [39] Orla O’Sullivan, Karsten Suhre, Chantal Abergel, Desmond G Higgins, and Cedric Notredame. 3dcoffee: combining protein sequences and structures within multiple sequence alignments. Journal of molecular biology, 340(2):385–395, 2004. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78792 | - |
dc.description.abstract | 序列排序(sequence alignment)是生物資訊學中一個非常重要的問題,而基於動態規劃的序列映射器可以保證其排序結果為數學上的最佳解,因此有其應用上的不可取代性。然而動態規劃的時間複雜度使相關演算法在面對大量生物資訊時會耗費相當多的時間。過去有許多研究針對動態規劃設計硬體平行加速映射器,但由於記憶體上的限制,通常只提供排序最高分的結果,較少研究強調映射器的回溯(traceback)功能。
本論文提出一專用積體電路(Application-Specific Integrated Circuits, ASIC)設計之硬體加速器,此設計具備仿射間隙評分法(Affine Gap Penalty)的回溯功能。本論文提出兩個方向可以減少序列回溯時的記憶體使用量,並能幫助映射器更好的管線化(pipeline)時序。 本加速器使用台積電40奈米製程實作一個能夠排序兩條最長1024 x 992序列的動態規劃映射器,映射器可以輸入使用不同的蛋白質計分表。電路運作頻率為357 MHz,晶片大小$6.879 mm^2$,功率438.5 mW。在硬體模擬實驗中,我們使用硬體映射器加速MAFFT G-INS-1演算法中最耗費時間的配對排序(pairwise alignment)步驟,包含動態規劃與回溯所花費的時間比起軟體約加速570倍。 | zh_TW |
dc.description.abstract | Sequence alignment plays an important role in bioinformatics. Since the alignment algorithms based on dynamic programming (DP) guarantee the optimal alignment results corresponding to the scoring system used, DP alignment algorithms have been popular since their invention. However, due to the time complexity of DP, the related algorithms consume a lot of time with rapidly increasing sizes of bioinformatics data. The situation induced many hardware parallelization design on DP aligner. However, due to the memory size limitation, few researches focus on the traceback function of DP sequence alignment algorithms.
In this thesis, we propose an application-specific integrated circuits (ASIC) design on DP protein sequence aligner with affine gap traceback function. The author proposes two methods that can reduce the memory usage of traceback function and improve the pipeline timing of the design. The hardware is implemented with TSMC 40 nm technology. The maximum input sequence size of the aligner is $1024 imes 992$. The aligner can take different scoring table according to the input. The circuit operates at $357 MHz$, sized $6.879 mm^2$ with power $438.5 mW$. In post-layout simulation, we test our aligner design on the pairwise alignment step of G-INS-1 algorithm in MAFFT. The whole process including sequence alignment and traceback can be accelerated more than $570$ times comparing to software. | en |
dc.description.provenance | Made available in DSpace on 2021-07-11T15:19:41Z (GMT). No. of bitstreams: 1 ntu-108-R05943039-1.pdf: 8131871 bytes, checksum: e03265e79c9dfa0421f38eea2b9a5ca1 (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | 誌謝. . . . . . . . . . . . . . . . . . . . . iii
摘要. . . . . . . . . . . . . . . . . . . . . v Abstract. . . . . . . . . . . . . . . . . . . vii 1 緒論. . . . . . . . . . . . . . . . . . . . 1 1.1 論文架構. . . . . . . . . . . . . . . . . 3 2 相關研究. . . . . . . . . . . . . . . . . . 5 2.1 動態規畫之硬體加速設計. . . . . . . . . . 5 2.2 具備回溯功能之動態規畫硬體加速設計. . . . 8 3 演算法. . . . . . . . . . . . . . . . . . 11 3.1 演算法概述. . . . . . . . . . . . . . . . 11 3.2 計算排序之最佳分數. . . . . . . . . . . . 11 3.3 序列回溯. . . . . . . . . . . . . . . . . 17 3.3.1 記錄線性評分法之序列回溯. . . . . . . . 18 3.3.2 記錄仿射間隙評分法之序列回溯. . . . . . 18 3.4 記錄仿設間隙評分法回溯方向之演算法改動. . 21 4 硬體設計. . . . . . . . . . . . . . . . . . 27 4.1 硬體設計概述. . . . . . . . . . . . . . . 27 4.2 硬體架構設計. . . . . . . . . . . . . . . 27 4.3 輸入輸出之編碼. . . . . . . . . . . . . . 29 4.4 記錄檢索序列與參考序列之記憶體. . . . . . 31 4.5 計算排序最佳分數模組硬體設計. . . . . . . 33 4.6 記錄回溯方向之SRAM 配置. . . . . . . . . .36 4.7 回溯. . . . . . . . . . . . . . . . . . . 41 4.8 硬體實驗結果. . . . . . . . . . . . . . . 51 5 結論與展望. . . . . . . . . . . . . . . . . 55 5.1 結論. . . . . . . . . . . . . . . . . . . 55 5.2 未來展望. . . . . . . . . . . . . . . . . 55 Bibliography. . . . . . . . . . . . . . . . . 57 | |
dc.language.iso | zh-TW | |
dc.title | 具備仿射間隙評分法回溯功能之序列映射器平行架構設計 | zh_TW |
dc.title | A Parallel Design of Dynamic Programming Sequence Aligner with Affine Gap Traceback | en |
dc.type | Thesis | |
dc.date.schoolyear | 107-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 劉宗德(Tsung-Te Liu),陳和麟(Ho-Lin Chen),曾宇鳳(Yu-Feng Tseng) | |
dc.subject.keyword | 動態規劃序列映射器,回溯,硬體設計, | zh_TW |
dc.subject.keyword | dynamic programming,sequence aligner,traceback,hardware, | en |
dc.relation.page | 61 | |
dc.identifier.doi | 10.6342/NTU201900954 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2019-06-20 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電子工程學研究所 | zh_TW |
顯示於系所單位: | 電子工程學研究所 |
文件中的檔案:
檔案 | 大小 | 格式 | |
---|---|---|---|
ntu-108-R05943039-1.pdf 目前未授權公開取用 | 7.94 MB | Adobe PDF |
系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。