Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 電機工程學系
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/5195
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor王勝德(Sheng-De Wang)
dc.contributor.authorChien-Chi Chenen
dc.contributor.author陳建吉zh_TW
dc.date.accessioned2021-05-15T17:53:20Z-
dc.date.available2015-08-08
dc.date.available2021-05-15T17:53:20Z-
dc.date.copyright2014-08-08
dc.date.issued2014
dc.date.submitted2014-08-04
dc.identifier.citation[1] Alfred V. Aho and Margaret J. Corasick. Efficient string matching: an aid to bibliographic search. Commun. ACM, 18(6):333–340, June 1975.
[2] N. Tuck, T. Sherwood, B. Calder, and G. Varghese. Deterministic memory-efficient string matching algorithms for intrusion detection. In INFOCOM 2004. Twentythird AnnualJoint Conference of the IEEE Computer and Communications Societies, volume 4, pages 2628–2639 vol.4, 2004.
[3] Xinyan Zha and S. Sahni. Highly compressed Aho-Corasick automata for efficient intrusion detection. In Computers and Communications, 2008. ISCC 2008. IEEE Symposium on, pages 298–303, 2008.
[4] Mansoor Alicherry, Muthusrinivasan Muthuprasanna, and Vijay Kumar. High speed pattern matching for network IDS/IPS. In Network Protocols, 2006. ICNP’06. Proceedings of the 2006 14th IEEE International Conference on, pages 187–196. IEEE, 2006.
[5] Gerald Tripp. A parallel “string matching engine”for use in high speed network intrusion detection systems. Journal in Computer Virology, 2(1):21–34, 2006.
[6] Derek Pao and Xing Wang. Multi-stride string searching for high-speed content inspection. The Computer Journal, 55(10):1216–1231, 2012.
[7] Vahid Rahmanzadeh and MohammadBagher Ghaznavi-Ghoushchi. A multi-Gb/s parallel string matching engine for intrusion detection systems. In Advances in Computer Science and Engineering, volume 6 of Communications in Computer and Information Science, pages 847–851. Springer Berlin Heidelberg, 2009.
[8] Burton H Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422–426, 1970.
[9] Sarang Dharmapurikar, Praveen Krishnamurthy, Todd Sproull, and John Lockwood. Deep packet inspection using parallel bloom filters. In High Performance Interconnects, 2003. Proceedings. 11th Symposium on, pages 44–51. IEEE, 2003.
[10] Wei Lin and Bin Liu. Pipelined parallel AC-based approach for multi-string matching. In Parallel and Distributed Systems, 2008. ICPADS’08. 14th IEEE International Conference on, pages 665–672. IEEE, 2008.
[11] Derek Pao, Wei Lin, and Bin Liu. A memory-efficient pipelined implementation of the Aho-Corasick string-matching algorithm. ACM Trans. Archit. Code Optim., 7(2):10:1–10:27, October 2010.
[12] Nan Hua, Haoyu Song, and T. V. Lakshman. Variable-stride multi-pattern matching for scalable deep packet inspection. In INFOCOM 2009, IEEE, pages 415–423, 2009.
[13] V. Dimopoulos, I. Papaefstathiou, and D. Pnevmatikatos. A memory-efficient reconfigurable Aho-Corasick FSM implementation for intrusion detection systems. In Embedded Computer Systems: Architectures, Modeling and Simulation, 2007. ICSAMOS 2007. International Conference on, pages 186–193, 2007.
[14] YE Yang, Viktor K Prasanna, and Chenqian Jiang. Head-body partitioned string matching for deep packet inspection with scalable and attack-resilient performance. In Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on, pages 1–11. IEEE, 2010.
[15] Michela Becchi and Patrick Crowley. A hybrid finite automaton for practical deep packet inspection. In Proceedings of the 2007 ACM CoNEXT conference, CoNEXT ’07, pages 1:1–1:12. ACM, 2007.
[16] Yutaka Sugawara, Mary Inaba, and Kei Hiraki. Over 10Gbps string matching mechanism for multi-stream packet scanning systems. In Field Programmable Logic and Application, volume 3203 of Lecture Notes in Computer Science, pages 484–493. Springer Berlin Heidelberg, 2004.
[17] N. Yamagaki, R. Sidhu, and S. Kamiya. High-speed regular expression matching engine using multi-character NFA. In Field Programmable Logic and Applications, 2008. FPL 2008. International Conference on, pages 131–136, 2008.
[18] T. Katashita, A. Maeda, K. Toda, and Y. Yamaguchi. A method of generating highly efficient string matching circuit for intrusion detection. In Field Programmable Logic and Applications, 2006. FPL ’06. International Conference on, pages 1–4, 2006.
[19] Daniele Paolo Scarpazza, Oreste Villa, and Fabrizio Petrini. Exact multi-pattern string matching on the Cell/B.E. processor. In Proceedings of the 5th conference on Computing frontiers, CF ’08, pages 33–42. ACM, 2008.
[20] Leena Salmela, Jorma Tarhio, and Jari Kytöjoki. Multipattern string matching with q-grams. Journal of Experimental Algorithmics (JEA), 11:1–1, 2007.
[21] Daniele Paolo Scarpazza, Oreste Villa, and Fabrizio Petrini. High-speed string searching against large dictionaries on the Cell/BE processor. In Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on, pages 1–12. IEEE, 2008.
[22] Y-HE Yang and Viktor K Prasanna. Robust and scalable string pattern matching for deep packet inspection on multicore processors. Parallel and Distributed Systems, IEEE Transactions on, 24(11):2283–2292, 2013.
[23] Oreste Villa, Daniel Chavarria-Miranda, and Kristyn Maschhoff. Input-independent, scalable and fast string matching on the Cray XMT. In Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on, pages 1–12. IEEE, 2009.
[24] Antonino Tumeo, Oreste Villa, and Donatella Sciuto. Efficient pattern matching on GPUs for intrusion detection systems. In Proceedings of the 7th ACM international conference on Computing frontiers, pages 87–88. ACM, 2010.
[25] Antonino Tumeo, Oreste Villa, and Daniel G Chavarría-Miranda. Aho-Corasick string matching on shared and distributed-memory parallel architectures. Parallel and Distributed Systems, IEEE Transactions on, 23(3):436–443, 2012.
[26] D Herath, C Lakmali, and R Ragel. Accelerating string matching for bio-computing applications on multi-core CPUs. In Industrial and Information Systems (ICIIS), 2012 7th IEEE International Conference on, pages 1–6. IEEE, 2012.
[27] Jennifer Stephenson and Paul Metzgen. Logic Optimization Techniques for Multiplexers. Altera Literature, 2004.
[28] Yi-Hua E Yang, Weirong Jiang, and Viktor K Prasanna. Compact architecture for high-throughput regular expression matching on FPGA. In Proceedings of the 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, pages 30–39. ACM, 2008.
[29] C.R. Clark and D.E. Schimmel. Scalable pattern matching for high speed networks. In Field-Programmable Custom Computing Machines, 2004. FCCM 2004. 12th Annual IEEE Symposium on, pages 249–257, 2004.
[30] Reetinder Sidhu and Viktor K Prasanna. Fast regular expression matching using FPGAs. In Field-Programmable Custom Computing Machines, 2001. FCCM’01. The 9th Annual IEEE Symposium on, pages 227–238. IEEE, 2001.
[31] Snort. http://www.test.org/doe/.
[32] Benfano Soewito. Packet inspection on programmable hardware. Computer Engineering and Intelligent Systems, 4(2):57–68, 2013.
[33] Kenneth J Schultz. Content-addressable memory core cells a survey. INTEGRATION, the VLSI journal, 23(2):171–188, 1997.
[34] Kostas Pagiamtzis and Ali Sheikholeslami. Content-addressable memory (cam) circuits and architectures: A tutorial and survey. Solid-State Circuits, IEEE Journal of, 41(3):712–727, 2006.
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/5195-
dc.description.abstract由 Aho 和 Corasick 所提出的演算法 (簡稱 AC 演算法) 可以很有效率地在一段文字中搜尋多個關鍵字所在的位置,因而被廣泛地使用於完全字串比對。然而,AC 演算法在運作時,只能一次處理一個字元,因而其效能受限於運作時脈。本論文依據 AC 演算法,提出新穎的具有多字元狀態轉移之字串比對架構,可以平行檢視多個字元,藉以加倍提升字串比對的效能。首先,說明實作 AC 演算法的三種有限自動機(finite automaton,簡稱 FA)的作法,包括確定性有限自動機 (Deterministic Finite Automaton,簡稱 DFA)、非確定有限自動機 (Nondeterministic Finite Automaton,簡稱 NFA)、以及複合有限自動機 (Hybrid Finite Automaton,簡稱 hybrid FA) 的作法。接著,提出一種推導演算法,藉以將前述 FA 推導為多字元的 FA,使其能夠平行檢視多個字元。同時,因為平行檢視多個字元所引起的對齊問題 (alignment problem),也藉由輔助轉移函式 (assistant transition) 來解決。所推導的多字元 FA 使用可程式元件來評估,所得到的評估結果顯示其能有效地加倍提升字串比對的效能。其中所得到的最佳結果為 16字元 AC-NFA 的實作,其可運作於 167.36MHz 的時脈,換算得到的字串比對效能為 21.4Gbps。zh_TW
dc.description.abstractThe algorithm of Aho and Corasick (AC-algorithm) can locate multiple keywords in a text efficiently; and thus it is widely used in the exact string matching. However, the AC-algorithm processes the text character by character with a performance limited by the processing clock. This thesis proposes novel string-matching architectures with multiple-character state transitions based on the AC-algorithm, which can inspect multiple characters in parallel to multiply the throughput. At first, three finite automaton (FA) approaches including Deterministic Finite Automaton (DFA), Nondeterministic Finite Automaton (NFA), and hybrid FA approaches are presented to implement the AC-algorithm. Subsequently, a derivation algorithm is proposed to derive these FAs to multi-character FAs to allow for the inspection of multiple characters in parallel. Additionally, the alignment problem, which occurs while multiple characters are being inspected in parallel, is solved by using assistant transitions. The derived multi-character FAs are evaluated on programmable devices and the evaluation results indicate that the throughput can be multiplied effectively. The achievable throughput of the best result is 21.4Gbps obtained by a 16-character AC-NFA implementation operating at a 167.36MHz clock.en
dc.description.provenanceMade available in DSpace on 2021-05-15T17:53:20Z (GMT). No. of bitstreams: 1
ntu-103-D93921012-1.pdf: 2083842 bytes, checksum: 8b5b9aadaee79c185aadfb9529b2a7f8 (MD5)
Previous issue date: 2014
en
dc.description.tableofcontents致謝 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i
中文摘要 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Exact String Matching and Motivations . . . . . . . . . . . . . . 1
1.2 Approach Overview . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . 4
2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Optimization in Hardware . . . . . . . . . . . . . . . . . . . . 5
2.2 Multi-Character Hardware Approaches . . . . . . . . . . . . . . . 6
2.3 Software Approaches . . . . . . . . . . . . . . . . . . . . . . . 7
3 Aho-Corasick Algorithm and Implementations. . . . . . . . . . . . . 9
3.1 AC-trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 AC-DFA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3 AC-NFA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4 Hybrid AC-FA . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.5 Priority Multiplexer . . . . . . . . . . . . . . . . . . . . . . 21
4 Multi-Character State Transitions . . . . . . . . . . . . . . . . . 23
4.1 Alignment Problem and Previous Work . . . . . . . . . . . . . . . 23
4.2 Derivation of Multi-Character Transitions . . . . . . . . . . . . 25
4.3 Derivation Algorithm . . . . . . . . . . . . . . . . . . . . . . 29
4.4 Comparison of Derivation Approaches . . . . . . . . . . . . . . . 31
4.4.1 The proposed approach . . . . . . . . . . . . . . . . . . . . . 31
4.4.2 Yamagaki et al. approach . . . . . . . . . . . . . . . . . . . 31
4.4.3 Yang et al. approach . . . . . . . . . . . . . . . . . . . . . 33
4.4.4 Summary of comparison . . . . . . . . . . . . . . . . . . . . . 34
5 Multi-Character String-Matching Approaches . . . . . . . . . . . . 37
5.1 Multi-Character AC-DFA . . . . . . . . . . . . . . . . . . . . . 38
5.1.1 Implementation . . . . . . . . . . . . . . . . . . . . . . . . 39
5.1.2 Matching example . . . . . . . . . . . . . . . . . . . . . . . 44
5.1.3 Experiment and evaluation . . . . . . . . . . . . . . . . . . . 44
5.2 Multi-Character AC-NFA . . . . . . . . . . . . . . . . . . . . . 48
5.2.1 Implementation . . . . . . . . . . . . . . . . . . . . . . . . 49
5.2.2 Matching example . . . . . . . . . . . . . . . . . . . . . . . 51
5.2.3 Experiment and evaluation . . . . . . . . . . . . . . . . . . . 52
5.3 Multi-Character Hybrid AC-FA . . . . . . . . . . . . . . . . . . 56
5.3.1 Implementation . . . . . . . . . . . . . . . . . . . . . . . . 56
5.3.2 Matching example . . . . . . . . . . . . . . . . . . . . . . . 59
5.3.3 Estimation of required stages . . . . . . . . . . . . . . . . . 60
5.3.4 Experiment and evaluation . . . . . . . . . . . . . . . . . . . 62
6 Configurable Architectures . . . . . . . . . . . . . . . . . . . . 63
6.1 Configurable stage architecture . . . . . . . . . . . . . . . . . 63
6.1.1 Block diagram . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.1.2 Experiment and evaluation . . . . . . . . . . . . . . . . . . . 65
6.2 Configurable data-width architecture . . . . . . . . . . . . . . 68
6.2.1 Basic rule unit . . . . . . . . . . . . . . . . . . . . . . . . 68
6.2.2 Example of three-unit group . . . . . . . . . . . . . . . . . . 71
6.2.3 Waveform of three-unit group . . . . . . . . . . . . . . . . . 72
6.2.4 Experiment and evaluation . . . . . . . . . . . . . . . . . . . 73
7 Comparison of approaches . . . . . . . . . . . . . . . . . . . . . 75
7.1 Comparing Results . . . . . . . . . . . . . . . . . . . . . . . . 75
7.2 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
dc.language.isoen
dc.subject入侵偵測系統zh_TW
dc.subject字串比對zh_TW
dc.subject確定性及非確定有限自動機zh_TW
dc.subjectString-matchingen
dc.subjectdeterministic and nondeterministic finite automatonen
dc.subjectintrusion detection systemen
dc.title具多字元狀態轉移之高效字串比對引擎zh_TW
dc.titleEfficient String Matching Engines with Multiple-Character State Transitionsen
dc.typeThesis
dc.date.schoolyear102-2
dc.description.degree博士
dc.contributor.oralexamcommittee雷欽隆(Chin-Laung Lei),顏嗣鈞(Hsu-Chun Yen),陳銘憲(Ming-Syan Chen),曾俊元(C. Henry Tseng)
dc.subject.keyword字串比對,確定性及非確定有限自動機,入侵偵測系統,zh_TW
dc.subject.keywordString-matching,deterministic and nondeterministic finite automaton,intrusion detection system,en
dc.relation.page84
dc.rights.note同意授權(全球公開)
dc.date.accepted2014-08-04
dc.contributor.author-college電機資訊學院zh_TW
dc.contributor.author-dept電機工程學研究所zh_TW
顯示於系所單位:電機工程學系

文件中的檔案:
檔案 大小格式 
ntu-103-1.pdf2.04 MBAdobe PDF檢視/開啟
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved