Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41617

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 王勝德(Sheng-De Wang) | |
| dc.contributor.author | Chia-Hung Chen | en |
| dc.contributor.author | 陳家宏 | zh_TW |
| dc.date.accessioned | 2021-06-15T00:24:57Z | - |
| dc.date.available | 2010-02-03 | |
| dc.date.copyright | 2009-02-03 | |
| dc.date.issued | 2009 | |
| dc.date.submitted | 2009-01-22 | |
| dc.identifier.citation | Bibliography
[1] D. Veillard. Libxml2 project web page, http://xmlsoft.org/. [2] Datapower, http://www.datapower.com/. [3] Document Object Model, http://www.w3.org/dom/. [4] Simple API for XML, http://www.saxproject.org/. [5] UTF-8 standard: RFC 3629, http://tools.ietf.org/html/rfc3629. [6] XML Path Language Version 1.0, http://www.w3.org/tr/xpath. [7] M. J. L. N. Abu-Ghazaleh. Differential Deserialization for Optimized SOAP Performance. In SC05: High performance computing, networking, and storage conference, Nov. 2005. [8] J. B. B. C. C. L. Jan van Lunteren, Ton Engbersen. XML Accelerator Engine. In The First International Workshop on High Performance XML Processing, 2004. [9] W. L. A. S. K. Chiu, T. Devadithya. A Binary XML for Scientific Applications. In International Conference on e-Science and Grid Computing, 2005. [10] W. L. Kenneth Chiu. A compiler-based approach to schema-specific XML parsing. In The First International Workshop on High Performance XML Processing, 2004. [11] W. L. Markus L. Noga, Steffen Schott. Lazy XML processing. In Proceedings of the 2002 ACM symposium on Document engineering, 2002. [12] J. J. Matthias Nicola. XML Parsing: A Threat to Database Performance. In Conf. Information and Knowledge Management (CIKM 03), 2003. [13] H. Sutter. The free lunch is over: A fundamental turn toward concurrency in software. Dr. Dobb's Journal, page 30, 2005. [14] R. A. van Engelen. Constructing Finite State Automata for High-Performance XML Web Services. In Proceedings of the International Symposium on Web Services (ISWS), 2004. [15] Y. P. Wei Lu, Kenneth Chiu. A Parallel Approach to XML Parsing. In The 7th IEEE/ACM International Conference on Grid Computing, 2006. [16] Y. Z. K. C. Y. Pan, W. Lu. A Static Load-Balancing Scheme for Parallel XML Parsing on Multicore CPUs. In 7th IEEE International Symposium on Cluster Computing and the Grid, 2007. [17] J. Zhang. Process XML on a chip. In CommsDesign Magazine, 2005. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41617 | - |
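| dc.description.abstract | As of 2008, Extensible Markup Language (XML) is in wide use around the world, owing to its extensibility, as the most common document exchange standard between different computer software. The computer industry predicts that XML will be adopted even more widely over the coming decades. However, precisely because XML is highly readable and clear to the human eye, parsing it by computer consumes a large amount of time and memory.
Virtual Token Descriptor (VTD) is a new method of processing XML. Its most distinctive feature is that it is non-extractive: it does not pull data out of the file to build its own data structures. VTD records only part of the meta-information in the document, such as the positions of certain tags in the file. Its processing capability nearly matches that of the Document Object Model, while using fewer computing resources. This thesis presents a hardware implementation of an XML parser whose target output is VTD, and analyzes and discusses the techniques used to design this high-speed parser. The parser is specially designed under hardware constraints: it cannot detect grammatical errors in the document itself, but it can parse at very high speed. Hardware synthesis data and experimental results show that, in the average case, this hardware parser can parse XML documents at three billion bits per second. | zh_TW |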
| dc.description.abstract | As of 2008, XML (Extensible Markup Language) is used worldwide as the most common standard exchange format between different software because of its extensibility. The computer industry predicts that XML will be adopted even more widely in the coming decades. However, because the XML format is designed to be clear and understandable to humans, parsing it costs a lot of time and memory.
Virtual Token Descriptor (VTD) is a new method of processing XML. Its most distinctive characteristic is that it is non-extractive: it does not extract data from the original XML file to build its own data structure. VTD records only certain important meta-information, such as the offsets of some tags. It can achieve almost all the functionality of DOM, but with fewer resources. This thesis presents a hardware implementation of the VTD XML parser and discusses the mechanisms used to build a high-speed VTD XML parser. The parser is specially designed around hardware limitations: it cannot detect errors inside the XML document, but it is capable of high-speed parsing. The synthesis data and experimental results show that this parser can process XML documents at 3 Gbit/s in the average case. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T00:24:57Z (GMT). No. of bitstreams: 1 ntu-98-R95921122-1.pdf: 1179622 bytes, checksum: e4ee4a115d696e4d550f0d6fb2ce96b6 (MD5) Previous issue date: 2009 | en |
| dc.description.tableofcontents | Contents
1 Introduction; 1.1 Motivation; 1.2 The Implementation; 1.3 Thesis Organization
2 Background and Problem Analysis; 2.1 Preliminary; 2.1.1 UTF-8; 2.1.2 The Synthesis of Decoder; 2.2 Problem statement; 2.2.1 The Difficulty of Processing UTF-8; 2.2.2 The Difficulty of Large Finite State Machine; 2.2.3 The Difficulty of Cutting Pipeline Stage
3 Related Works; 3.1 Virtual Token Descriptor; 3.1.1 Basic Concept; 3.1.2 Imagining VTD Records as A Relational Database; 3.2 Parallel XML Parsing
4 System Architecture; 4.1 The Merging Stage; 4.1.1 Merging with UTF-8; 4.1.2 Special Cyclic Buffer; 4.1.3 Halting Circuit; 4.2 The Rule-matching Stage; 4.2.1 Parallel Architecture; 4.2.2 TEXT Processing; 4.2.3 Deep into The Rule-matching; 4.3 The Recording Stage; 4.4 Evaluation Board
5 Discussion; 5.1 The Decoupling with The Error Detection; 5.1.1 Character Validation; 5.1.2 Matching of The Start Tag and The End Tag; 5.1.3 Uniqueness of Attribute; 5.2 Modified Implementation of VTD Records; 5.3 Vector Processing; 5.4 Partial Parsing; 5.5 Parallelism
6 Experimental Results; 6.1 Synthesis Result; 6.2 Comparison
7 Conclusion; 7.1 Future Work
8 Bibliography | |
| dc.language.iso | en | |
| dc.subject | Hardware acceleration | zh_TW |
| dc.subject | Extensible Markup Language | zh_TW |
| dc.subject | Parser | zh_TW |
| dc.subject | Unicode | zh_TW |
| dc.subject | UTF-8 | en |
| dc.subject | Parser | en |
| dc.subject | XML | en |
| dc.subject | VTD | en |
| dc.subject | hardware | en |
| dc.subject | indexer | en |
| dc.title | Hardware Acceleration of an Extensible Markup Language Parser with Virtual Token Descriptors as the Target Output | zh_TW |
| dc.title | Hardware Accelerated XML Parser for Virtual Token Descriptor | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 97-1 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 洪士灝,鍾國亮,林彥君 | |
| dc.subject.keyword | Extensible Markup Language, hardware acceleration, parser, Unicode | zh_TW |
| dc.subject.keyword | Parser, XML, VTD, hardware, indexer, UTF-8 | en |
| dc.relation.page | 42 | |
| dc.rights.note | Paid authorization | |
| dc.date.accepted | 2009-01-23 | |
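| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Electrical Engineering | zh_TW |
| Appears in Collections: | Department of Electrical Engineering | |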
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-98-1.pdf (not authorized for public access) | 1.15 MB | Adobe PDF |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.
