Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/30500
Title: 中文文法之正規近似演算法
Regular Approximation of Context-Free Grammars for Chinese Language
Authors: Chi-Cheng Li
李奇錚
Advisor: 顏嗣鈞
Keyword: Chinese grammar (中文文法), automata (自動機), regular language (正規語言), regular language approximation, context-free grammars
Publication Year : 2007
Degree: Master's (碩士)
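Abstract: Computing with context-free grammars has always been costly in both time and space, so converting the context-free grammar of interest into a regular language with lower computational requirements is an attractive choice. Among the algorithms proposed so far, the RTN-based algorithm currently appears to be the most suitable, striking the best balance between cost and performance. A variant of this algorithm, called the parameter RTN, records the path already traversed and therefore uses more space. A parameter of 3 gives better results than 2, but the space cost is too high. This thesis proposes an algorithm that effectively reduces the space requirement, so that a parameter of 3 or even higher becomes practical. We achieve this mainly by exploiting properties of the language: the word that precedes each phrase in Chinese can be classified. For example, in S -> TN NP, TN is classified as a subject, so the NP can be regarded as a phrase that follows a subject. When another word can also be classified as a subject and is also followed by an NP, that NP can be merged with the NP that follows TN. The larger the experimental grammar, the more such merges occur and the more space is saved. Finally, we run our experiments with the Chinese parser (CKIP) developed by Academia Sinica, analyze the differences between the various approximation algorithms and the original algorithm, and assess whether the space cost of these approximations is worthwhile.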
As parsing with context-free grammars is time-consuming, converting the original grammars into approximating regular languages can reduce the complexity. Among the existing approaches, the so-called recursive transition network (RTN, which is an ε-NFA) is the most adequate approximation algorithm. A refinement, the parameter RTN, is more adequate than the original RTN. The suggested parameter is 2. However, the parameter RTN requires a large amount of memory, which limits its role in practical applications.
In this thesis, we propose a refinement of the parameter RTN that effectively alleviates the memory requirement. The refined RTN (called group RTN) exploits the properties of Chinese. It classifies terms into different groups. Based on these groups, we merge similar states and transitions of the RTN into one. Using this scheme, the memory requirement of the RTN method can be reduced considerably.
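The grouping idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm, only a toy model under stated assumptions: a "state" is taken to be a pair (preceding symbol, following symbol), and the rules, the symbols PN, ADV and VP, the group names, and the function `context_states` are all hypothetical. Only S, TN and NP come from the abstract's example.

```python
# Toy illustration (an assumption, not the thesis's construction):
# count context-dependent states before and after grouping symbols.

# Hypothetical grammar rules as (left-hand side, right-hand side symbols).
RULES = [
    ("S", ["TN", "NP"]),   # the example rule from the abstract
    ("S", ["PN", "NP"]),   # hypothetical: another subject-like symbol
    ("S", ["TN", "VP"]),   # hypothetical
    ("S", ["ADV", "VP"]),  # hypothetical
]

# Hypothetical classification of symbols into groups.
GROUP = {"TN": "SUBJECT", "PN": "SUBJECT", "ADV": "MODIFIER"}


def context_states(rules, key=lambda sym: sym):
    """Return the set of (context, next symbol) pairs.

    The context is the symbol preceding `next symbol` in a rule,
    mapped through `key` (identity by default, group lookup otherwise).
    """
    states = set()
    for _lhs, rhs in rules:
        for prev, nxt in zip(rhs, rhs[1:]):
            states.add((key(prev), nxt))
    return states


# One state per distinct preceding symbol.
plain = context_states(RULES)
# One state per distinct preceding group: the NP after TN and the NP
# after PN collapse into a single (SUBJECT, NP) state.
grouped = context_states(RULES, key=lambda s: GROUP.get(s, s))

print(len(plain), len(grouped))
```

In this toy grammar the two NP states that follow TN and PN share one grouped state, which mirrors the abstract's claim that larger grammars offer more opportunities for merging.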
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/30500
Fulltext Rights: Paid license (有償授權)
Appears in Collections: Department of Electrical Engineering (電機工程學系)

Files in This Item:
File: ntu-96-1.pdf (Restricted Access)
Size: 433.36 kB
Format: Adobe PDF