Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/64251
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 陳信希 (Hsin-Hsi Chen)
dc.contributor.author: Kai-Chun Chang [en]
dc.contributor.author: 張凱淳 [zh_TW]
dc.date.accessioned: 2021-06-16T17:36:57Z
dc.date.available: 2015-08-17
dc.date.copyright: 2012-08-17
dc.date.issued: 2012
dc.date.submitted: 2012-08-15
dc.identifier.citation:
[1] Ido Dagan, Oren Glickman and Bernardo Magnini: ‘The PASCAL Recognising Textual Entailment Challenge’, Lecture Notes in Computer Science, 2006, 3944, pp. 177-190
[2] http://aclweb.org/aclwiki/index.php?title=Textual_Entailment_Portal
[3] Danilo Giampiccolo, Hoa Trang Dang, Bernardo Magnini, Ido Dagan and Elena Cabrio: ‘The Fourth PASCAL Recognizing Textual Entailment Challenge’. Proc. TAC 2008, Gaithersburg, Maryland, USA, 2008
[4] Hideki Shima, Hiroshi Kanayama, Cheng-Wei Lee, Chuan-Jie Lin, Teruko Mitamura, Yusuke Miyao, Shuming Shi and Koichi Takeda: ‘Overview of NTCIR-9 RITE: Recognizing Inference in TExt’. Proc. NTCIR-9 Workshop Meeting, Tokyo, Japan, 2011, pp. 291-301
[5] Luisa Bentivogli, Ido Dagan, Hoa Trang Dang, Danilo Giampiccolo and Bernardo Magnini: ‘The Fifth PASCAL Recognizing Textual Entailment Challenge’. Proc. TAC 2009, Gaithersburg, Maryland, USA, 2009, pp. 14-24
[6] Ion Androutsopoulos and Prodromos Malakasiotis: ‘A Survey of Paraphrasing and Textual Entailment Methods’, Journal of Artificial Intelligence Research, 2010, 38, pp. 135-187
[7] Mark Sammons, V.G. Vinod Vydiswaran and Dan Roth: ‘Ask not what Textual Entailment can do for You...’. Proc. the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, Sweden, 2010, pp. 1199-1208
[8] http://cogcomp.cs.illinois.edu/Data/ACL2010_RTE.php
[9] http://www.csie.ntu.edu.tw/~cjlin/libsvm/
[10] Adrian Iftene and Mihai-Alex Moruz: ‘UAIC Participation at RTE5’. Proc. TAC 2009, Gaithersburg, Maryland, USA, 2009
[11] The Stanford Natural Language Processing Group: ‘The Stanford Parser: A statistical parser’ (2011, Version 1.6.7)
[12] NLTK Project: ‘Natural Language Toolkit (NLTK)’ (2012, Version 2.0)
[13] Marie-Catherine de Marneffe and Christopher D. Manning: ‘Stanford typed dependencies manual’ (2008)
[14] Hen-Hsen Huang, Kai-Chun Chang, James M.C. Haver II and Hsin-Hsi Chen: ‘NTU Textual Entailment System for NTCIR 9 RITE Task’. Proc. NTCIR-9, Tokyo, Japan, 2011
[15] The Stanford Natural Language Processing Group: ‘Stanford Word Segmenter’ (2011, Version 1.6)
[16] Laboratory, N.I.A.: ‘CNP - A ChiNese dependency Parser’ (2011, Version 1)
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/64251
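dc.description.abstract [zh_TW]:
The RTE (Recognising Textual Entailment) series of studies, which addresses textual entailment recognition in English, has been running for years; its 7th edition was held in 2011. Also in 2011, the NTCIR-9 workshop in Japan held a textual entailment recognition task for the first time: RITE (Recognizing Inference in TExt). This was also the first time a research group released the datasets needed for textual entailment research in Traditional and Simplified Chinese. Both developments show how much attention Textual Entailment Recognition is receiving.
This thesis studies which linguistic phenomena are most effective for textual entailment recognition. After analysis and experiments using the linguistic phenomena that Mark Sammons et al. annotated on the RTE-5 test set, we found that using the phenomena in the Negative Entailment Phenomena category as features is highly effective, with accuracy above 90%. Within this category, five phenomena stand out: Disconnected Relation, Exclusive Argument, Exclusive Relation, Missing Argument, and Missing Relation.
We therefore tried two automatic methods, a rule-based method and a machine-learning based method, to extract these five phenomena from text-hypothesis pairs, and evaluated how well the phenomena extracted by each method perform on textual entailment recognition. Compared with human annotation, performance still lags considerably, but the results give us a direction for further work.
Because all earlier analysis had been done on English corpora, we manually annotated the dataset released for the NTCIR-9 RITE task. The analysis shows that these phenomena also occur in Chinese corpora and likewise have a decisive influence on textual entailment recognition.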
dc.description.abstract [en]:
Research on Textual Entailment (TE) has attracted much attention in recent years. RTE (Recognising Textual Entailment), a series of evaluations focused on developing English textual entailment recognition technologies, had been held seven times by 2011. In 2011, the 9th NTCIR Workshop Meeting introduced a textual entailment task, RITE (Recognizing Inference in TExt), into the IR evaluation series for the first time. RITE focuses on textual entailment research in Traditional Chinese, Simplified Chinese, and Japanese. With it, the first ground-truth text-hypothesis pair datasets in both Traditional and Simplified Chinese were distributed.
In this thesis, we concentrate on which phenomena in text-hypothesis pairs are powerful features for the textual entailment problem. After analysis and experiments on the dataset distributed by Mark Sammons et al., which is annotated with linguistic phenomena defined by their research group, we found that Negative Entailment Phenomena form the most powerful category for textual entailment, achieving accuracy above 90%. Within this category, five phenomena stand out: Disconnected Relation, Exclusive Argument, Exclusive Relation, Missing Argument, and Missing Relation.
We then tried to extract these linguistic phenomena from text-hypothesis pairs automatically, using two methods: a rule-based method and a machine-learning method. When the phenomena extracted by these methods were used as features in the TE experiments, the results showed a large gap between human-annotated and machine-extracted phenomena, so there is still room for improvement with this kind of feature.
All of the above analyses and experiments were carried out on English data. We also wanted to know whether these phenomena are effective for TE on Chinese data. Following a scheme similar to the one used for English, we annotated the BC-CT text-hypothesis pairs distributed by the NTCIR-9 RITE task with the five phenomena. Experiments on both human-annotated and machine-extracted features show that these negative entailment phenomena also perform well in Chinese.
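As a rough illustration of the idea in the abstract above, the following Python sketch encodes a text-hypothesis pair as five binary indicators, one per negative entailment phenomenon. The `Pair` structure, the snake_case phenomenon keys, the example sentences, and their annotations are all invented for illustration. The decision rule is a deliberately simple stand-in and an assumption: the record's table of contents and reference list point to support vector machines (LIBSVM), not to this rule.

```python
from dataclasses import dataclass

# The five phenomena named in the abstract; the snake_case keys are illustrative.
PHENOMENA = (
    "disconnected_relation",
    "exclusive_argument",
    "exclusive_relation",
    "missing_argument",
    "missing_relation",
)


@dataclass(frozen=True)
class Pair:
    """A text-hypothesis pair plus the phenomena annotated on it (hypothetical structure)."""
    text: str
    hypothesis: str
    phenomena: frozenset


def to_features(pair):
    """Encode a pair as five binary indicators, in the order of PHENOMENA."""
    return [1 if name in pair.phenomena else 0 for name in PHENOMENA]


def predict_entailment(features):
    """Stand-in rule (an assumption, not the thesis's classifier):
    predict entailment only when no negative phenomenon is present."""
    return not any(features)


# Invented examples; the annotations are made up for this sketch.
pairs = [
    Pair("Alice bought a car in Paris.", "Alice bought a car.", frozenset()),
    Pair("Alice bought a car.", "Bob bought a car.", frozenset({"exclusive_argument"})),
]

for pair in pairs:
    features = to_features(pair)
    print(features, predict_entailment(features))
```

In a setup closer to the one the record describes, the same feature vectors would be passed to a trained classifier instead of the fixed rule.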
dc.description.provenance [en]:
Made available in DSpace on 2021-06-16T17:36:57Z (GMT). No. of bitstreams: 1
ntu-101-R99922108-1.pdf: 1017246 bytes, checksum: 2ec0e16431db46f7c0c85a193cb2981b (MD5)
Previous issue date: 2012
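dc.description.tableofcontents:
National Taiwan University Master's Thesis: Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
ABSTRACT
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Problem Description
1.3 Research Objectives
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Textual Entailment Recognition Tasks
2.1.1 RTE (Recognising Textual Entailment)
2.1.2 RITE (Recognizing Inference in TExt)
2.2 Current Main Approaches
2.3 Linguistic Phenomena in the RTE Corpus
2.3.1 “Ask not what Textual Entailment can do for You…”
2.3.2 Annotated Linguistic Phenomena
Chapter 3 Linguistic Phenomena and Textual Entailment Recognition
3.1 Linguistic Phenomena and Textual Entailment Recognition
3.1.1 Support Vector Machines
3.1.2 Features
3.1.3 Experimental Results
3.2 Negative Entailment Phenomena
3.2.1 Performance of Individual Phenomena within Negative Entailment Phenomena
3.2.2 Performance of Combinations of Negative Entailment Phenomena
3.3 Target Linguistic Phenomena
3.3.1 Disconnected Relation
3.3.2 Exclusive Argument
3.3.3 Exclusive Relation
3.3.4 Missing Argument
3.3.5 Missing Relation
3.3.6 Other Discussion
Chapter 4 Automatic Processing and Experimental Discussion
4.1 English Corpus Used
4.2 Pre-processing
4.3 Extracting Linguistic Phenomena Automatically
4.3.1 Rule-based Method
4.3.2 Machine-learning Based Method
4.4 Experiments and Discussion
4.4.1 Experiment 1: Binary entailment classification with features extracted by the rule-based method
4.4.2 Experiment 2: Binary entailment classification with features extracted by the machine-learning method
4.4.3 Experiment 3: Binary entailment classification on Chinese with features extracted by the rule-based method
4.4.4 Experiment 4: Effect of negative entailment phenomena on the Chinese corpus
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References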
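dc.language.iso: zh-TW
dc.subject: 文本蘊涵 (textual entailment) [zh_TW]
dc.subject: 非正面蘊涵語言現象 (negative entailment phenomena) [zh_TW]
dc.subject: 自動化擷取 (automatic extraction) [zh_TW]
dc.subject: 規則式方法 (rule-based method) [zh_TW]
dc.subject: 機器學習方法 (machine-learning method) [zh_TW]
dc.subject: negative entailment phenomena [en]
dc.subject: automatic extraction [en]
dc.subject: rule-based method [en]
dc.subject: machine-learning based method [en]
dc.subject: textual entailment [en]
dc.title: 使用非正向蘊涵語言現象研究文本蘊涵 [zh_TW]
dc.title: Analyses of Negative Entailment Phenomena for Textual Entailment Recognition [en]
dc.type: Thesis
dc.date.schoolyear: 100-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 盧文祥 (Wen-Hsiang Lu), 鄭卜壬 (Pu-Jen Cheng), 林川傑 (cjlin@mail.ntou.edu.tw)
dc.subject.keyword: 文本蘊涵 (textual entailment), 非正面蘊涵語言現象 (negative entailment phenomena), 自動化擷取 (automatic extraction), 規則式方法 (rule-based method), 機器學習方法 (machine-learning method) [zh_TW]
dc.subject.keyword: textual entailment, negative entailment phenomena, automatic extraction, rule-based method, machine-learning based method [en]
dc.relation.page: 56
dc.rights.note: 有償授權 (licensed for a fee)
dc.date.accepted: 2012-08-15
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]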
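Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
File: ntu-101-1.pdf (not authorized for public access)
Size: 993.4 kB
Format: Adobe PDF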
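Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.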
