Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/5164

| Title: | 內容串流中實體特性偵測之研究 (Detection of Entity Properties in Content Stream) |
| Authors: | Qing-Cheng Li 李卿澄 |
| Advisor: | 陳信希(Hsin-Hsi Chen) |
| Keyword: | Knowledge Base Acceleration, Pattern Matching, Entity Property Detection |
| Publication Year: | 2014 |
| Degree: | Master's |
| Abstract: | World knowledge changes constantly, and knowledge bases (KBs) maintained by volunteer editors cannot keep pace with the rate at which knowledge is created and revised: a change in what is known about an entity often waits a long time before a human editor updates the KB. Narrowing the gap between the emergence of knowledge and the corresponding KB update, a task known as knowledge base acceleration (KBA), has therefore become an important problem. Entities and their properties are an important part of the knowledge recorded in a KB. This thesis proposes a pattern-based method that quickly detects whether a document in a content stream, aggregated from information sources, contains a given entity property. The detection process has three phases: pattern selection, pattern matching, and property disambiguation. Detecting properties by matching patterns and relying on the association between patterns and entity properties raises three major issues: pattern quality, pattern reliability, and ambiguity in the property a pattern maps to. To reduce their effect, pattern selection is performed before pattern matching and property disambiguation after it. The experiments analyze how pattern confidence value, reliability, and property ambiguity degree affect performance, and find that introducing entity type information and using a naive Bayes classifier in the property disambiguation phase significantly improves detection performance. Detecting entity properties helps filter, from the content stream, the documents that are useful for updating the KB, giving volunteer editors a basis for updating and maintaining it. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/5164 |
| Fulltext Rights: | Authorization granted (publicly available worldwide) |
| Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-103-1.pdf | 5.84 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
