Mining Sport Event Storylines Based On Conditional Random Fields Model

Hui-Wen Cheng; 鄭惠文

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68582

Title:	Mining Sport Event Storylines Based On Conditional Random Fields Model (藉由條件隨機域探勘運動領域的故事情節)
Author:	Hui-Wen Cheng (鄭惠文)
Advisor:	李瑞庭
Keywords:	storyline mining, NBA news, conditional random fields model, event extraction, name entity tree
Year of publication:	2017
Degree:	Master's
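Abstract:	As sporting events have gone mainstream, more and more sports have large fan bases. Many fans want to know what happened today to their favorite team or player, and many team managers want to know how to recruit, train, and trade players to get the best performance. News articles are an important source of this information, but with the rapid growth of the Internet, sports news articles online now number in the tens of thousands. We therefore propose a framework that helps these readers quickly organize the important storylines in sports news, so that they can follow how each day's sports events develop. The proposed framework has three parts. First, we preprocess the news articles and split them into paragraphs. Next, we use the specific event categories found in sports news articles and the keywords within each category, together with a conditional random fields model, to obtain the degree to which each paragraph belongs to each event category. Finally, we use the name entities that are frequently mentioned in sports news articles to capture the similarity between paragraphs; we construct a name entity tree to compute the distance between name entities, and the resulting storyline is a graph structure. The experimental results show that our method outperforms the SteinerTree method on two metrics, because it makes good use of characteristics unique to the sports domain and is more expressive in its storyline structure. The topic-specific storylines proposed in this study allow fans and team management to grasp and use the many daily sports events more quickly. In this thesis, we propose a framework to mine sport event storylines from a collection of news articles. The proposed framework contains three phases. First, we preprocess the news articles by removing stop words, normalizing the variants of each word, and extracting name entities. Next, we employ a conditional random fields model to label the event category of each word in the news paragraphs and derive an event vector for each news paragraph. Finally, we use the derived event vectors and the name entities to compute the similarity between news paragraphs, and then generate the storylines for a given query. The experimental results show that the proposed framework outperforms the compared method and generates storylines that are easier to understand. The proposed framework can help sport fans and team managers obtain valuable insights.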
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68582
DOI:	10.6342/NTU201703906
Full-text license:	Paid license
Appears in department:	Department of Information Management
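
The third phase described in the abstract (linking paragraphs by the similarity of their event vectors and name entities) can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not give the similarity formula, so the sketch assumes cosine similarity over event vectors, uses Jaccard overlap of entity sets as a simple stand-in for the name entity tree distance, and combines the two with a hypothetical weight `alpha` and cutoff `threshold`. The event vectors, entity names, and function names are illustrative only.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length event vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Overlap of two entity sets (assumed stand-in for a tree-based distance)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def paragraph_similarity(p, q, alpha=0.5):
    """Weighted mix of event-vector and entity similarity; alpha is an assumption."""
    return (alpha * cosine(p["events"], q["events"])
            + (1 - alpha) * jaccard(p["entities"], q["entities"]))


def build_storyline_graph(paragraphs, threshold=0.5, alpha=0.5):
    """Return {(i, j): similarity} for paragraph pairs at or above the threshold."""
    edges = {}
    for i, j in combinations(range(len(paragraphs)), 2):
        score = paragraph_similarity(paragraphs[i], paragraphs[j], alpha)
        if score >= threshold:
            edges[(i, j)] = score
    return edges


# Hypothetical paragraphs: each event vector holds membership degrees for
# three made-up event categories; entity names are placeholders.
paragraphs = [
    {"events": [1.0, 0.0, 0.0], "entities": {"Team A", "Player X"}},
    {"events": [1.0, 0.0, 0.0], "entities": {"Team A", "Player X"}},
    {"events": [0.0, 0.0, 1.0], "entities": {"Team B"}},
]
graph = build_storyline_graph(paragraphs)
```

In this toy input, the first two paragraphs share both their event profile and their entities, so they are linked, while the third shares neither and stays disconnected. In the thesis, the event vectors would come from the conditional random fields labeling step rather than being written by hand.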

Files in this item:

File	Size	Format
ntu-106-1.pdf (not authorized for public access)	1.35 MB	Adobe PDF