Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28054
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 簡立峰(Lee-Feng Chien) | |
dc.contributor.author | Ting-Wei Chuang | en |
dc.contributor.author | 莊庭瑋 | zh_TW |
dc.date.accessioned | 2021-06-12T18:35:57Z | - |
dc.date.available | 2008-08-03 | |
dc.date.copyright | 2007-08-03 | |
dc.date.issued | 2007 | |
dc.date.submitted | 2007-07-30 | |
dc.identifier.citation | References
[1] ChatTrack: http://moby.ittc.ku.edu/chattrack/
[2] Mobile01 Site: http://www.mobile01.com/
[3] MySpace Site: http://www.myspace.com/
[4] 批踢踢實業坊 (Professional Technology Temple), telnet://ptt.cc
[5] Topic Detection and Tracking Project Site: http://www.nist.gov/speech/tests/tdt/
[6] TDT 2004: Annotation Manual Version 1.2, August 2004, http://www.nist.gov/speech/tests/tdt/
[7] J. Allan, L. Ballesteros, et al., Recent experiments with INQUERY, In The Fourth Text REtrieval Conference (TREC-4), p. 49-63, 1995
[8] J. Allan, J. Carbonell, et al., Topic detection and tracking pilot study, In Proc. of DARPA Broadcast News Transcription and Understanding Workshop, p. 194-218, 1998
[9] J. Allan, H. Jin, et al., Summer workshop final report, In Center for Language and Speech Processing, 1999
[10] J. Allan, V. Lavrenko, et al., First story detection in TDT is hard, In Proc. of the 9th international conference on Information and knowledge management, p. 374-381, 2000
[11] J. Allan, C. Wade, et al., Retrieval and novelty detection at the sentence level, In Proc. of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, p. 314-321, 2003
[12] J. Allan, R. Papka, et al., On-line New Event Detection and Tracking, In Proc. of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, p. 37-45, 1998
[13] J. Bengel, S. Gauch, et al., ChatTrack: Chat room topic detection using classification, In 2nd Symposium on Intelligence and Security Informatics, p. 266-277, 2004
[14] Ella Bingham, Ata Kabán, et al., Topic Identification in Dynamical Text by Complexity Pursuit, Neural Processing Letters, v. 17 n. 1, p. 69-83, 2003
[15] T. Brants, F. Chen, A System for new event detection, In Proc. of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, p. 330-337, 2003
[16] I. S. Dhillon, Co-clustering documents and words using bipartite spectral graph partitioning, In Proc. of the 7th ACM SIGKDD international conference on Knowledge discovery and data mining, p. 269-274, 2001
[17] M. Franz, A. Ittycheriah, et al., First story detection: Combining similarity and novelty-based approaches, Slides at the TDT-2001 meeting, IBM, 2001
[18] F. M. Khan, T. A. Fisher, et al., Mining chatroom conversations for social and semantic interactions, Technical Report, Lehigh University, 2002
[19] J. W. Kim, K. S. Candan, et al., Topic segmentation of message hierarchies for indexing and navigation support, In Proc. of the 14th international conference on World Wide Web, p. 322-331, 2005
[20] G. Kumaran, J. Allan, Text classification and named entities for new event detection, In Proc. of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, p. 297-304, 2004
[21] A. Murakami, K. Nagao, et al., Discussion Mining: Knowledge Discovery from Online Discussion Records, The 1st NLP and XML Workshop, 2001
[22] Jian-Yun Nie, Jiangfeng Gao, et al., On the use of words and n-grams for Chinese information retrieval, In Proc. of the 5th international workshop on Information retrieval with Asian languages, p. 141-148, 2000
[23] Warren Sack, Conversation Map: A Content-Based Usenet Newsgroup Browser, In Proc. of the 5th international conference on Intelligent user interfaces, p. 233-240, 2000
[24] M. Sahami, T. D. Heilman, A Web-based Kernel Function for Measuring the Similarity of Short Text Snippets, In Proc. of the 15th International Conference on World Wide Web, p. 377-386, 2006
[25] D. Shen, Q. Yang, et al., Thread detection in dynamic text message streams, In Proc. of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, p. 35-42, 2006
[26] N. Stokes, J. Carthy, First story detection using a composite document representation, In Proc. of the 1st international conference on Human language technology research, p. 1-8, 2001
[27] W. Xi, J. Lind, et al., Learning effective ranking functions for newsgroup search, In Proc. of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, p. 394-401, 2004
[28] Y. Yang, T. Pierce, et al., A study of retrospective and on-line event detection, In Proc. of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, p. 28-36, 1998
[29] Y. Yang, J. Zhang, et al., Topic-conditioned novelty detection, In Proc. of the 8th ACM SIGKDD international conference on Knowledge discovery and data mining, p. 688-693, 2002
[30] G. Xu and W. Y. Ma, Building implicit links from content for forum search, In Proc. of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, p. 300-307, 2006 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28054 | - |
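dc.description.abstract | Web forums are platforms for information exchange among members of virtual communities. They allow many users to freely post and reply to messages about specific topics, and the large volume of dynamically generated text messages forms a highly valuable knowledge repository. To help users access the information in these messages, most Web forums currently offer only predefined category labels as an aid and present messages linearly by title. Because message titles carry very little information, users cannot use them to tell the contents of different messages apart, and without prior knowledge users cannot formulate suitable query terms either. We therefore need a more effective way of organizing messages that helps users quickly grasp the browsing space formed by all the messages.
This study applies conventional new event detection techniques to the Web forum environment. Using document clustering, it groups messages online according to the relatedness of their content and automatically detects whether a new message cluster has appeared. Building on an incremental clustering algorithm, we add external resources, including temporal information and Web resources, to enrich the term features of forum messages, and we use other messages in the same cluster as context information. In total, we propose six different models as extensions of the basic clustering algorithm. Preliminary experiments confirm our hypothesis: adding this external information does improve message clustering performance. | zh_TW |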
dc.description.abstract | With the emergence of Web forums, the enormous number of archived messages on diverse subjects has become a valuable knowledge repository. Unfortunately, the continuously growing and heterogeneous body of messages increases the navigation load. The human-generated category hierarchies provided by most Web forums are not self-revealing. Moreover, the thread structures built from the subject headers of messages give only limited information about the actual contents of the messages. Forum messages therefore lack an organization that can guide readers through the information space.
In this work, we use the conventional incremental clustering algorithm, which is well studied in TDT, as the baseline for grouping topically related messages in streams of text messages. To detect the topic of a single short message, we propose six different models that exploit temporal information and context information obtained from parent messages. We also use external features extracted from the Web. Our preliminary experiments confirm our assumption that incorporating features relevant to the Web forum environment can enhance the performance of the basic clustering algorithm. | en |
dc.description.provenance | Made available in DSpace on 2021-06-12T18:35:57Z (GMT). No. of bitstreams: 1 ntu-96-R94725023-1.pdf: 580130 bytes, checksum: 0e3cde05348ff92d98e3adc4cea8f06b (MD5) Previous issue date: 2007 | en |
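dc.description.tableofcontents | Table of Contents
Acknowledgements II
Abstract (Chinese) III
Thesis Abstract IV
Table of Contents V
List of Tables VII
List of Figures VIII
Chapter 1 Introduction 1
Section 1 Research Background 1
1.1.1 Characteristics and Importance of Web Forums 1
1.1.2 Topic Detection and Tracking Techniques 2
Section 2 Research Motivation and Objectives 4
Section 3 Thesis Organization 6
Chapter 2 Literature Review 7
Section 1 Topic Detection and Tracking (TDT) 7
2.1.1 Definitions of Event and Topic 7
2.1.2 Research Areas Related to Topic Detection and Tracking 8
2.1.3 Conventional New Event Detection Techniques 9
2.1.4 Improvements to Conventional New Event Detection Techniques 14
2.1.5 Difficulties in Applying Conventional New Event Detection Techniques to Web Forums 18
Section 2 Web Forum Mining 19
Section 3 Chat Log Analysis 22
Chapter 3 Problems and Research Method 24
Section 1 Definitions of Basic Terms 24
Section 2 Problem Definition 25
3.2.1 Problem 1: How to Match Short Messages 27
3.2.2 Problem 2: How to Represent the Concept Implied by a Thread from Its Messages 28
Section 3 Research Method and Framework 29
Chapter 4 System Architecture 31
Section 1 System Overview 31
4.1.1 System Functions 31
4.1.2 System Architecture 31
4.1.3 System Preliminaries 32
Section 2 Preprocessing of Web Forum Messages 33
4.2.1 Basic System Model 34
Section 3 Computing Message Cluster Representations 36
4.3.1 Representing a Thread Cluster by Its Centroid 37
4.3.2 Representing a Thread Cluster by a Single Message 38
4.3.3 Extensions of the Centroid Representation 39
Section 4 Computing Similarity between Messages and Thread Clusters 40
Chapter 5 Experimental Results 42
Section 1 Test Data 42
Section 2 Evaluation Metrics 45
Section 3 Experimental Results and Analysis 46
5.3.1 Labeling Threads 46
5.3.2 Experiment 1: Comparison of Thread Cluster Representations 47
5.3.3 Experiment 2: Comparison of Single-Message Cluster Representations 49
5.3.4 Experiment 3: Effect of Adding Temporal Information to the Centroid Representation 51
5.3.5 Experiment 4: Effect of Adding Temporal Information to the Single-Message Representation 52
Chapter 6 Conclusion and Future Work 56
Section 1 Conclusion 56
Section 2 Future Work 56
References 58 | |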
dc.language.iso | zh-TW | |
dc.title | 適用於網路論壇之新事件偵測技術之研究 | zh_TW |
dc.title | New Event Detection in the Context of Web Forums | en |
dc.type | Thesis | |
dc.date.schoolyear | 95-2 | |
dc.description.degree | Master's | |
dc.contributor.coadvisor | 王柏堯(Bow-Yaw Wang) | |
dc.contributor.oralexamcommittee | 莊裕澤(Yuh-Jzer Joung),陳光華(Kuang-Hua Chen),王新民(Hsin-Min Wang) | |
dc.subject.keyword | Web Forum, Event Detection and Tracking, New Event Detection, Clustering Techniques, Online Community | zh_TW |
dc.subject.keyword | Web Forum,Topic Detection and Tracking,New Event Detection,Clustering,Web Community, | en |
dc.relation.page | 59 | |
dc.rights.note | Paid license | |
dc.date.accepted | 2007-07-31 | |
dc.contributor.author-college | College of Management | zh_TW |
dc.contributor.author-dept | Graduate Institute of Information Management | zh_TW |
Appears in Collections: | Department of Information Management |
Files in This Item:
File | Size | Format | |
---|---|---|---|
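ntu-96-1.pdf (not currently authorized for public access) | 566.53 kB | Adobe PDF |
Unless their copyright terms are specifically stated, all items in the system are protected by copyright, with all rights reserved.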