A Study on Organizing Web Search Results (網路搜尋結果自動組織之研究)

Ching-Hsiang Tsai; 蔡景祥

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38550

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	簡立峰(Lee-Feng Chien)
dc.contributor.author	Ching-Hsiang Tsai	en
dc.contributor.author	蔡景祥	zh_TW
dc.date.accessioned	2021-06-13T16:37:03Z	-
dc.date.available	2005-07-12
dc.date.copyright	2005-07-12
dc.date.issued	2005
dc.date.submitted	2005-07-06
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38550	-
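dc.description.abstract	With the help of search engines, people can easily obtain large amounts of information from the Web. However, search results often lack proper organization, and users frequently have to examine the results one by one to find relevant information. Previous research has tried to solve this problem with document clustering techniques, but the resulting clusters often lack a semantic explanation. This thesis proposes a new technique for organizing search results. It first discovers important topic terms in search result snippets, which help users quickly browse the important concepts across the whole result set, and then reorganizes the search results according to user-defined topic classes. Previous techniques for organizing search results did not take users' preferences into account. The proposed approach labels all important topics appearing in the search results with topic classes that are more familiar to the user, which aids understanding and also reveals the relationships among important topics. The technique gives users a better overall picture of the important topics in the search results and can reorganize the results according to each user's preferences and present them in the way that user prefers. Preliminary experimental results show that the proposed method can help users quickly browse search results and reformulate their queries.	zh_TW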
dc.description.abstract	With the rapid growth in the number of Web pages and users, the demand for powerful search engines is high. Existing commercial search engines still have several shortcomings. The ranked list of result pages returned by a search engine is often long and mixes together concepts that are relevant to the user's query but hard to identify. Web search results lack good organization, which requires users to examine the retrieved pages carefully to identify the correct ones. Conventional research on this problem has relied on document/term clustering algorithms to organize search results. However, the clustered results still lack comprehensive explanations. In this thesis, we develop a new approach to organizing search results that has several advantages over conventional approaches. First, it extracts important topic terms from search result pages to provide a comprehensive overview of the search results. Second, it organizes the extracted topic terms in the manner the user prefers; users' preferences were seldom taken into account in previous research. The proposed approach extracts important topic terms from Web search result snippets and organizes them under topic classes defined by users. A series of experiments has been conducted, and the results show that the proposed approach can help users browse the concepts embedded in the search result pages effectively and locate relevant pages more easily.	en
dc.description.provenance	Made available in DSpace on 2021-06-13T16:37:03Z (GMT). No. of bitstreams: 1 ntu-94-R92725028-1.pdf: 1357488 bytes, checksum: 2535457db6f809ac6b4ceded14f88655 (MD5) Previous issue date: 2005	en
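dc.description.tableofcontents	Acknowledgements; Abstract (Chinese); Thesis Abstract; Table of Contents; List of Tables; List of Figures; Chapter 1 Introduction; Section 1 Research Background; 1.1.1 Search Services on the World Wide Web; 1.1.2 Document Clustering Techniques; Section 2 Research Motivation and Objectives; Section 3 Thesis Organization; Chapter 2 Literature Review; Section 1 Traditional Clustering Algorithms; 2.1.1 Hierarchical Clustering Algorithms; 2.1.2 Partitional Clustering Algorithms; Section 2 The Vector Space Model and Document Clustering; Section 3 Related Work on Clustering Web Search Results; 2.3.1 Term-Based Clustering Techniques; 2.3.2 Hyperlink-Based Clustering Techniques; 2.3.3 Clustering Techniques Combining Terms and Hyperlinks; 2.3.4 Drawbacks of Applying Document Clustering to Search Result Reorganization; Chapter 3 Problems and Research Methods; Section 1 Problem Definition; 3.1.1 Problem 1: Discovering Important Topics and Clustering Snippets; 3.1.2 Problem 2: Reorganizing Search Results; Section 2 Research Framework and Methods; Section 3 System Implementation; 3.3.1 System Functions; 3.3.2 System Architecture; 3.3.3 System Preprocessing; Section 4 System Characteristics; Section 5 Performance Evaluation and User Study; Chapter 4 Topic Discovery and Search Result Clustering; Section 1 Topic Extraction; Section 2 Topic Selection; Section 3 Topic Set Formation; Section 4 Implementation Details; Section 5 Performance Evaluation; 4.5.1 Experiment 1-1: Performance of the Topic Discovery Technique; 4.5.2 Experiment 1-2: Overlap among Different Topic Discovery Techniques; 4.5.3 Experiment 1-3: Coverage of Search Result Snippets by the Topic Discovery Technique; Chapter 5 Organizing Search Results by User-Defined Topics; Section 1 Automatically Training Classifiers for User-Defined Topic Classes; Section 2 Automatic Topic Classification; Section 3 Implementation Details; Section 4 Performance Evaluation; 5.4.1 Experiment 2-1: Performance of the Topic Classification Technique; 5.4.2 Experiment 2-2: Effect of Indirect Relevance on Topic Classification Performance; 5.4.3 Experiment 2-3: Entropy of Topic Classes; 5.4.4 Experiment 2-4: Effect of Indirect Relevance on Topic Class Entropy; 5.4.5 Experiment 2-5: Effect of Indirect Relevance on Topic Classification Performance and Topic Class Entropy; Chapter 6 Conclusion and Future Work; Section 1 Conclusion; Section 2 Future Work; References; Appendix; Biography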
dc.language.iso	zh-TW
dc.subject	Classification techniques (分類技術)	zh_TW
dc.subject	World Wide Web (全球資訊網)	zh_TW
dc.subject	Search result organizing (搜尋結果組織)	zh_TW
dc.subject	Clustering techniques (分群技術)	zh_TW
dc.subject	Clustering	en
dc.subject	Classification	en
dc.subject	World Wide Web	en
dc.subject	Search Result Organizing	en
dc.title	網路搜尋結果自動組織之研究	zh_TW
dc.title	A Study on Organizing Web Search Results	en
dc.type	Thesis
dc.date.schoolyear	93-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	莊裕澤(Yuh-Jzer Joung),陳信希(Hsin-Hsi Chen),陳光華(Kuang-Hua Chen)
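dc.subject.keyword	World Wide Web (全球資訊網), Search result organizing (搜尋結果組織), Clustering techniques (分群技術), Classification techniques (分類技術)	zh_TW
dc.subject.keyword	World Wide Web, Search Result Organizing, Clustering, Classification	en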
dc.relation.page	77
dc.rights.note	Paid license
dc.date.accepted	2005-07-06
dc.contributor.author-college	College of Management	zh_TW
dc.contributor.author-dept	Graduate Institute of Information Management	zh_TW
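
As a rough illustration of the two steps named in the abstract (extracting topic terms from search result snippets, then grouping them under user-defined topic classes), the following Python sketch may help. It is not the thesis's method, which this record does not describe: the document-frequency ranking, the seed-word co-occurrence score, the stopword list, and all function names and sample data are assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "is", "for", "to", "on", "with"}


def tokenize(text):
    """Return the set of lowercase alphabetic tokens in text, minus stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def extract_topic_terms(snippets, query_terms, top_k=5):
    """Rank candidate terms by the number of snippets that contain them.

    Assumption: document frequency stands in for topic importance.
    query_terms is a set of words to exclude (they occur in every snippet).
    """
    doc_freq = Counter()
    for snippet in snippets:
        doc_freq.update(tokenize(snippet) - query_terms)
    # Highest document frequency first; alphabetical order breaks ties.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


def organize_by_classes(terms, snippets, user_classes):
    """Place each term under the user-defined class it co-occurs with most.

    Assumption: each class is described by a set of seed words, and a term's
    score for a class is the number of snippets containing both the term and
    at least one seed word. Terms with no co-occurrence stay unclassified.
    """
    token_sets = [tokenize(s) for s in snippets]
    organized = {name: [] for name in user_classes}
    organized["unclassified"] = []
    for term in terms:
        best_name, best_score = "unclassified", 0
        for name, seeds in user_classes.items():
            score = sum(1 for ts in token_sets if term in ts and ts & seeds)
            if score > best_score:
                best_name, best_score = name, score
        organized[best_name].append(term)
    return organized


if __name__ == "__main__":
    # Made-up snippets for the ambiguous query "jaguar".
    sample_snippets = [
        "Jaguar car dealer offers new luxury car models",
        "The jaguar is a big cat living in the rainforest",
        "Jaguar car reviews and prices",
        "Rainforest cat species: jaguar habitat",
    ]
    sample_classes = {
        "automobile": {"car", "dealer", "prices"},
        "animal": {"cat", "rainforest", "habitat"},
    }
    topic_terms = extract_topic_terms(sample_snippets, {"jaguar"}, top_k=3)
    print(organize_by_classes(topic_terms, sample_snippets, sample_classes))
```

Because the topic classes are supplied by the user rather than discovered by clustering, every group in the output carries a label the user already understands, which is the property the abstract emphasizes.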
Appears in Collections:	Department of Information Management

Files in This Item:

File	Size	Format
ntu-94-1.pdf (not authorized for public access)	1.33 MB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.