Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38550
Full metadata record
DC field: value (language)
dc.contributor.advisor: 簡立峰 (Lee-Feng Chien)
dc.contributor.author: Ching-Hsiang Tsai (en)
dc.contributor.author: 蔡景祥 (zh_TW)
dc.date.accessioned: 2021-06-13T16:37:03Z
dc.date.available: 2005-07-12
dc.date.copyright: 2005-07-12
dc.date.issued: 2005
dc.date.submitted: 2005-07-06
dc.identifier.citation: [1] Einat Amitay. “Using Common Hypertext Links to identify the Best Phrasal Description of Target web Document”, In Proceedings of the SIGIR'98 Post-Conference Workshop on Hypertext Information Retrieval for the Web, 1998.
[2] Arvind Arasu, Junghoo Cho, Hector Garcia-Molina, Andreas Paepcke and Sriram Raghavan, “Searching the Web”, In ACM Transactions on Internet Technology, Vol. 1, No. 1, August 2001, Pages 2-43.
[3] Ricardo Baeza-Yates and Berthier Ribeiro-Neto, “Modern Information Retrieval”, ACM Press and Addison Wesley, New York, 1999.
[4] Doug Beeferman and Adam Berger, “Agglomerative Clustering of a Search Engine Query Log”, In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000, Pages 407-416.
[5] Andrei Z. Broder, Steven C. Glassman, Mark S. Manasse and Geoffrey Zweig, “Syntactic Clustering of the Web”, Selected papers from the sixth international conference on World Wide Web, 1997, Pages 1157-1166.
[6] Clusty, (2005) http://clusty.com
[7] Douglass R. Cutting, David R. Karger, Jan O. Pedersen and John W. Tukey, “Scatter/gather: A Cluster-based Approach to Browsing Large Document Collections”, In Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval, 1992, Pages 318-329.
[8] B. D. Davison. “Topical Locality in the Web”, In Proceedings of the 23rd Annual International Conference on Research and Development in Information Retrieval, 2000, Pages 272-279.
[9] Devanshu Dhyani, Wee Keong Ng and Sourav S. Bhowmick, “A Survey of Web Metrics”, In ACM Computing Surveys (CSUR) Vol. 34, No. 4, 2003, Pages 469-503.
[10] William B. Frakes and Ricardo Baeza-Yates, “Information Retrieval - Data Structures & Algorithms”, Prentice Hall PTR, New Jersey, 1992.
[11] Google search engine, (2005) http://www.google.com
[12] Chien-Chung Huang, Shui-Lung Chuang and Lee-Feng Chien, “LiveClassifier: Creating Hierarchical Text Classifiers through Web Corpora”, In Proceedings of the 13th international conference on World Wide Web, 2004, Pages 184-192.
[13] A. K. Jain, M. N. Murty and P. J. Flynn, “Data Clustering: A Review”, ACM Computing Surveys, Vol. 31, No. 3, September 1999.
[14] Z. Jiang, A. Joshi, R. Krishnapuram and L. Yi, “Retriever: Improving Web Search Engine Results Using Clustering”, Technical Report, CSEE Department, UMBC, 2000.
[15] Raymond Kosala and Hendrik Blockeel, “Web Mining Research: A Survey”, In ACM SIGKDD Explorations Newsletter Vol. 2, No. 1, 2000, Pages 1-15.
[16] Krishna Kummamuru, Rohit Lotlikar, Shourya Roy, Karan Singal and Raghu Krishnapuram, “A Hierarchical Monothetic Document Clustering Algorithm for Summarization and Browsing Search Results”, In Proceedings of the 13th international conference on World Wide Web, 2004, Pages 658-665.
[17] Anton V. Leouski and W. Bruce Croft., “An Evaluation of Techniques for Clustering Search Results”, In Technical Report IR-76, Department of Computer Science, University of Massachusetts, Amherst, 1996.
[18] David D. Lewis and W. Bruce Croft, “Term Clustering of Syntactic Phrases”, In Proceedings of the Thirteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1990, Pages 385-404.
[19] Stanley Loh, Leandro Krug Wives and José Palazzo M. de Oliveira, “Concept-Based Knowledge discovery in Texts Extracted from the Web”, In ACM SIGKDD Explorations Newsletter Vol. 2, No. 1, 2000, Pages 29-39.
[20] Ramesh Nallapati, Ao Feng, Fuchun Peng and James Allan. “Event Threading within News Topics”, In Proceedings of the Thirteenth ACM conference on Information and knowledge management. 2004. Pages 446-453.
[21] C. Namprempre, P. Szilagyi, A. Duda and D. K. Gifford, “Hypursuit: A Hierarchical Network Search Engine that Exploits Content-Link Hypertext Clustering”, In Proceedings of the seventh ACM conference on Hypertext, 1996.
[22] Open Directory Project, ODP, (2005) http://dmoz.org
[23] James Pitkow and Peter Pirolli, “Life, Death and Lawfulness on the Electronic Frontier”, In Proceedings of the SIGCHI conference on Human factors in computing systems, 1997, Pages 383-390.
[24] M. Sanderson and W. B. Croft, “Deriving Concept Hierarchies from Text”, In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, 1999, Pages 206-213.
[25] Noam Slonim and Naftali Tishby, “Document Clustering using Word Clusters via the Information Bottleneck Method”, In Research and Development in Information Retrieval, 2000, Pages 208-215.
[26] Michael Steinbach, George Karypis and Vipin Kumar, “A Comparison of Document Clustering Techniques”, In KDD Workshop on Text Mining, 2000.
[27] Vivisimo clustering engine, (2005) http://vivisimo.com
[28] Yitong Wang and Masaru Kitsuregawa. “Evaluating Contents-Link Coupled Web Page Clustering for Web Search Results”, In Proceedings of the eleventh international Conference on Information and Knowledge Management, 2002, Pages 499-506.
[29] Ji-Rong Wen, Jian-Yun Nie and Hong-Jiang Zhang, “Query Clustering Using User Logs”, In ACM Transactions on Information Systems, Vol. 22, No. 1, 2002, Pages 59-81.
[30] P. Willett, “Recent Trends in Hierarchic Document Clustering: A Critical Review”, In Information Processing and Management: an International Journal, Vol. 24, No. 5, 1988, Pages 577-597.
[31] Jinxi Xu, W. Bruce Croft. “Improving the Effectiveness of Information Retrieval with Local Context Analysis”. In ACM Transactions on Information Systems, Volume 18 Issue 1. 2000. Pages 79-112.
[32] Yahoo search engine, (2005) http://www.yahoo.com
[33] Oren Zamir and Oren Etzioni, “Grouper: A Dynamic Clustering Interface to Web Search Results”, In Proceedings of the Eighth International World Wide Web Conference, 1999.
[34] Hua-Jun Zeng, Qi-Cai He, Zheng Chen, Wei-Ying Ma and Jinwen Ma, “Learning to Cluster Web Search Results”, In Proceedings of the 27th annual international conference on Research and development in information retrieval, 2004, Pages 210-217.
[35] Ying Zhao and George Karypis, “Evaluation of Hierarchical Clustering Algorithms for Document Datasets”, In Proceedings of the eleventh international conference on Information and knowledge management, 2002, Pages 515-524.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38550
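dc.description.abstract: With the help of search engines, people can now easily obtain large amounts of information from the Web. However, search results often lack proper organization, and users frequently have to examine the results one by one to find relevant information. Past research has tried to solve this problem with document clustering techniques, but the resulting clusters often lack a semantic interpretation.
This thesis proposes a new technique for organizing search results. It first discovers important topic terms in search result snippets; these terms help users quickly browse the important concepts across the whole result set. It then reorganizes the search results by user-defined topic classes. Previous search result organization techniques did not consider user preferences. The proposed method labels the important topics that appear in the search results with topic classes the user is more familiar with, which aids understanding and also reveals the relationships among those topics. The technique gives users a better overall picture of the important topics in the search results, and it can reorganize the results for different users and present them in the way each user prefers. Preliminary experimental results show that the proposed method helps users browse search results quickly and reformulate their queries.
(translated from zh_TW)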
dc.description.abstract: With the rapid growth in the number of Web pages and users, demand for powerful search engines is high. Existing commercial search engines still have shortcomings. The ranked list of result pages returned by a search engine is often long and mixes together concepts that are relevant to the user’s query but hard to identify. Web search results lack good organization, so users must examine the retrieved pages carefully to find the correct ones. Previous research addressed this problem by applying document or term clustering algorithms to search results. However, the clustered results still lack clear explanations.
In this thesis, we develop a new approach to organizing search results that has several advantages over conventional approaches. First, it extracts important topic terms from search result pages to provide a comprehensive overview of the search results. Second, it organizes the extracted topic terms in the manner the user prefers; users’ preferences were seldom taken into account in previous research. The proposed approach extracts important topic terms from Web search result snippets and organizes them under topic classes defined by users. A series of experiments was conducted, and the results show that the approach helps users browse the concepts embedded in the search result pages effectively and locate relevant pages more easily.
(en)
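The abstract describes a two-step pipeline: extract topic terms from search result snippets, then organize them under user-defined topic classes. This record does not describe how the thesis implements either step, so the following Python is only a minimal sketch under stated assumptions: terms are ranked by the number of snippets that contain them, and a term is assigned to the class whose seed words co-occur with it in the most snippets. The function names, the stopword list, the seed-word class definitions, and the sample snippets are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "where"}


def tokenize(text):
    """Lowercase a snippet and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def extract_topic_terms(snippets, query, top_k=3):
    """Rank candidate terms by how many snippets contain them (assumed heuristic)."""
    ignored = STOPWORDS | tokenize(query)
    doc_freq = Counter()
    for snippet in snippets:
        doc_freq.update(tokenize(snippet) - ignored)
    # Sort by descending snippet count, then alphabetically, so ties are stable.
    ranked = sorted(doc_freq, key=lambda term: (-doc_freq[term], term))
    return ranked[:top_k]


def classify_terms(terms, snippets, topic_classes):
    """Assign each term to the class whose seed words share the most snippets with it."""
    tokenized = [tokenize(s) for s in snippets]
    assignment = {}
    for term in terms:
        scores = {
            label: sum(1 for words in tokenized if term in words and words & seeds)
            for label, seeds in topic_classes.items()
        }
        best = max(sorted(scores), key=scores.get)
        assignment[term] = best if scores[best] > 0 else "unclassified"
    return assignment


# Hypothetical snippets for the ambiguous query "jaguar".
snippets = [
    "Jaguar cars unveils new luxury sedan",
    "The jaguar is a big cat native to the Americas",
    "Jaguar sedan price and dealer reviews",
    "Big cat habitat: where the jaguar lives",
]
# Hypothetical user-defined topic classes, each described by a few seed words.
topic_classes = {
    "animals": {"cat", "habitat", "native"},
    "automobiles": {"cars", "dealer", "sedan"},
}

terms = extract_topic_terms(snippets, query="jaguar")
print(terms)
print(classify_terms(terms, snippets, topic_classes))
```

Seed words stand in for whatever description a user would give of a topic class; the table of contents indicates the thesis trains classifiers for the classes automatically, which this sketch does not attempt.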
dc.description.provenance: Made available in DSpace on 2021-06-13T16:37:03Z (GMT). No. of bitstreams: 1
ntu-94-R92725028-1.pdf: 1357488 bytes, checksum: 2535457db6f809ac6b4ceded14f88655 (MD5)
Previous issue date: 2005
(en)
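dc.description.tableofcontents:
Acknowledgements i
Abstract (Chinese) ii
Thesis Abstract iii
Table of Contents iv
List of Tables vi
List of Figures vii
Chapter 1 Introduction 1
Section 1 Research Background 1
1.1.1 Search Services on the World Wide Web 1
1.1.2 Document Clustering Techniques 2
Section 2 Research Motivation and Objectives 3
Section 3 Thesis Organization 3
Chapter 2 Literature Review 5
Section 1 Traditional Clustering Algorithms 5
2.1.1 Hierarchical Clustering Algorithms 7
2.1.2 Partitional Clustering Algorithms 8
Section 2 The Vector Space Model and Document Clustering 9
Section 3 Related Work on Clustering Web Search Results 11
2.3.1 Term-Based Clustering Techniques 13
2.3.2 Hyperlink-Based Clustering Techniques 15
2.3.3 Clustering Techniques Combining Terms and Hyperlinks 18
2.3.4 Drawbacks of Applying Document Clustering to Search Result Reorganization 18
Chapter 3 Problems and Research Methods 20
Section 1 Problem Definition 20
3.1.1 Problem 1: Discovering Important Topics and Clustering Snippets 20
3.1.2 Problem 2: Reorganizing Search Results 21
Section 2 Research Framework and Methods 22
Section 3 System Implementation 23
3.3.1 System Functions 24
3.3.2 System Architecture 24
3.3.3 System Preprocessing 26
Section 4 System Characteristics 26
Section 5 Performance Evaluation and User Study 27
Chapter 4 Topic Discovery and Search Result Clustering 29
Section 1 Topic Extraction 30
Section 2 Topic Selection 30
Section 3 Topic Set Formation 33
Section 4 Implementation Details 35
Section 5 Performance Measurement 38
4.5.1 Experiment 1-1: Performance of the Topic Discovery Techniques 38
4.5.2 Experiment 1-2: Overlap among Different Topic Discovery Techniques 44
4.5.3 Experiment 1-3: Coverage of Search Result Snippets by the Topic Discovery Techniques 46
Chapter 5 Organizing Search Results by User-Defined Topics 47
Section 1 Automatically Training Classifiers for User-Defined Topic Classes 49
Section 2 Automatic Topic Classification 51
Section 3 Implementation Details 54
Section 4 Performance Measurement 55
5.4.1 Experiment 2-1: Performance of the Topic Classification Technique 56
5.4.2 Experiment 2-2: Effect of the Degree of Indirect Relevance on Topic Classification Performance 59
5.4.3 Experiment 2-3: Entropy of Topic Classes 61
5.4.4 Experiment 2-4: Effect of the Degree of Indirect Relevance on Topic Class Entropy 63
5.4.5 Experiment 2-5: Effect of the Degree of Indirect Relevance on Topic Classification Performance and Topic Class Entropy 64
Chapter 6 Conclusions and Future Work 67
Section 1 Conclusions 67
Section 2 Future Work 67
References 69
Appendix 73
Curriculum Vitae 77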
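dc.language.iso: zh-TW
dc.subject: classification techniques (translated from zh_TW)
dc.subject: World Wide Web (translated from zh_TW)
dc.subject: search result organization (translated from zh_TW)
dc.subject: clustering techniques (translated from zh_TW)
dc.subject: Clustering (en)
dc.subject: Classification (en)
dc.subject: World Wide Web (en)
dc.subject: Search Result Organizing (en)
dc.title: A Study on the Automatic Organization of Web Search Results (translated from zh_TW; original title: 網路搜尋結果自動組織之研究)
dc.title: A Study on Organizing Web Search Results (en)
dc.type: Thesis
dc.date.schoolyear: 93-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 莊裕澤 (Yuh-Jzer Joung), 陳信希 (Hsin-Hsi Chen), 陳光華 (Kuang-Hua Chen)
dc.subject.keyword: World Wide Web, search result organization, clustering techniques, classification techniques (translated from zh_TW)
dc.subject.keyword: World Wide Web, Search Result Organizing, Clustering, Classification (en)
dc.relation.page: 77
dc.rights.note: Paid license
dc.date.accepted: 2005-07-06
dc.contributor.author-college: College of Management (translated from zh_TW)
dc.contributor.author-dept: Graduate Institute of Information Management (translated from zh_TW)
Appears in collections: Department of Information Management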

Files in this item:
File: ntu-94-1.pdf (not authorized for public access) · Size: 1.33 MB · Format: Adobe PDF