An Unsupervised Ranking Consistency Approach based on Knowledge Base and Search Results

Jian-De Jiang; 江建德

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50302

Title:	An Unsupervised Ranking Consistency Approach based on Knowledge Base and Search Results (original title: 以非監督式方法利用知識庫與搜尋結果提升網頁搜尋排序一致性)
Author:	Jian-De Jiang (江建德)
Advisor:	Pu-Jen Cheng (鄭卜壬)
Keywords:	Web Search, Ranking Consistency, Query Intent, Unsupervised Approach, Knowledge Base, Topical Cluster, Query Intent Template
Year of publication:	2016
Degree:	Master's
Abstract:	Relevance ranking is one of the most important problems in web search systems such as Google, Yahoo!, and Bing. Most conventional approaches optimize the ranking model for each query separately. One previous work proposed a two-stage supervised approach that improves relevance ranking by enhancing ranking consistency across queries with similar search intents. However, that work leaves two crucial problems open. First, it uses pairwise learning to rank to learn consistency, which relies on a large-scale query log that only a few mature web search systems have; new or still-developing search systems have no such log and must rely on unsupervised methods to improve relevance ranking. Second, it represents query intents by entities in a knowledge base. Entities alone cannot fully represent query intents, because queries usually carry more specific information: for example, the query "Kobe Bryant family" asks about Kobe Bryant's family rather than about Kobe Bryant himself. In this work, we propose a two-phase unsupervised approach that uses a knowledge base and search results to improve ranking consistency and relevance ranking. The first phase extracts ranking-consistency scores from the search results, and the second phase re-ranks the search results by weighing uniqueness against consistency. Furthermore, we add query templates to the query intents so that the intents can be resolved more clearly. To the best of our knowledge, our work is the first to use an unsupervised ranking-consistency method to improve relevance ranking. We conducted extensive experiments using Freebase and search results from the Yahoo! search engine, and the results demonstrate that our approach significantly improves both ranking consistency and relevance ranking.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50302
DOI:	10.6342/NTU201600837
Full-text license:	Paid license
Appears in collections:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-105-1.pdf (not currently authorized for public access)	3.43 MB	Adobe PDF

