Ranking Documents by Query Independent and Query Dependent Models

Hsuan-Yu Lin; 林軒宇

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28029

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳信希(Hsin-Hsi Chen)
dc.contributor.author	Hsuan-Yu Lin	en
dc.contributor.author	林軒宇	zh_TW
dc.date.accessioned	2021-06-12T18:34:25Z	-
dc.date.available	2012-08-18
dc.date.copyright	2011-08-18
dc.date.issued	2011
dc.date.submitted	2011-08-08
dc.identifier.citation	Balasubramanian, N. & Allan, J., 2010. Learning to select rankers. In Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval. pp. 855–856. Bian, J. et al., 2010. Ranking specialization for web search: a divide-and-conquer approach by using topical RankSVM. In Proceedings of the 19th international conference on World wide web. pp. 131–140. Dwork, C. et al., 2001. Rank aggregation methods for the Web. In Proceedings of the 10th international conference on World Wide Web. pp. 613–622. Farah, M. & Vanderpooten, D., 2007. An outranking approach for rank aggregation in information retrieval. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 591–598. Freund, Y., Iyer, R., Schapire, R.E. & Singer, Y., 2003. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research. pp. 933–969. Geng, X., Liu, T.-Y., Qin, T., Arnold, A., Li, H. & Shum, H.-Y., 2008. Query dependent ranking using K-nearest neighbor. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval. pp. 115–122. Joachims, T., 2002. Optimizing search engines using clickthrough data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. pp. 133–142. Lee, J.H., 1997. Analyses of multiple evidence combination. In Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 267–276. Lillis, D. et al., 2006. ProbFuse: a probabilistic approach to data fusion. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 139–146. Liu, T.-Y. et al., 2007. LETOR: Benchmark dataset for research on learning to rank for information retrieval. In Proceedings of SIGIR 2007 Workshop on Learning to Rank for Information Retrieval. Marden, J.I., 1995. Analyzing and Modeling Rank Data, Chapman & Hall. Peng, J., Macdonald, C. & Ounis, I., 2010. Learning to Select a Ranking Function. In The 32nd European Conference on Information Retrieval. pp. 114–126. Renda, M.E. & Straccia, U., 2003. Web metasearch: rank vs. score based rank aggregation methods. In Proceedings of the 2003 ACM symposium on Applied computing. pp. 841–846. Shaw, J.A. et al., 1994. Combination of Multiple Searches. In The Second Text REtrieval Conference (TREC-2). pp. 243–252. Tsai, M.-F., Liu, T.-Y., et al., 2007. FRank: a ranking method with fidelity loss. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 383–390. Xia, F. et al., 2008. Listwise approach to learning to rank: theory and algorithm. In Proceedings of the 25th international conference on Machine learning. pp. 1192–1199. Xu, J. & Li, H., 2007. AdaRank: a boosting algorithm for information retrieval. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 391–398.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28029	-
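dc.description.abstract	Learning to rank techniques are now widely used in information retrieval and web search systems. Conventional approaches apply a single ranking model to the documents of every query; because all queries share the same model, we call this a query independent ranking model. A model built this way does not fully exploit the characteristics of the query itself, so we propose algorithms that adapt the ranking model to the characteristics of each query, yielding query dependent ranking models. We study the problem within a single framework. In this framework, a clustering algorithm first partitions the training set into subsets, and a separate ranking model, which we call a local model, is trained on each subset. We then propose four query dependent ranking models built on these local models; they combine the rankings produced by the different local models to determine the final document ranking. The framework focuses on three issues: query representation, data clustering, and the ranking function. Specifically, we study different query representations that convert a query into a vector or another representation, two main clustering algorithms that partition the training data for learning local models, and ranking functions that merge the scores of the local models' rankings in different ways. Experimental results show that our methods outperform RankSVM and other query independent ranking methods, with a significant improvement especially for the top-ranked documents. We also define the "difficulty" of a query and compare our methods with RankSVM across difficulty levels; this analysis shows that our methods perform particularly well on harder or more unusual queries, which is the main reason for their overall advantage over RankSVM.	zh_TW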
dc.description.provenance	Made available in DSpace on 2021-06-12T18:34:25Z (GMT). No. of bitstreams: 1 ntu-100-R98922137-1.pdf: 1798902 bytes, checksum: 7e04f6378b4b658d338bd3a2e474728f (MD5) Previous issue date: 2011	en
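dc.description.tableofcontents	Abstract I List of Figures V List of Tables VII Chapter 1 Introduction 1 1.1 Research Overview 1 1.2 Research Objectives 3 Chapter 2 Related Work 4 2.1 Machine Learning Methods 4 2.1.1 (Joachims 2002) 5 2.2 Rank Aggregation 6 2.2.1 (Farah & Vanderpooten 2007) 7 2.3 Query Dependent Models 8 2.3.1 (Geng et al. 2008) 8 2.3.2 (Peng et al. 2010) 10 2.3.3 (Bian et al. 2010) 11 Chapter 3 Methods 12 3.1 Three Issues 12 3.1.1 Framework Diagram 13 3.1.2 Query Representation and Cluster Representation 14 3.1.3 Clustering Algorithms and Cluster Models 14 3.1.4 Ranking Function 16 3.2 Feasibility Validation 17 3.2.1 Experimental Dataset 17 3.2.2 Validation Method I: Relevant Document Vectors (RelevantOnly) 17 3.2.3 Validation Method II: Performance Upper Bound (UpperBound) 18 3.2.4 Results and Discussion 18 3.3 Method Design 21 3.3.1 Method I: Heuristic 21 3.3.2 Method II: Solution Space (SolutionSpace) 23 3.3.3 Method III: Ranker Selection (SelectRanker) 24 3.3.4 Method IV: Vector Transformation (Transform) 25 3.3.5 Time Complexity of the Methods 27 Chapter 4 Experimental Results and Analysis 29 4.1 LETOR 29 4.2 Evaluation Measures 31 4.3 Analysis of Experimental Results 32 4.3.1 Results I: Heuristic 33 4.3.2 Results II: Solution Space 41 4.3.3 Results III: Ranker Selection 44 4.3.4 Results IV: Vector Transformation 47 4.3.5 Summary of Results and Method Comparison 50 4.4 Query Analysis 55 4.4.1 Query Difficulty Classification 55 4.4.2 Method Performance across Query Difficulty Levels 56 4.4.3 Per-Query Method Comparison 59 Chapter 5 Conclusion and Future Work 65 5.1 Conclusion 65 5.2 Future Work 65 References 66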
dc.language.iso	zh-TW
dc.subject	區域排序模型	zh_TW
dc.subject	排序學習	zh_TW
dc.subject	效能評估	zh_TW
dc.subject	查詢相關排序	zh_TW
dc.subject	local models	en
dc.subject	learning to rank	en
dc.subject	query dependent ranking	en
dc.subject	evaluation	en
dc.title	查詢獨立與查詢相關排名機制之研究	zh_TW
dc.title	Ranking Documents by Query Independent and Query Dependent Models	en
dc.type	Thesis
dc.date.schoolyear	99-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	鄭卜壬(Pu-Jen Cheng),盧文祥(Wen-Hsiang Lu)
dc.subject.keyword	排序學習,區域排序模型,查詢相關排序,效能評估,	zh_TW
dc.subject.keyword	learning to rank,local models,query dependent ranking,evaluation,	en
dc.relation.page	67
dc.rights.note	Paid license
dc.date.accepted	2011-08-08
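dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering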

Files in This Item:

File	Size	Format
ntu-100-1.pdf (not authorized for public access)	1.76 MB	Adobe PDF

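Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.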