NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46983
Full metadata record (DC field: value [language])
dc.contributor.advisor: 林軒田
dc.contributor.author: Ken-Yi Lin [en]
dc.contributor.author: 林庚毅 [zh_TW]
dc.date.accessioned: 2021-06-15T05:44:37Z
dc.date.available: 2011-08-20
dc.date.copyright: 2010-08-20
dc.date.issued: 2010
dc.date.submitted: 2010-08-19
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46983
dc.description.abstract: Learning to rank has become a popular research topic in several areas, including information retrieval and machine learning. Pair-wise ranking, which learns the order preference between every two examples, is one typical approach to the ranking problem. Among pair-wise methods, RankSVM is a widely used machine learning algorithm that has been applied successfully to ranking in previous work. However, RankSVM suffers from a critical problem: long training time caused by the huge number of pairs.

In this thesis, we propose a data selection technique, Pruned RankSVM, that selects the most informative pairs before training. By training on a subset of the pairs instead of all of them, we can handle large-scale data sets efficiently. Our experiments show that the performance of Pruned RankSVM is overall comparable with that of RankSVM while using significantly fewer pairs. To demonstrate the efficiency of Pruned RankSVM, we also compare it with a point-wise ranking algorithm, support vector regression; experimental results show that Pruned RankSVM outperforms support vector regression on most data sets. [en]
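The pair-wise reduction and the pruning idea the abstract describes can be sketched as follows. This is a minimal illustration only: the `make_pairs` helper and the label-gap selection criterion are assumptions made for demonstration, not the thesis implementation.

```python
def make_pairs(X, y, closest_only=False):
    """Turn ranked examples into difference vectors for a binary SVM.

    Every pair (i, j) with y[i] > y[j] expresses one order preference.
    With closest_only=True, keep only pairs whose label gap is minimal,
    discarding far-apart pairs before training, in the spirit of
    selecting informative pairs.
    """
    pairs = [(y[i] - y[j], i, j)
             for i in range(len(y))
             for j in range(len(y))
             if y[i] > y[j]]
    if closest_only and pairs:
        min_gap = min(gap for gap, _, _ in pairs)
        pairs = [p for p in pairs if p[0] == min_gap]
    # Each kept pair becomes one difference vector x_i - x_j with a
    # positive label; a standard binary SVM trained on these differences
    # acts as a pair-wise ranker.
    return [[a - b for a, b in zip(X[i], X[j])] for _, i, j in pairs]


# Three examples with ranks 1 < 2 < 3: all 3 ordered pairs, versus the
# 2 adjacent-rank pairs kept by the selection step.
X = [[1.0, 0.0], [2.0, 0.0], [3.0, 1.0]]
y = [1, 2, 3]
all_pairs = make_pairs(X, y)                  # 3 pairs
pruned = make_pairs(X, y, closest_only=True)  # 2 pairs (label gap 1)
```

Because the number of pairs grows quadratically with the number of examples per query, even a simple gap-based filter like this shrinks the training set substantially on data with many rank levels.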
dc.description.provenance: Made available in DSpace on 2021-06-15T05:44:37Z (GMT). No. of bitstreams: 1; ntu-99-R97922117-1.pdf: 2963993 bytes, checksum: 5f8689b37c0493ed05b00db62b97016e (MD5). Previous issue date: 2010 [en]
dc.description.tableofcontents:
Acknowledgements
Chinese Abstract
Abstract
1 Introduction
  1.1 Ranking Problem
  1.2 Preliminaries
  1.3 Related Work
  1.4 Contribution
2 Pair-wise Ranking
  2.1 Existing Pair-wise Ranking Algorithms
    2.1.1 RankSVM
    2.1.2 RankLR
  2.2 Problem of Pair-wise Ranking Algorithms
  2.3 Why RankSVM
3 Pruned RankSVM with Closest Pairs
  3.1 Informative Pairs in RankSVM
    3.1.1 Support Vector Distribution
    3.1.2 Important Pairs during Optimization
  3.2 Pruned RankSVM (closest)
    3.2.1 Experiment on Artificial Data
    3.2.2 Experiment on Real-world Data
  3.3 Extended Approach with Numerical Information
    3.3.1 Order-based versus Label-based
    3.3.2 Emphasis on Label-closest Pairs
    3.3.3 Emphasis on Label-farthest Pairs
    3.3.4 Comparison with Pruned RankSVM (closest)
4 Pruned RankSVM with Mixed Pairs
  4.1 Pruned RankSVM (mixed)
    4.1.1 Pruned RankSVM (mixed) on Artificial Data
    4.1.2 Pruned RankSVM (mixed) on Real-world Data
  4.2 Comparison between Closest and Mixed Technique
  4.3 Coupling Pruned RankSVM (mixed) with Shrinking
5 Conclusion
Bibliography
dc.language.iso: en
dc.subject: 支持向量機排序法 (RankSVM) [zh_TW]
dc.subject: 排序學習 (learning to rank) [zh_TW]
dc.subject: 排序問題 (ranking problem) [zh_TW]
dc.subject: 資料選擇技術 (data selection techniques) [zh_TW]
dc.subject: data selection [en]
dc.subject: RankSVM [en]
dc.subject: pair-wise ranking [en]
dc.subject: learning to rank [en]
dc.title: 以資料選擇技術幫助大規模支持向量機排序 [zh_TW]
dc.title: Data selection techniques for large-scale RankSVM [en]
dc.type: Thesis
dc.date.schoolyear: 98-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 徐宏民, 李育杰, 王鈺強
dc.subject.keyword: 排序問題, 排序學習, 支持向量機排序法, 資料選擇技術 [zh_TW]
dc.subject.keyword: learning to rank, pair-wise ranking, RankSVM, data selection [en]
dc.relation.page: 40
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2010-08-19
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: ntu-99-1.pdf (restricted access, not publicly available) | Size: 2.89 MB | Format: Adobe PDF
Items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.
