Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51324
Full metadata record

DC Field: Value [Language]
dc.contributor.advisor: 林智仁 (Chih-Jen Lin)
dc.contributor.author: Wei-Lun Huang [en]
dc.contributor.author: 黃煒倫 [zh_TW]
dc.date.accessioned: 2021-06-15T13:30:33Z
dc.date.available: 2019-03-08
dc.date.copyright: 2016-03-08
dc.date.issued: 2016
dc.date.submitted: 2016-02-03
dc.identifier.citation:
[1] A. Agarwal, O. Chapelle, M. Dudik, and J. Langford. A reliable effective terascale linear learning system. Journal of Machine Learning Research, 15:1111-1133, 2014.
[2] A. Airola, T. Pahikkala, and T. Salakoski. Training linear ranking SVMs in linearithmic time using red-black trees. Pattern Recognition Letters, 32(9):1328-1336, 2011.
[3] B. E. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144-152. ACM Press, 1992.
[4] C. J. C. Burges. From RankNet to LambdaRank to LambdaMART: An overview. Technical Report MSR-TR-2010-82, Microsoft Research, 2010.
[5] Y.-W. Chang, C.-J. Hsieh, K.-W. Chang, M. Ringgaard, and C.-J. Lin. Training and testing low-degree polynomial data mappings via linear SVM. Journal of Machine Learning Research, 11:1471-1490, 2010. URL http://www.csie.ntu.edu.tw/~cjlin/papers/lowpoly_journal.pdf.
[6] O. Chapelle and Y. Chang. Yahoo! learning to rank challenge overview. In JMLR Workshop and Conference Proceedings: Workshop on Yahoo! Learning to Rank Challenge, volume 14, pages 1-24, 2011.
[7] O. Chapelle and S. S. Keerthi. Efficient algorithms for ranking with SVMs. Information Retrieval, 13(3):201-215, 2010.
[8] D. Christensen. Fast algorithms for the calculation of Kendall's tau. Computational Statistics, 20:51-62, 2005.
[9] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20:273-297, 1995.
[10] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: a library for large linear classification. Journal of Machine Learning Research, 9:1871-1874, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.
[11] E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. M. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R. H. Castain, D. J. Daniel, R. L. Graham, and T. S. Woodall. Open MPI: Goals, concept, and design of a next generation MPI implementation. In Proceedings of the 11th European PVM/MPI Users' Group Meeting, pages 97-104, 2004.
[12] R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. In P. J. Bartlett, B. Schölkopf, D. Schuurmans, and A. J. Smola, editors, Advances in Large Margin Classifiers, pages 115-132. MIT Press, 2000.
[13] J. Jin. Mail from Jing Jin (personal communication), 2015.
[14] J. Jin and X. Lin. Efficient parallel algorithms for linear rankSVM on GPU. In C. Hsu, X. Shi, and V. Salapura, editors, Network and Parallel Computing - 11th IFIP WG 10.3 International Conference, NPC 2014, Ilan, Taiwan, September 18-20, 2014. Proceedings, volume 8707 of Lecture Notes in Computer Science, pages 181-194. Springer, 2014. doi: 10.1007/978-3-662-44917-2_16. URL http://dx.doi.org/10.1007/978-3-662-44917-2_16.
[15] T. Joachims. A support vector method for multivariate performance measures. In Proceedings of the Twenty-Second International Conference on Machine Learning (ICML), 2005.
[16] T. Joachims. Training linear SVMs in linear time. In Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006.
[17] S. S. Keerthi and D. DeCoste. A modified finite Newton method for fast solution of large scale linear SVMs. Journal of Machine Learning Research, 6:341-361, 2005. URL http://dblp.uni-trier.de/db/journals/jmlr/jmlr6.html#KeerthiD05.
[18] C.-P. Lee and C.-J. Lin. Large-scale linear rankSVM. Neural Computation, 26(4): 781-817, 2014. URL http://www.csie.ntu.edu.tw/~cjlin/papers/ranksvm/ranksvml2.pdf.
[19] C.-J. Lin, R. C. Weng, and S. S. Keerthi. Trust region Newton method for large-scale logistic regression. In Proceedings of the 24th International Conference on Machine Learning (ICML), 2007. Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear.
[20] C.-J. Lin, R. C. Weng, and S. S. Keerthi. Trust region Newton method for large-scale logistic regression. Journal of Machine Learning Research, 9:627-650, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/logistic.pdf.
[21] C.-Y. Lin, C.-H. Tsai, C.-P. Lee, and C.-J. Lin. Large-scale logistic regression and linear support vector machines using Spark. In Proceedings of the IEEE International Conference on Big Data, pages 519-528, 2014. URL http://www.csie.ntu.edu.tw/~cjlin/papers/spark-liblinear/spark-liblinear.pdf.
[22] O. L. Mangasarian. A finite Newton method for classification. Optimization Methods and Software, 17(5):913-929, 2002.
[23] M. Snir and S. Otto. MPI-the complete reference: the MPI core. MIT Press, Cambridge, MA, USA, 1998.
[24] G.-X. Yuan, C.-H. Ho, and C.-J. Lin. Recent advances of large-scale linear classification. Proceedings of the IEEE, 100(9):2584-2603, 2012. URL http://www.csie.ntu.edu.tw/~cjlin/papers/survey-linear.pdf.
[25] Y. Zhuang, W.-S. Chin, Y.-C. Juan, and C.-J. Lin. Distributed Newton method for regularized logistic regression. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2015.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51324
dc.description.abstract: In learning to rank, linear rankSVM is a useful method for quickly obtaining a baseline model for comparison. Although its parallelization has been investigated and implemented on GPUs, that implementation may not be able to handle large-scale data sets. In this thesis, we propose two parallelization schemes for training L2-loss linear rankSVM with a distributed Newton method. We carefully examine techniques for reducing the communication cost and speeding up the computation, and compare the strengths and weaknesses of the two schemes on dense and sparse data sets. Experiments show that the proposed methods are much faster than single-machine computation on two kinds of data sets: those with far more instances than features, and those with far more features than instances. [zh_TW]
dc.description.abstract: Linear rankSVM is a useful method to quickly produce a baseline model for learning to rank. Although its parallelization has been investigated and implemented on GPUs, it may not handle large-scale data sets. In this thesis, we propose a distributed trust region Newton method for training L2-loss linear rankSVM with two kinds of parallelizations. We carefully discuss techniques for reducing the communication cost and speeding up the computation, and compare both kinds of parallelizations on dense and sparse data sets. Experiments show that our distributed methods are much faster than the single-machine method on two kinds of data sets: one whose number of instances is much larger than its number of features, and one where the opposite holds. [en]
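For context, the objective behind both abstracts is the standard L2-loss linear rankSVM formulation from the cited literature (e.g., Lee and Lin [18]); a minimal sketch, where C is the regularization parameter and P denotes the set of preference pairs within the same query, is:

% L2-loss linear rankSVM objective (standard formulation; see [18]).
% P = { (i, j) : x_i should be ranked above x_j in the same query }.
\min_{\mathbf{w}} \quad \frac{1}{2}\,\mathbf{w}^{\top}\mathbf{w}
  + C \sum_{(i,j) \in P} \max\!\left(0,\; 1 - \mathbf{w}^{\top}(\mathbf{x}_i - \mathbf{x}_j)\right)^{2}

Each Newton step on this objective requires Hessian-vector products over the pairs in P; per the table of contents below, it is exactly this computation that the thesis distributes across machines via query-wise and feature-wise data splits.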
dc.description.provenance: Made available in DSpace on 2021-06-15T13:30:33Z (GMT). No. of bitstreams: 1. ntu-105-R02922041-1.pdf: 6624881 bytes, checksum: bb7aa5ee5b7081159349dbe594623abd (MD5). Previous issue date: 2016 [en]
dc.description.tableofcontents:
Oral Examination Committee Certification ... i
Chinese Abstract ... ii
ABSTRACT ... iii
LIST OF FIGURES ... vi
LIST OF TABLES ... vii
CHAPTER
I. Introduction ... 1
II. Efficient Computation of Linear RankSVM and Distributed Trust Region Newton Methods ... 6
  2.1 Trust Region Newton Methods (TRON) ... 6
  2.2 Efficient Evaluation for Function/Gradient and Hessian-vector Products ... 8
  2.3 Query-wise Function/Gradient and Matrix-vector Products Evaluation ... 12
  2.4 Query-wise and Feature-wise Data Splits ... 14
    2.4.1 Query-wise Split ... 15
    2.4.2 Feature-wise Split ... 16
    2.4.3 Comparison between Query-wise and Feature-wise Splits ... 17
  2.5 Implementation Techniques ... 18
III. Comparison with Related Works ... 21
  3.1 GPU Linear rankSVM ... 21
  3.2 DRANKSVM ... 23
  3.3 Distributed TRON on Logistic Regression ... 23
    3.3.1 Logistic Regression ... 24
    3.3.2 Instance-wise and Feature-wise Data Split ... 24
    3.3.3 Instance-wise Split ... 25
    3.3.4 Feature-wise Split ... 25
    3.3.5 Comparison Between Distributed Logistic Regression and Linear RankSVM ... 26
IV. Experiments ... 28
  4.1 Experiment Settings ... 28
  4.2 Low-Degree Polynomial Expansion on Sparse Data Sets ... 30
    4.2.1 Implementation Details ... 31
    4.2.2 Comparison Between Non-Expanded and Expanded Sparse Data Sets on Training Time and Test Accuracy ... 32
  4.3 Comparison Between TreeTron-qw-noacc and TreeTron-qw on Training Time ... 32
  4.4 Comparison Between TreeTron-qw, TreeTron-fw, and TreeTron on Function Values ... 34
  4.5 Comparison Between TreeTron-qw, TreeTron-fw, and TreeTron on Test Pairwise Accuracy ... 38
  4.6 Speedup ... 39
V. Conclusions ... 45
BIBLIOGRAPHY ... 46
dc.language.iso: en
dc.subject: 大規模學習 (large-scale learning) [zh_TW]
dc.subject: 排序支持向量機 (ranking support vector machines) [zh_TW]
dc.subject: 分散式牛頓法 (distributed Newton method) [zh_TW]
dc.subject: Distributed Newton method [en]
dc.subject: Learning to rank [en]
dc.subject: Ranking support vector machines [en]
dc.subject: Large-scale learning [en]
dc.subject: Linear model [en]
dc.title: 大規模線性排序支持向量機在分散式環境下之分析實作 [zh_TW]
dc.title: Analysis and Implementation of Large-scale Linear RankSVM in Distributed Environments [en]
dc.type: Thesis
dc.date.schoolyear: 104-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 林軒田 (Hsuan-Tien Lin), 李育杰 (Yuh-Jye Lee)
dc.subject.keyword: 大規模學習, 排序支持向量機, 分散式牛頓法 [zh_TW]
dc.subject.keyword: Learning to rank, Ranking support vector machines, Large-scale learning, Linear model, Distributed Newton method [en]
dc.relation.page: 48
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2016-02-03
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: ntu-105-1.pdf
Access: Restricted (not authorized for public access)
Size: 6.47 MB
Format: Adobe PDF


All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
