Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 資訊工程學系
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/61115
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor劉邦鋒(Pangfeng Liu)
dc.contributor.authorTsung-Han Lien
dc.contributor.author李宗翰zh_TW
dc.date.accessioned2021-06-16T10:47:22Z-
dc.date.available2016-08-16
dc.date.copyright2013-08-16
dc.date.issued2013
dc.date.submitted2013-08-12
dc.identifier.citation[1] Hbase. http://hbase.apache.org/.
[2] Jeffrey Dean and Sanjay Ghemawat. Mapreduce: simplified data processing on large
clusters. In Proceedings of the 6th conference on Symposium on Opearting Systems
Design and Implementation.
[3] Leslie G. Valiant. A bridging model for parallel computation. Commun. ACM,
33(8):103–111, August 1990.
[4] Storm. http://storm-project.net/.
[5] Grzegorz Malewicz, Matthew H. Austern, Aart J.C Bik, James C. Dehnert, Ilan
Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: a system for large-scale graph
processing. In Proceedings of the 2010 international conference on Management of
data.
[6] Hama. http://hama.apache.org/.
[7] Giraph. http://giraph.apache.org/.
[8] Wei Dong, Charikar Moses, and Kai Li. Efficient k-nearest neighbor graph con-
struction for generic similarity measures. In Proceedings of the 20th international
conference on World wide web.
[9] Kyle H. Ambert and Aaron M. Cohen. k-information gain scaled nearest neigh-
bors: A novel approach to classifying protein-protein interaction-related documents.
IEEE/ACM Trans. Comput. Biol. Bioinformatics.
[10] Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank
citation ranking: Bringing order to the web. Technical report, 1999.
[11] G. Karypis and V. Kumar. Multilevel algorithms for multi-constraint graph parti-
tioning. In Supercomputing, 1998. SC98. IEEE/ACM Conference on.
[12] Li-Yung Ho, Jan-Jan Wu, and Pangfeng Liu. Distributed graph database for large-
scale social computing. In Cloud Computing (CLOUD), 2012 IEEE 5th International
Conference on, 2012.
[13] Wei Chen, Yifei Yuan, and Li Zhang. Scalable influence maximization in social net-
works under the linear threshold model. In Proceedings of the 2010 IEEE International
Conference on Data Mining.
[14] David Kempe, Jon Kleinberg, and
′
Eva Tardos. Maximizing the spread of influence
through a social network. In Proceedings of the ninth ACM SIGKDD international
conference on Knowledge discovery and data mining.
[15] Xiaohong Wang, Aaron Smalter, Jun Huan, and Gerald H. Lushington. G-hash:
towards fast kernel-based similarity search in large graph databases. In Proceedings
of the 12th International Conference on Extending Database Technology: Advances in
Database Technology, EDBT ’09, pages 472–480, New York, NY, USA, 2009. ACM.
[16] Luke Wang, Yanghua Xiao, and Haixun Wang. How to partition a billion-node graph.
Technical report, Microsoft Research Asia, 2013.
[17] Michele Coscia, Giulio Rossetti, Fosca Giannotti, and Dino Pedreschi. Demon: a
local-first discovery method for overlapping communities. In Proceedings of the 18th
ACM SIGKDD international conference on Knowledge discovery and data mining,
KDD ’12, pages 615–623, New York, NY, USA, 2012. ACM.
[18] Alan Mislove, Massimiliano Marcon, Krishna P. Gummadi, Peter Druschel, and
Bobby Bhattacharjee. Measurement and Analysis of Online Social Networks. In
Proceedings of the 5th ACM/Usenix Internet Measurement Conference (IMC’07).
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/61115-
dc.description.abstractWe introduce Kylin, an efficient and scalable graph data processing system. Kylin is based
on Bulk Synchronous Parallel (BSP) model to process graph data. Although there have
been some BSP-based graph processing systems, Kylin is different from these systems
in two-fold. First, Kylin cooperates with HBase to achieve scalable data manipulation.
Second, We propose three techniques to optimize the performance of Kylin. The proposed
techniques are pull messaging, lazy vertex loading and vertex-weighted partitioning. We
demonstrate Kylin outperforms other BSP-based systems, i.e. Hama and Giraph, in the
experiments.
en
dc.description.provenanceMade available in DSpace on 2021-06-16T10:47:22Z (GMT). No. of bitstreams: 1
ntu-102-R00922074-1.pdf: 536483 bytes, checksum: 4cc5c1aaade490bd6fb6f82fd7ca8ba0 (MD5)
Previous issue date: 2013
en
dc.description.tableofcontentsContents
Certification I
Acknowledgement II
Chinese Abstract III
Abstract IV
1 Introduction 1
2 Related Works 4
3 Architecture 6
3.1 Partition Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 Query Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.3 Data Graph and Data Loader . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.4 Query Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4 Optimizations 9
4.1 Pull Messaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.2 Lazy Vertex Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4.3 Vertex-weighted Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . 11
5 Implementation 14
5.1 Data Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
5.2 Communication and Synchronization . . . . . . . . . . . . . . . . . . . . . . 15
5.3 Vertex Program Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
6 Benchmark Suite 17
6.1 Max Value Propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
6.2 Single Source Shortest Path . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
6.3 Bipartite Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
6.4 Influence Spreading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
6.5 PageRank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
6.6 N-steps neighbors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
6.7 Label propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
7 Experiments 20
7.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
7.2 Pull model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
7.3 Lazy Vertex Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
7.4 Partitioning strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
7.5 Overall performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
8 Conclusion 25
Bibliography 26
dc.language.isoen
dc.subject動態載入zh_TW
dc.subject圖資料切割zh_TW
dc.subject負載平衡zh_TW
dc.subject圖資料處理zh_TW
dc.subjectgraph data processingen
dc.subjectgraph data partitionen
dc.subjectload balancingen
dc.subjectpull messagingen
dc.subjectdynamic loadingen
dc.title具擴展性的高效能分散式圖資料處理系統zh_TW
dc.titleKylin: An Efficient and Scalable Graph Data Processing
System
en
dc.typeThesis
dc.date.schoolyear101-2
dc.description.degree碩士
dc.contributor.oralexamcommittee吳真貞(Jan-Jan Wu),洪士灝(Shih-Hao Hung)
dc.subject.keyword圖資料處理,圖資料切割,負載平衡,動態載入,zh_TW
dc.subject.keywordgraph data processing,graph data partition,load balancing,pull messaging,dynamic loading,en
dc.relation.page27
dc.rights.note有償授權
dc.date.accepted2013-08-12
dc.contributor.author-college電機資訊學院zh_TW
dc.contributor.author-dept資訊工程學研究所zh_TW
顯示於系所單位:資訊工程學系

文件中的檔案:
檔案 大小格式 
ntu-102-1.pdf
  未授權公開取用
523.91 kBAdobe PDF
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved