社群網路生成及最大團問題於MapReduce有效率的演算法

Yi-Chen Lo; 羅亦辰

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/22365

Full metadata record

???org.dspace.app.webui.jsptag.ItemTag.dcfield???	Value	Language
dc.contributor.advisor	林守德(Shou-De Lin)
dc.contributor.author	Yi-Chen Lo	en
dc.contributor.author	羅亦辰	zh_TW
dc.date.accessioned	2021-06-08T04:16:17Z	-
dc.date.copyright	2011-08-23
dc.date.issued	2011
dc.date.submitted	2011-08-18
dc.identifier.citation	[1] J. Dean and S. Ghemawat, “MapReduce: Simplified data processing on large clusters,” OSDI, 2004. [2] “Hadoop information,” http://hadoop.apache.org/ [3] P. Erdo˝s and A. Re´nyi, Bull. Inst. Int. Stat. 38, 343 (1961). [4] D. Watts and S. Strogatz (1998): Collective dynamics of small-world networks. Nature, 363:202–204. [5] R. Albert; A.-L. Barabási (2002). “Statistical mechanics of complex networks”. Reviews of Modern Physics 74: 47–97. [6] Bron, Coen; Kerbosch, Joep (1973), “Algorithm 457: finding all cliques of an undirected graph”, Commun. ACM (ACM) 16 (9): 575–577 [7] U. Kang, C. E. Tsourakakis, and C. Faloutsos. Pegasus: A peta-scale graph mining system. Data Mining, IEEE International Conference on, 0:229–238, 2009. [8] Hung-Che Lai, Exploiting and Evaluating MapReduce for Large-scale Graph Mining [9] Hadoop2010: XXL Graph Algorithms, Sergei Vassilvitskii, Yahoo! Labs. [10] Jonathan Cohen, Graph Twiddling in a MapReduce World. Comp. in Science & Engineering, July/August 2009, 29-41. [11] Bin Wu, Shengqi Yang, Haizhou Zhao, Bai Wang, “A Distributed Algorithm to Enumerate All Maximal Cliques in MapReduce,” fcst, pp.45-51, 2009 Fourth International Conference on Frontier of Computer Science and Technology, 2009 [12] The Parallel Boost Graph Library, http://osl.iu.edu/research/pbgl/ [13] Lin Peng, Wang Zebing, Guo Ming, “A maximum clique algorithm based on MapReduce”, 2010 3rd International Conference on Advanced Computer Theory and Engineering(ICACTE) [14] Matthew C. Schmidt, Nagiza F. Samatova, Kevin Thomas, and Byung-Hoon Park. “A scalable, parallel algorithm for maximal clique enumeration”. Journal of Parallel and Distributed Computing, 69(4):417–428, 2009. [15] Andy Yoo, Keith W. Henderson. Parallel Generation of Massive Scale-Free Graphs. CoRR, 2010. [16] Cloud9: A MapReduce Library for Hadoop, http://www.umiacs.umd.edu/~jimmylin/cloud9/docs/content/patterns.html [17] Jimmy Lin and Chris Dyer, “Data-Intensive Text Processing with MapReduce” [18] Snir, Marc; Otto, Steve; Huss-Lederman, Steven; Walker, David; Dongarra, Jack (1995) MPI: The Complete Reference [19] Compute Unified Device Architecture Programming Guide, C Nvidia - NVIDIA: Santa Clara, CA, 2007 [20] D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-mat: Arecursive model for graph mining. In SDM, 2004.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/22365	-
dc.description.abstract	社群網路生成在社群網路分析領域是一個重要的問題。人工產生的隨機圖被用來模擬真實世界社群網路已經行之有年。隨著真實世界社群網路如Twitter, Facebook等快速膨脹，我們需要可以產生大規模隨機圖的方法。Google提出的MapReduce[1]運算架構提供了我們將社群網路生成演算法平行化的需要及儲存大型網路檔案的分散式檔案系統。這篇論文提出將BA model[5]當中preferential attachment機制用到的degree由實際值代換成期望值，藉此讓BA model的生成過程可以被獨立地分割利於平行化。我們以MapReduce做實驗的結果顯示我們設計的平行演算法有良好的可擴放性並且產生出來的圖保有BA model原本的特性如power-law degree分布。這篇論文也將由Sergei Vassilvitskii在XXL Graph Algorithms[9]提出數三角形問題中把高degree的點分開處理的想法應用到最大團問題的MapReduce演算法。實驗的結果顯示這個方法讓計算效率有明顯的改進。	zh_TW
dc.description.abstract	Social network generation is an important problem in social network analysis. Artificially generated random graphs are used to model real world social networks for a long time. With the expanding size of real world social networks such as Twitter, Facebook, we need some methods to generate large scale random graphs. MapReduce[1] proposed by Google is the right framework to provide parallelism on social network generation algorithm and distributed file system to store large output graph data. In our paper we propose a method to replace the exact degree, which is used as the attachment preference in original BA model[5], with the expected value of degree. By this substitution BA model generation algorithm can be divided into several independent tasks to fit in parallel computing. And our experiment results on MapReduce shows that after this substitution, graphs generated by BA model keep their original properties such as power-law degree distribution with good scalability. In this paper we also apply the idea of differentiate high degree vertices in triangle counting algorithm proposed in XXL Graph Algorithms by Sergei Vassilvitskii[9], to finding maximal clique algorithm in MapReduce. As a result of our experiment, the computation efficiency is obviously improved.	en
dc.description.provenance	Made available in DSpace on 2021-06-08T04:16:17Z (GMT). No. of bitstreams: 1 ntu-100-R98922012-1.pdf: 514025 bytes, checksum: 2ba6e07b4f41b86dd25471a703b33aa9 (MD5) Previous issue date: 2011	en
dc.description.tableofcontents	誌謝 ...................................................................................................................................i 中文摘要 .......................................................................................................................... ii ABSTRACT .................................................................................................................... iii CONTENTS .....................................................................................................................iv LIST OF FIGURES ..........................................................................................................vi Chapter 1 Introduction ................................................................................................ 1 1.1 Introduction to social network generation ............................................ 1 1.2 Introduction to finding maximal clique ................................................. 2 1.3 Thesis organization .................................................................................. 3 Chapter 2 Related Work .............................................................................................. 4 Chapter 3 Backgrounds .............................................................................................. 6 3.1 MapReduce and Hadoop ......................................................................... 6 3.2 Social Network Generation Models ........................................................ 7 3.2.1 Erdős–Rényi model (ER model) ..................................................................... 7 3.2.2 Watts and Strogatz model (WS model) ........................................................... 7 3.2.3 Barabási–Albert model(BA model) ................................................................ 8 Chapter 4 Methodology ............................................................................................ 10 4.1 Social network generation ..................................................................... 10 4.1.1 ER and WS model Generation in MapReduce ............................................. 10 4.1.2 Generating BA model in MapReduce ........................................................... 11 4.2 Finding Maximal Clique ....................................................................... 20 4.2.1 Counting Triangle ......................................................................................... 20 4.2.2 Finding maximal clique ................................................................................ 21 Chapter 5 Experiment ............................................................................................... 23 5.1 BA model generation.............................................................................. 23 5.1.1 Testing the property of power-law degree distribution of BA model generation ..................................................................................................... 23 5.1.2 Testing scalability of BA model generation .................................................. 24 5.2 Experiment on finding maximal clique algorithm .............................. 27 Chapter 6 Conclusion ............................................................................................... 29 REFERENCE .................................................................................................................. 30
dc.language.iso	en
dc.subject	最大團問題	zh_TW
dc.subject	MapReduce	zh_TW
dc.subject	平行運算	zh_TW
dc.subject	圖探勘	zh_TW
dc.subject	BA模型	zh_TW
dc.subject	maximal clique problem	en
dc.subject	parallel computing	en
dc.subject	MapReduce	en
dc.subject	graph mining	en
dc.subject	BA model	en
dc.title	社群網路生成及最大團問題於MapReduce有效率的演算法	zh_TW
dc.title	Efficient MapReduce-based Algorithm for Social Network Generation and Finding Maximal Clique	en
dc.type	Thesis
dc.date.schoolyear	99-2
dc.description.degree	碩士
dc.contributor.oralexamcommittee	劉邦鋒(Pang-Feng Liu),洪士灝(Shih-Hao Hung),蘇雅韻(Ya-Yunn Su)
dc.subject.keyword	MapReduce,平行運算,圖探勘,BA模型,最大團問題,	zh_TW
dc.subject.keyword	MapReduce,parallel computing,graph mining,BA model,maximal clique problem,	en
dc.relation.page	28
dc.rights.note	未授權
dc.date.accepted	2011-08-18
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
Appears in Collections:	資訊工程學系

Files in This Item:

File	Size	Format
ntu-100-1.pdf Restricted Access	501.98 kB	Adobe PDF

Show simple item record

DSpace JSPUI

DSpace preserves and enables easy and open access to all types of digital content including text, images, moving images, mpegs and data sets