Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 資訊網路與多媒體研究所
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/61050
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor周承復
dc.contributor.authorWei-Hao Linen
dc.contributor.author林煒晧zh_TW
dc.date.accessioned2021-06-16T10:43:32Z-
dc.date.available2014-02-26
dc.date.copyright2014-02-26
dc.date.issued2013
dc.date.submitted2013-08-13
dc.identifier.citation[1] Mohammad Al-Fares, Alexander Loukissas, and Amin Vahdat. A scalable, commodity
data center network architecture. In Proceedings of the ACM SIGCOMM
2008 conference on Data communication, SIGCOMM ’08, pages 63–74, New York,
NY, USA, 2008. ACM.
[2] Facebook reports first quarter 2013 results. http://investor.fb.com/releasedetail.
cfm?ReleaseID=761090.
[3] Celebrating #twitter7. https://blog.twitter.com/2013/celebrating-twitter7.
[4] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz,
Andrew Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica, and
Matei Zaharia. Above the clouds: A berkeley view of cloud computing. Technical
Report UCB/EECS-2009-28, EECS Department, University of California, Berkeley,
Feb 2009.
[5] M. R. Garey, D. S. Johnson, and L. Stockmeyer. Some simplified np-complete problems.
In Proceedings of the sixth annual ACM symposium on Theory of computing,
STOC ’74, pages 47–63, New York, NY, USA, 1974. ACM.
[6] Konstantin Andreev and Harald Racke. Balanced graph partitioning. In Proceedings
of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures,
SPAA ’04, pages 120–124, New York, NY, USA, 2004. ACM.
[7] FabrA-cio Benevenuto, Tiago Rodrigues, Meeyoung Cha, and VirgA-lio A. F.
Almeida. Characterizing user behavior in online social networks. In Anja Feldmann
and Laurent Mathy, editors, Proceedings of the 9th ACM SIGCOMM Conference
on Internet Measurement 2009, Chicago, Illinois, USA, November 4-6, 2009, pages
49–62. ACM, 2009.
[8] Fabian Schneider, Anja Feldmann, Balachander Krishnamurthy, and Walter Willinger.
Understanding online social network usage from a network perspective. In
Internet Measurement Conference, pages 35–48. ACM, 2009.
[9] Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang, Nikos Laoutaris,
Parminder Chhabra, and Pablo Rodriguez. The little engine(s) that could: scaling
online social networks. In Proceedings of the ACM SIGCOMM 2010 conference,
SIGCOMM ’10, pages 375–386, New York, NY, USA, 2010. ACM.
[10] Sabrina Gaito, Matteo Zignani, Gian Paolo Rossi, Alessandra Sala, Xiaohan Zhao,
Haitao Zheng, and Ben Y. Zhao. On the bursty evolution of online social networks.
In Proceedings of the First ACM International Workshop on Hot Topics on Interdisciplinary
Social Networks Research, HotSocial ’12, pages 1–8, New York, NY,
USA, 2012. ACM.
[11] Apache cassandra. http://cassandra.apache.org/.
[12] Mongodb. http://www.mongodb.org/.
[13] Grzegorz Malewicz, Matthew H. Austern, Aart J.C Bik, James C. Dehnert, Ilan Horn,
Naty Leiser, and Grzegorz Czajkowski. Pregel: a system for large-scale graph processing.
In Proceedings of the 2010 ACM SIGMOD International Conference on
Management of data, SIGMOD ’10, pages 135–146, New York, NY, USA, 2010.
ACM.
[14] George Karypis and Vipin Kumar. Multilevel graph partitioning schemes. In Proc.
24th Intern. Conf. Par. Proc., III, pages 113–122. CRC Press, 1995.
[15] George Karypis and Vipin Kumar. A fast and high quality multilevel scheme for
partitioning irregular graphs. SIAM J. Sci. Comput., 20(1):359–392, December 1998.
[16] George Karypis and Vipin Kumar. Multilevel k-way partitioning scheme for irregular
graphs. Journal of Parallel and Distributed Computing, 48:96–129, 1998.
[17] Irene Moulitsas and George Karypis. Architecture aware partitioning algorithms. In
Proceedings of the 8th international conference on Algorithms and Architectures for
Parallel Processing, ICA3PP ’08, pages 42–53, Berlin, Heidelberg, 2008. Springer-
Verlag.
[18] Bin Shao, Haixun Wang, and Yanghua Xiao. Managing and mining large graphs:
systems and implementations. In SIGMOD Conference, pages 589–592, 2012.
[19] Josep M. Pujol, Vijay Erramilli, and Pablo Rodriguez. Divide and conquer: Partitioning
online social networks. Technical report, 2009.
[20] Jayanta Mondal and Amol Deshpande. Managing large dynamic graphs efficiently.
In Proceedings of the 2012 ACM SIGMOD International Conference on Management
of Data, SIGMOD ’12, pages 145–156, New York, NY, USA, 2012. ACM.
[21] Adam Silberstein, Jeff Terrace, Brian F. Cooper, and Raghu Ramakrishnan. Feeding
frenzy: selectively materializing users’ event feeds. In Proceedings of the 2010 ACM
SIGMOD International Conference on Management of data, SIGMOD ’10, pages
831–842, New York, NY, USA, 2010. ACM.
[22] Adam Silberstein, Ashwin Machanavajjhala, and Raghu Ramakrishnan. Feed following:
the big data challenge in social applications. In Databases and Social Networks,
DBSocial ’11, pages 1–6, New York, NY, USA, 2011. ACM.
[23] Sudarshan Kadambi, Jianjun Chen, Brian F. Cooper, David Lomax, Raghu Ramakrishnan,
Adam Silberstein, Erwin Tam, and Hector Garcia-Molina. Where in the
world is my data? PVLDB, 4(11):1040–1050, 2011.
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/61050-
dc.description.abstract對於線上社群網路系統而言,如何有效率的管理及查詢大量的圖架構資料是一個日漸重要的問題,特別是要支援大量更新及取得資料的使用者要求。為了解決這個問題,系統會將社群圖資料切割分散到各個機器內,以便平行處理使用者的要求;而在這個狀況之下,處理大量要求的所需總時間以及處理單個要求的回應時間都被處理最慢的機器所限制住。然而,在過去的研究中大多都致力於降低在處理要求時所需的跨機器傳輸量,而非分散負載量以達到消除瓶頸及分攤時間成本。另外在雲端運算的幫助下,提供服務者可低成本地以動態租用不同等級之機器來逐漸升級硬體環境,所以我們在分散負載量時也應該注意到異質性的硬體環境。
在本篇論文中,我們提出理論性的分析以及提出我們的系統設計,其中包含三個在運行時提供動態調整的啟發式的方法,用以降低系統處理大量使用者要求所需的時間。首先,我們提出圖分割的方法以決定使用者的主資料及要求將由哪一台機器負責。第二,我們提出副本策略輔助系統做較佳的副本決定。最後,我們提出遠端讀取方針來輔助系統從多個遠端讀取候選機器中選取較佳的讀取目標。我們也運用了模擬器來評估我們的方法,而結果顯示我們的方法顯著地降低了系統處理大量要求的所需時間。
zh_TW
dc.description.abstractThere is an increasing need for the systems of Online Social Networks (OSNs) to manage and query large volumes of graph-structured data efficiently, especially to support the massive {em update} and {em fetch} queries raised by users. To cope with this, the system partitions the social graph data across a set of machines and processes the user queries in parallel; therefore, the total time spent for processing all queries and the response time of queries are bounded by the machine which takes longest time. However, although there is much work on reducing the total inter-machine traffic during processing queries, up to date there is little work on distributing the loads to eliminate the bottlenecks and amortize the time cost. Further, since it is cost-effective to scale-up the hardware environment by reserving different classes of machines dynamically with the aid of cloud computing, we should also give consideration to the resulting heterogeneous environment when distributing the loads.
In this paper, we provide theoretic analysis and propose the design of social graph data management system including three heuristic techniques to minimize the low total time spent for processing massive user queries at runtime. First, we propose a graph partitioning method to make the decision that which machine is in charge of the data and queries of each user. Second, we propose a replication strategy that dictate how replication decisions should be made. Finally, we propose a remote read policy that choose one from all remote replicas to read. We evaluate our method on simulator, and show that our system reduces the time spent significantly.
en
dc.description.provenanceMade available in DSpace on 2021-06-16T10:43:32Z (GMT). No. of bitstreams: 1
ntu-102-R00944035-1.pdf: 1367012 bytes, checksum: c4570c8030877cee5562f9cfe548ff9a (MD5)
Previous issue date: 2013
en
dc.description.tableofcontents口試委員會審定書i
誌謝ii
摘要iii
Abstract iv
1 Introduction 1
2 Related Work 5
3 System Analysis 8
3.1 Query Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1.1 Update Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1.2 Fetch Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 Techniques of Social Graph Data Manager . . . . . . . . . . . . . . . . . 10
3.2.1 Graph Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2.2 Data Replication Strategy . . . . . . . . . . . . . . . . . . . . . 11
3.2.3 Remote Read Policy . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3 Resource Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3.1 Physical Links . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3.2 Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
4 Problem Statement 15
4.1 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.2 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.3 Objective Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.3.1 Arrival Rates for Links . . . . . . . . . . . . . . . . . . . . . . . 19
4.3.2 Arrival Rates for Machines . . . . . . . . . . . . . . . . . . . . . 20
4.3.3 Objective Function . . . . . . . . . . . . . . . . . . . . . . . . . 20
5 Heuristic Method 24
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.2 During Fetch Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.3 After Update Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.3.1 Replication Strategy . . . . . . . . . . . . . . . . . . . . . . . . 26
5.3.2 Graph Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
6 Evaluation 30
6.1 Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
6.2 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
6.3 Algorithms for comparison . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.4.1 Varying the Link Bandwidth . . . . . . . . . . . . . . . . . . . . 31
6.4.2 Varying the Machine Deployment . . . . . . . . . . . . . . . . . 35
6.4.3 Varying the Number of Machines . . . . . . . . . . . . . . . . . 39
7 Conclusion 41
Bibliography 42
dc.language.isoen
dc.subject圖資料庫zh_TW
dc.subject要求處理zh_TW
dc.subject平行資料庫zh_TW
dc.subject線上社群網路zh_TW
dc.subject資料中心zh_TW
dc.subjectonline social networken
dc.subjectgraph databaseen
dc.subjectdatacenteren
dc.subjectparallel databases and query processingen
dc.title以架構自覺之社群圖管理達到有效率地更新及取得資料zh_TW
dc.titleArchitecture Aware Social Graph Management in Datacenters for Updating and Fetching Data Efficientlyen
dc.typeThesis
dc.date.schoolyear101-2
dc.description.degree碩士
dc.contributor.oralexamcommittee蔡子傑,逄愛君,廖婉君,吳曉光
dc.subject.keyword線上社群網路,圖資料庫,資料中心,平行資料庫,要求處理,zh_TW
dc.subject.keywordonline social network,graph database,datacenter,parallel databases and query processing,en
dc.relation.page46
dc.rights.note有償授權
dc.date.accepted2013-08-13
dc.contributor.author-college電機資訊學院zh_TW
dc.contributor.author-dept資訊網路與多媒體研究所zh_TW
顯示於系所單位:資訊網路與多媒體研究所

文件中的檔案:
檔案 大小格式 
ntu-102-1.pdf
  未授權公開取用
1.33 MBAdobe PDF
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved