請用此 Handle URI 來引用此文件:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/57395
完整後設資料紀錄
DC 欄位 | 值 | 語言 |
---|---|---|
dc.contributor.advisor | 劉邦鋒(Pangfeng Liu) | |
dc.contributor.author | Chia-Wei Chang | en |
dc.contributor.author | 張家偉 | zh_TW |
dc.date.accessioned | 2021-06-16T06:44:22Z | - |
dc.date.available | 2015-06-02 | |
dc.date.copyright | 2014-07-31 | |
dc.date.issued | 2014 | |
dc.date.submitted | 2014-07-28 | |
dc.identifier.citation | [1] S. Adali, K. S. Candan, Y. Papakonstantinou, and V. S. Subrahmanian. Query caching and
optimization in distributed mediator systems. SIGMOD Record, 25(2):137--146, June 1996. [2] Gunnar Andersson and Lars Engebretsen. Better approximation algorithms for set splitting and not-all-equal sat. Information Processing Letters, 65(6):305 -- 311, 1998. [3] Apache Software Foundation. Cassandra. http://cassandra.apache.org. [4] Apache Software Foundation. HBase. http://hbase.apache.org. [5] Luitpold Babel, Hans Kellerer, and Vladimir Kotov. The k-partitioning problem. Mathemat- ical Methods of Operations Research, 47(1):59--82, 1998. [6] Richard Ernest Bellman. Dynamic Programming. Dover Publications, Incorporated, 2003. [7] M. J. Carey, L. M. Haas, P. M. Schwarz, M. Arya, W. F. Cody, R. Fagin, M. Flickner, A. W. Luniewski, W. Niblack, D. Petkovic, J. Thomas, J. H. Williams, and E. L. Wimmers. Towards heterogeneous multimedia information systems: The garlic approach. In Proceedings of the 5th International Workshop on Research Issues in Data Engineering-Distributed Object Management (RIDE-DOM'95), RIDE '95, pages 124--, Washington, DC, USA, 1995. IEEE Computer Society. [8] Upen S. Chakravarthy, John Grant, and Jack Minker. Logic-based approach to semantic query optimization. ACM Transactions on Database Systems, 15(2):162--207, June 1990. [9] Chao-Rui Chang, Meng-Ju Hsieh, Jan-Jan Wu, Po-Yen Wu, and Pangfeng Liu. Hsql: A highly scalable cloud database for multi-user query processing. In Rong Chang, editor, IEEE CLOUD, pages 943--944. IEEE, 2012. [10] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems, 26(2):4:1-- 4:26, June 2008. [11] Sophie Cluet and Claude Delobel. A general framework for the optimization of object- oriented queries. SIGMOD Record, 21(2):383--392, June 1992. [12] R. A. Cody and E. G. Coffman, Jr. Record allocation for minimizing expected retrieval costs on drum-like storage devices. Journal of the ACM, 23(1):103--115, January 1976. [13] Thomas H. Cormen, Clifford Stein, Ronald L. Rivest, and Charles E. Leiserson. Introduction to Algorithms. McGraw-Hill Higher Education, 2nd edition, 2001. [14] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vo- gels. Dynamo: Amazon's highly available key-value store. SIGOPS Operating Systems Review, 41(6):205--220, October 2007. [15] J. Grant, J. Gryz, J. Minker, and L. Raschid. Semantic query optimization for object databases. In Data Engineering, 1997. Proceedings. 13th International Conference on, pages 444--453, Apr 1997. [16] Himawan Gunadhi and Arie Segev. A framework for query optimization in temporal databases. In Zbigniew Michalewicz, editor, Statistical and Scientific Database Manage- ment, volume 420 of Lecture Notes in Computer Science, pages 131--147. Springer Berlin Heidelberg, 1990. [17] L. Haas, D. Kossmann, E. Wimmers, and J. Yang. Optimizing queries across diverse data sources. In 23rd International Conference on Very Large Data Bases (VLDB 1997), 1997. [18] Oscar H. Ibarra and Chul E. Kim. Fast approximation algorithms for the knapsack and sum of subset problems. Journal of the ACM, 22(4):463--468, October 1975. [19] Yannis E. Ioannidis. Query optimization. ACM Computing Surveys, 28(1):121--123, March 1996. [20] Matthias Jarke and Jurgen Koch. Query optimization in database systems. ACM Computing Surveys, 16(2):111--152, June 1984. [21] Richard M Karp. Reducibility among combinatorial problems. Springer, 1972. [22] W. Kim. Object-oriented databases: Definition and research directions. IEEE Transactions on Knowledge and Data Engineering, 2(3):327--341, September 1990. [23] Jonathan J. King. Quist: A system for semantic query optimization in relational databases. In Proceedings of the Seventh International Conference on Very Large Data Bases - Volume 7, VLDB '81, pages 510--517. VLDB Endowment, 1981. [24] Ravi Krishnamurthy, Haran Boral, and Carlo Zaniolo. Optimization of nonrecursive queries. In VLDB, volume 86, pages 128--137, 1986. [25] Laks V. S. Lakshmanan and Hector J. Hernandez. Structural query optimization --- a uni- form framework for semantic query optimization in deductive databases. In Proceedings of the Tenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS '91, pages 102--114, New York, NY, USA, 1991. ACM. [26] Eugene L. Lawler. Fast approximation algorithms for knapsack problems. Mathematics of Operations Research, 4(4):339--356, 1979. [27] L. Lovasz. Coverings and colorings of hypergraphs. Proceedings of the Fourth Southeastern Conference on Combinatorics, Graph Theory, and Computing, pages 3--12, 1973. [28] Silvano Martello and Paolo Toth. Knapsack problems. Wiley New York, 1990. [29] R Garey Michael and David S Johnson. Computers and intractability: A guide to the theory of np-completeness. WH Freeman & Co., San Francisco, 1979. [30] M Tamer Ozsu and Patrick Valduriez. Principles of distributed database systems. Springer, 2011. [31] Prasan Roy, S. Seshadri, S. Sudarshan, and Siddhesh Bhobe. Efficient and extensible algo- rithms for multi query optimization. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, SIGMOD '00, pages 249--260, New York, NY, USA, 2000. ACM. [32] Sartaj Sahni. Approximate algorithms for the 0/1 knapsack problem. Journal of the ACM, 22(1):115--124, January 1975. [33] Michael Steinbrunn, Guido Moerkotte, and Alfons Kemper. Heuristic and randomized opti- mization for the join ordering problem. The VLDB Journal, 6(3):191--208, August 1997. [34] A. Swami. Optimization of large join queries: Combining heuristics and combinatorial tech- niques. SIGMOD Record, 18(2):367--376, June 1989. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/57395 | - |
dc.description.abstract | 分散式資料庫集合了計算機網路中多個邏輯相關資料庫。以 BigTable 為例,其為一高可用性、高效率,且可擴展之分散式資料存儲系統。BigTable 將資料存儲為鍵值對,其中鍵由行、欄與時間戳記組成,值則為一字串。欄族 (column family) 為一群必須存儲於同一檔案的欄。若資料庫需要取得欄族中的某些欄,則必須將整個欄族的欄全部取出。
本論文探討將欄分割為數個欄族以優化一系列的查詢。於只考慮處理查詢時取得欄族的代價,且已知每個查詢需要取得之欄位,以及所有欄之代價的情況下,求出最低總代價方案。當需要取得欄族任一欄位時,必須將整個欄族全部取回,因此這是個有趣且有挑戰性的問題。 以下為本論文之研究貢獻。我們為欄位分割問題建構數學模型,並明確定義目標函數 與花費度量。我們證明最小化花費之欄族分割問題為 NP-complete,即便是分割為 2 個欄族或欄位花費皆相同亦如此。我們也證明最大化花費之欄族分割問題為 NP-complete。相對地,對於在分割為 2 個欄族,欄位花費皆相同,且分割必須為完美分割 (perfect split) 的限制下,最小化花費問題存在多項式時間演算法。我們在實驗結果中驗證了對於此問題所提出的多項式時間演算法之效率。 | zh_TW |
dc.description.abstract | A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network. For exmaple, BigTable [10] is a distributed data storage system with high availability, performance, and scalability. BigTable stores data in a key-value pair, where the key consists of its column, row, and timestamp and the value is a string. A column family is a collection of columns that must be stored in a file, and the system must retrieve all columns in a column family if any column in that column family is needed.
The goal of this paper is to optimize a series of queries by partition the columns into several column families. Each query needs to access certain columns, and we would like to reduce the cost. We assume that that the cost of every column is given, and we consider only the cost in fetching column families while answering a query. Since retrieving any column within a column family means fetching the entire column family, this is an interesting yet challenging optimization problem. The contributions of this paper are summarized as follows. We formulate a mathematical formulation of the column partition problem in which the object function and cost measurement are clearly defined. We also prove that minimizing the expense of a column partition is NP-complete, even when The number of partition is 2, or the cost of each column is the same. We also show maximizing the expense of a column partition is also NP-complete. However, a polynomial time algorithm exists for minimizing the expense of a column 2-partition with uniform column cost and perfect split constraint. Experiment results that verify the performance of our proposed algorithm for minimizing the expense of a column 2-partition with uniform column cost and perfect split constraint. | en |
dc.description.provenance | Made available in DSpace on 2021-06-16T06:44:22Z (GMT). No. of bitstreams: 1 ntu-103-R01922163-1.pdf: 945011 bytes, checksum: 1f8d1ad552e89514cc7acebadc5c7c91 (MD5) Previous issue date: 2014 | en |
dc.description.tableofcontents | 口試委員會審定書 i
摘要 ii Abstract iii 1 Introduction 1 2 Related Works 4 3 Column Partition Problem 6 3.1 Minimizing Expense of Column Partition 7 3.1.1 Formal definition 7 3.1.2 2-Column Partition 7 3.1.3 2-Column Partition with Uniform Expense 9 3.2 Minimize Expense With Uniform Cost and Perfect Splitting 9 3.3 Maximizing Expense of Column Partition 11 4 Experiment 15 5 Conclusion 17 Bibliography 18 | |
dc.language.iso | en | |
dc.title | 批次資源調配演算法 | zh_TW |
dc.title | Resource Partition and Placement Strategy for Distributed Database Systems | en |
dc.type | Thesis | |
dc.date.schoolyear | 102-2 | |
dc.description.degree | 碩士 | |
dc.contributor.coadvisor | 吳真貞(Jan-Jan Wu) | |
dc.contributor.oralexamcommittee | 徐慰中(Wei-Chung Hsu) | |
dc.subject.keyword | 資料庫,查詢優化, | zh_TW |
dc.subject.keyword | database,query optimization, | en |
dc.relation.page | 20 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2014-07-28 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
顯示於系所單位: | 資訊工程學系 |
文件中的檔案:
檔案 | 大小 | 格式 | |
---|---|---|---|
ntu-103-1.pdf 目前未授權公開取用 | 922.86 kB | Adobe PDF |
系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。