動態資料庫系統之查詢處理技術

Hao-Ping Hung; 洪顥賓

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/30925

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	陳銘憲(Ming-Syan Chen)
dc.contributor.author	Hao-Ping Hung	en
dc.contributor.author	洪顥賓	zh_TW
dc.date.accessioned	2021-06-13T02:21:14Z	-
dc.date.available	2007-02-02
dc.date.copyright	2007-02-02
dc.date.issued	2007
dc.date.submitted	2007-01-30
dc.identifier.citation	[1] Sony Corporation. http://www.sony.co.jp/. [2] The UCR Time Series Data Mining Archive. http://www.cs.ucr.edu/ eamonn/TSDMA/. [3] S. Acharya, R. Alonso, M. J. Franklin, and S. B. Zdonik. Broadcast Disks: Data Management for Asymmetric Communications Environments. In Proceedings of the 1995 ACM International Conference on Management of Data, pages 199–210, May 1995. [4] S. Acharya, M. J. Franklin, and S. B. Zdonik. Disseminating updates on broadcast disks. In Proceedings of the 22th International Conference on Very LargeData Bases, pages 354– 365. Morgan Kaufmann, September 1996. [5] S. Acharya and S. Muthukrishnan. Scheduling on-demand broadcasts: New metrics and algorithms. In Proceedings of the 4th ACM/IEEE International Conference on Mobile Computing and Networking, pages 43–54, 1998. [6] C. C. Aggarwal, J. L. Wolf, and P. S. Yu. A Permutation-Based Pyramid Broadcasting Scheme for Video-on-Demand Systems. In IEEE Proceedings of Multimedia, 1996. [7] R. Agrawal,C. Faloutsos, andA. Swami. Efficient Similarity Search in Sequence Databases. In Proceedings of the 4th Conference on Foundations of Data Organization and Algorithms, 1993. [8] D. Aksoy and M. J. Franklin. Scheduling for Large-Scale On-Demand Data Broadcasting. In Proceedings of IEEE INFOCOM, March 1998. [9] D. Aksoy, M. J. Franklin, and S. Zdonik. Data staging for on-demand broadcast. In Proceedings of the 27th International Conference on Very Large Data Bases, pages 571–580, September 2001. [10] D. Aksoy and M. S. Leung. Pull vs Push: a Quantitative Comparison for Data Broadcast. In Proceedings of IEEE GLOBECOM, 2004. [11] S. Arya, D.M.Mount, R. Silverman, and A. Y.Wu. An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions. Journal of the ACM, 1998. [12] B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and Issues in Data Stream Systems. In Proc. of ACM PODS, 2002. [13] B. Babcock and C. Olston. Distributed top-k monitoring. In Proceedings of ACM SIGMOD, 2003. [14] W.-T. Balke,W. Nejdl,W. Siberski, and U. Thaden. Progressive Distributed Top-k Retrieval in Peer-to-Peer Networks. In Proceedings of IEEE ICDE, 2005. [15] D. Barbará. Mobile computing and database - a survey. IEEE Transactions on Knowledge and Data Engineering, 11(1):108–117, 1999. [16] M. Bawa, R. J. Bayardo, S. Rajagopalan, and E. J. Shekita. Make it Fresh, Make it Quick - Searching a Network of Personal Webservers. In ACM WWW, 2003. [17] P. Boncz, T. Grust, M. van Keulen, S. Manegold, J. Rittinger, and J. Teubner. MonetDB/ XQuery: a Fast XQuery Processor Powered by a Relational Engine. In Proceedings of ACM SIGMOD, 2006. [18] A. Bulut and A. Singh. SWAT: hierarchical stream summarization in large networks. In Proc. of ICDE, 2003. [19] A. Bulut and A. Singh. A Unified Framework for Monitoring Data Streams in Real Time. In Proc. of ICDE, 2005. [20] K. Chakrabarti, E. Keogh, S. Mehrotra, and M. Pazzani. Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases. ACM TODS, 27(2), 2002. [21] F. K. P. Chan, A. W. C. Fu, and C. Yu. Haar Wavelets for Efficient Similarity Search of Time-Series: With and Without Time Warping. IEEE TKDE, 15(3), 2003. [22] K. P. Chan and A. W. C. Fu. Efficient Time Series Matching by Wavelets. In Proceedings of IEEE ICDE, 1999. [23] S. Chaudhuri and L. Gravano. Optimizing Queries over Multimedia Repositories. In Proceedings of ACM SIGMOD, 1996. [24] S. Chaudhuri, L. Gravano, and A. Marian. Optimizing Top-K Selection Queries over Multimedia Repositories. IEEE TKDE, 16(8), 2004. [25] Y. S. Chen, Y. P. Hung, and C. S. Fuh. Fast Block Matching Algorithm Based on the Winner-Update Strategy. IEEE Transactions on Image Processing, 10(8), 2001. [26] Y.-Y. Chen, T. Suel, and A. Markowetz. Efficient Query Processing in Geographic Web Search Engines. In Proceedings of ACM SIGMOD, 2006. [27] R. Cheng, D. Kalashnikov, and S. Prabhakar. Evaluating Probabilistic Queries over Imprecise Data. In Proceedings of ACM SIGMOD, 2003. [28] R. Chengy, B. Kaox, S. Prabhakarz, A. Kwanx, and Y. Tuz. Adaptive Stream Filters for Entity-based Queries with Non-Value Tolerance. In Proceedings of VLDB, 2005. [29] Y. Chung and M. Kim. Efficient data placement for wireless broadcast. Distributed and Parallel Database, 9(2), March 2001. [30] Y. C. Chung, C. C. Chen, and C. Lee. Time Constrained Service on Air. In Proceedings of IEEE ICDCS, 2005. [31] Y. D. Chung and M. H. Kim. Qem: A scheduling method for wireless broadcast data. In Proceedings of the International Conference on Database Systems for Avanced Applications (DSFAA’99), pages 135–142, 1999. [32] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. McGraw-Hill. [33] G. Cormode, M. Garofalakis, and D. Sacharidis. Fast Approximate Wavelet Tracking on Streams. In Proceedings of EDBT 2006, 2006. [34] G. Cormode, S. Muthukrishnan, and I. Rozenbaum. Summarizing and Mining Inverse Distributions on Data Streams via Dynamic Inverse Sampling. In Proceedings of VLDB, 2005. [35] H. M. Dewan, S. J. Stolfo, M. Hernandez, and J.-J. Hwang. Predictive Dynamic Load Balancing of Parallel and Distributed Rule and Query Processing. In Proceedings of ACM SIGMOD, 1994. [36] S. Even and Y. Shiloah. NP-Completeness of Several Arrangement Problems. Department of Computer Science, Israel Institute of Technology, Haifa, Isreal, Tech. Rep., 1975. [37] R. Fagin, A. Lotem, and M. Naor. Optimal Aggregation Algorithms for Middleware. In Proc. of PODS, 2001. [38] G. H. Forman and J. Zahorjan. The Challenges of Mobile Computing. IEEE Transactions on Computers, 27(4), 1994. [39] L. Gao, Z. Yao, and X. S. Wang. Evaluating Continuous Nearest Neighbor Queries for Streaming Time Series via Pre-fetching. In Proceedings of ACM CIKM, 2002. [40] M. Garey and D. Johnson. A Guide to the Theory of NP-Completeness. Freeman. [41] M. Garofalakis, J. Gehrke, and R. Rastogi. Querying and Mining Data Streams: You Only Get One Look. In Proceedings of ACM SIGMOD, 2002. [42] M. Garofalakis and P. B. Gibbons. Probabilistic Wavelet Synopses. ACM TODS, 29(1), 2004. [43] M. Garofalakis and A. Kumar. Deterministic Wavelet Thresholding for Maximum-Error Metrics. In Proceedings of ACM PODS, 2004. [44] A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. J. Strauss. One-Pass Wavelet Decompositions of Data Streams. IEEE TKDE, 15(3), 2003. [45] L. Golab and M. T. ˝Ozsu. Issues in Data Stream Management. ACM SIGMOD Record, 32(2), 2003. [46] D. E. Goldberg. Genetic Algorithm in Search, Optimization and Machine Learning. Addison-Wesley Publishing, 1989. [47] G. Graefe. Query Evaluation Techniques for Large Databases. ACM Computing Surveys, 25(2), 1993. [48] S. Guha and B. Harb. Wavelet Synopsis for Data Streams: Minimizing Non-Euclidean Error. In Proc. of KDD, 2005. [49] S. Guha, C. Kim, and K. Shim. XWAVE: Approximate Extended Wavelets for Streaming Data. In Proc. of VLDB, 2004. [50] S. Guha, N. Koudas, and K. Shim. Approximation and Streaming Algorithms for Histogram Construction Problems. ACM TODS, 31(1), 2006. [51] S. Guha, K. Shim, and J.Woo. REHIST: Relative Error Histogram Construction Algorithms. In Proceedings of VLDB, 2004. [52] J. H. Holland. Adaption in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, 1975. [53] M. J. Hsieh, M. S. Chen, and P. S. Yu. Integrating DCT and DWT for Approximating Cube Streams. In Proc. of ACM CIKM, 2005. [54] C.-H. Hsu, G. Lee, andA. L. P. Chen. ANear Optimal Algorithm for Generating Broadcast Programs on Multiple Channels. In Proceedings of the 10th ACM International Conference on Information and Knowledge Management, pages 303–309, November 2001. [55] C. H. Hsu, G. Lee, and A. L. P. Chen. An Efficient Algorithm for Near Optimal Data Allocation on Multiple Broadcast Channels. Distributed and Parallel Databases, 18(3), 2005. [56] C. L. Hu andM. S. Chen. On-Line Scheduling Sequential Objects for Dynamic Information Dissemination. In Proceedings of IEEE GLOBECOM, 2005. [57] J.-L. Huang andM.-S. Chen. Broadcast program generation for unordered queries with data replication. In Proceedings of ACM SAC, March 2003. [58] J.-L. Huang and M.-S. Chen. Broadcasting dependent data for ordered queries without replication in a multi-channel mobile environment. In Proceedings of the 19th IEEE International Conference on Data Engineering, March 2003. [59] J.-L. Huang, M.-S. Chen, and H.-P. Hung. A QoS-Aware Transcoding Proxy Using Ondemand Data Broadcasting. In Proceedings of IEEE INFOCOM, March 2004. [60] H. P.Hung, J. W. Huang, J. L. Huang, andM. S. Chen. Scheduling Dependent Items in Data Broadcasting Environments. In Proceedings of ACM SAC, 2006. [61] T. Imielinski and B. R. Badrinath. Wireless Mobile Computing: Challenges in Data Management. Communications of ACM, 37(10), 1994. [62] T. Imielinski, S. Viswanathan, and B. R. Badrinath. Data on air: Organization and access. IEEE Transactions on Knowledge and Data Engineering, 9(3):353–372, May/June 1997. [63] Y. E. Ioannidis. Query Optimization. ACM Computing Surveys, 28(1), 1996. [64] J.-L. Huang andM.-S. Chen. Dependent data broadcasting for unordered queries in a multiple channel mobile environment. IEEE Trans. on Knowledge and Data Engineering, 16(6), Jun. 2004. [65] H. V. Jagadish. Issues in Multimedia Databases. In Proceedings of ACM SIGMOD, 1993. [66] H. V. Jagadish, H. Jin, B. C. Ooi, and K.-L. Tan. Global Optimization of Histograms. In Proc. of SIGMOD, 2001. [67] M. Jarke and J. Koch. Query Optimization in Database Systems. ACM Computing Surveys, 16(2), 1984. [68] L.-S. Juhn and L.-M. Tseng. Fast Data Broadcasting and Receiving Scheme for Popular Video Service. IEEE Tran. on Broadcasting, 44(1), March 1998. [69] K. V. Kanth, D. Agrawal, and A. Singh. Dimensionality Reduction for Similarity Searching in Dynamic Databases. In Proceedings of ACM SIGMOD, 1998. [70] P. Karras and N. Mamoulis. One-Pass Wavelet Synopses for Maximum-Error Metrics. In Proceedings of VLDB, 2005. [71] F. Korn, H. Jagadish, and C. Faloutsos. Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. In Proceedings of ACM SIGMOD, 1997. [72] F. Korn, S. Muthukrishnan, and D. Srivastava. Reverse Nearest Neighbor Aggregates Over Data Streams. In VLDB, pages 814–825, 2002. [73] N. Koudas, B. C. Ooi, K.-L. Tan, and R. Zhang. Approximate NN queries on Streams with Guaranteed Error/performance Bounds. In Proc. of VLDB, 2004. [74] C. C. Lee and Y. Leu. Efficient Data Broadcast Schemes for Mobile Computing Environments with Data Missing. Information Sciences, 172(3-4), 2005. [75] G. Lee, S. Lo, and A. Chen. Data Allocation on theWireless Broadcast Channel for Efficient Query Processing. IEEE Tran. on Computers, 51(10), 2002. [76] G. Lee and S. C. Lo. Broadcast Data Allocation for Efficient Access of Multiple Data Items in Mobile Environments. Mobile Networks and Applications, 8(4), 2003. [77] J. H. Lee, D. H. Kim, and C. W. Chung. Multi-Dimensional Selectivity Estimation Using Compressed Histogram Information. In Proceedings of ACM SIGMOD, 1999. [78] W.-C. Lee, Q. Hu, and D. L. Lee. A study on channel allocation for data dissemination in mobile computing environments. ACM/Baltzer Mobile Networks and Applications, Special Issue on Resource Management in Wireless Systems, 4(2):117–129, 1999. [79] I. Liabotis, B. Theodoulidis, and M. Saraaee. Improving Similarity Search in Time Series Using Wavelets. International Journal of Data Warehousing and Mining, 2(2), 2006. [80] J. Lin, M. Vlachos, E. Keogh, and D. Gunopulos. Iterative Incremental Clustering of Time Series. In Proceedings of EDBT, 2004. [81] C.M. Liu and K. F. Lin. Efficient Scheduling Algorthms for Disseminating Dependent Data in Wireless Mobile Environments. In Proceedings of IEEE International Conf. on Wireless Networks, Communications and Mobile Computing, 2005. [82] K. H. Liu, W. G. Teng, and M. S. Chen. Incremental Maintenance of Wavelet Synopses for Data Streams. In Proceedings of Workshop on Temporal Data Mining: Algorithms, Theory and Applications, 2005. [83] W. Loh, S. Kim, and K.Whang. Index Interpolation: an Approach to SubsequenceMatching Supporting Normalization Transform in Time-Series Databases. In Proceedings of ACM CIKM, 2000. [84] F. Martinez, J. Gonzalez, and I. Stojmenovic. A Parallel Hill Climbing Algorithm for Pushing Dependent Data in Clients-Providers-Servers Systems. In Proceedings of Computer and Communications, 2002. [85] C. Mathis, T. Harder, and M. Haustein. Locking-Aware Structural Join Operators for XML Query Processing. In Proceedings of ACM SIGMOD, 2006. [86] Y.Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of ACM SIGMOD, 1998. [87] Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programming. Springer Verlag, New York, 2nd Edition, 1994. [88] S. Michel, P. Triantafillou, and G. Weikum. KLEE: A Framework for Distributed Top-k Query Algorithms. In Proceedings of VLDB, 2005. [89] K. Mouratidis, M. Hadjieleftheriou, and D. Papadia. Conceptual Partitioning: An Efficient Method for Continuous Nearest Neighbor Monitoring. In Proc. ACM SIGMOD, 2005. [90] A. Nanopoulos, D. Katsaros, and Y. Manolopoulos. Effective Prediction of Web-User Accesses: A Data Mining Approach. In Proc. of WEBKDD Workshop, 2001. [91] V. Padmanabhan and L. Qiu. The Content and Access Dynamics of a Busy Web Site: Findings and Implications. In Proceedings of ACM SIGCOMM, 2000. [92] S. Papadimitriou, J. Sun, and C. Faloutsos. Streaming Pattern Discovery in Multiple Time- Series. In Proc. of VLDB’05, 2005. [93] W.-C. Peng and M.-S. Chen. Dynamic generation of data broadcasting programs for a broadcast disk array in a mobile computing environment. In Proceedings of the 9th ACM International Conference on Information and Knowledge Management, November 2000. [94] W.-C. Peng and M.-S. Chen. Efficient Channel Allocation Tree Generation for Data Broadcasting in a Mobile Computing Environment. Wireless Networks, 9(2):117–129, 2003. [95] W.-C. Peng, J.-L. Huang, andM. S. Chen. Dynamic leveling: Adaptive data broadcasting in a mobile computing environment. Mobile Networks and Applications, 8(4):355–364, 2003. [96] C. S. Perng, H. Wang, S. R. Zhang, and D. S. Parker. Landmarks: A New Model for Similarity-Based Pattern Querying in Time Series Databases. In Proceedings of IEEE ICDE, 2000. [97] M. Petropoulos, A. Deutsch, and Y. Papakonstantinou. Interactive Query Formulation over Web Service-Accessed Sources. In Proceedings of ACM SIGMOD, 2006. [98] I. Popivanov and R. J. Miller. Similarity Search Over Time-Series Data Using Wavelets. In Proceedings of IEEE ICDE, 2002. [99] N. Prabhu and V. Kumar. Data Scheduling for Multi-item and Transactional Requests in On-demand Broadcast. In Proceedings of Mobile Data Management, 2005. [100] N. Roussopoulos and S. K. F. Vincent. Nearest Neighbor Queries. In Proceedings of ACM SIGMOD, 1995. [101] F. Sadri. Reliability of Answers to Queries in Relational Databases. IEEE TKDE, 3(2), 1991. [102] A. Si and H. V. Leong. Query Optimization for Broadcast Database. Data and Knowledge Engineering, 23(9), 1999. [103] E. J. Stollnitz, T. D. Derose, and D. H. Salesin. Wavelets for Computer Graphics: Theory and Application. Morgan Kaufmann, 1996. [104] C.-J. Su and L. Tassiulas. Broadcast scheduling for information distribution. In Proceedings of the 16th IEEE Annual Joint Conference of the IEEE Computer and Communications Societies. Driving the Information Revolution. (INFOCOM’97), pages 109–117, 1997. [105] G. Syswerda. Uniform Crossover in Genetic Algorithms. In Proc. 3rd Int. Conf. on Genetic Algorithms, 1989. [106] S. Viswanathan and T. Imielinski. Pyramid Broadcasting for Video-on-Demand Service. In SPIE Multimedia Computing and Networking Conference, 1995. [107] J. S. Vitter and M. Wang. Approximate Computation of Multidimensional Aggregates of Sparse Data Using Wavelets. In Proceedings of ACM SIGMOD, 1999. [108] C. Wang and X. S. Wang. Supporting Subseries Nearest Neighbor Search via Approximation. In Proceedings of ACM CIKM, 2000. [109] J. W. Wong. Broadcast delivery. Proceedings of The IEEE, 76(12):1566–1577, December 1988. [110] Y. Wu and G. Cao. Stretch-Optimal Scheduling for On-Demand Data Broadcasts. In Proc. of Tenth Int. Conf. on Computer Communications and Networks, 2001. [111] W.-G. Yee, S. B. Navathe, E. Omiecinski, and C. Jermaine. Efficient Data Allocation over Multiple Channels at Broadcast Servers. IEEE Transactions on Computers, 51(10):1231– 1236, October 2002. [112] C. T. Yu and C. C. Chang. On the Design of a Query Processing Strategy in a Distributed Database Environment. In Proceedings of ACM SIGMOD, 1983. [113] L. Y. Yuan and D.-A. Chiang. A Sound and Complete Query Evaluation Algorithm for Relational Databases with Null Values. In Proceedings of ACM SIGMOD, 1988. [114] Y. Zhao and S. Zhang. Generalized Dimension-Reduction Framework for Recent-Biased Time Series Analysis. IEEE TKDE, 18(2), 2006. [115] B. Zheng, X. Wu, X. Jin, and D. L. Lee. TOSA: a Near-Optimal Scheduling Algorithm for Multi-Channel Data Broadcast. In Proceedings of Mobile Data Management (MDM-05), 2005. [116] G. K. Zipf. Human Behaviour and the Principle of Least Effort. Addison-Wesley, Reading, MA, 1949.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/30925	-
dc.description.abstract	科技的日新月異，讓電腦的運算功能不斷的提昇，對一處理大量資料的資料庫系統而言，執行效率以及執行效能也獲得顯著的改善。然而，對動態的環境，如行動資料庫以及資料流而言，在處理使用者查詢時，許多傳統的最佳化技術，都無法直接套用。由於在動態的資料庫系統中，使用者查詢的統計特性，或是儲存的資料內容，都會隨著時間不斷的作改變。因此，如何縮短處理查詢的時間，以及增加查詢結果的準確度，成為許多鑽研資料庫系統的學者，所研究的重要議題。在行動計算 (Mobile Computing) 的環境中，以廣播為基礎的傳送機制，是一種具可擴充性 (Scalability) 的傳送資料的方式。在資訊廣播系統中，廣播伺服器會根據使用者查詢的機率分佈，將資料分配到不同的廣播頻道中。而分配的原則，則是儘量將使用者的平均等候時間最小化。有鑒於不同的資料，會具有不同的檔案大小，在第二章裡，我們討論的主題是，在具多樣化資料的廣播環境中，資料排程演算法的設計。我們首先推導出在該環境下，使用者的等候時間之數學模型，並定義資料排程的問題。接下來，我們進一步提出了所謂的「二階段排程」 (Two-Phase Allocation) 的演算法架構。在此解法中，資料的排程，分為以下兩個主要階段：粗調及微調。在「粗調」的過程中，資料會以最快的方式，分配到不同的廣播頻段中；在微調時，粗調的結果，則是會被調整成區域最佳解 (Local Optimum)。除此之外，為了要克服區域最佳解的限制，我們又提出「混合式基因演算法」 (Hybrid Genetic Algorithm) 的系統架構，使排程結果達到整體最佳化 (Global Optimization)。在許多情況下，我們發現到無線網路的使用者，會同時想查詢不同的資料。假設使用者接收不同資料時，他或她會希望這些資料能依據某一特定順序做接收。在伺服器中，為了要讓使用者的等候時間可以最佳化，排程演算法的設計，則必須要根據這些資料在接收時的順序性。在第三章裡，我們探討的問題，是在所謂的「順序性廣播環境」 (Sequential Data Broadcasting Environments) 中，設計排程演算法。我們所提出的，是一個名為MULS的系統架構。這個系統架構裡，包含兩個元件：「線上排程演算法」(On-Line Scheduling Algorithm) 以及「最佳化程序」 (Optimization Procedure)。在「線上排程演算法」中，廣播的資料，會根據使用者查詢中的順序性，做初步的排程；而在第二階段裡，當最佳化程序在進行時，我們允許在伺服器端，利用不同的參數，來控制執行時間和執行效能。在資料流 (Data Stream) 的環境裡，隨著時間，資料會被不斷的接收，由於接收資料流的伺服器，只具備有限的磁碟空間，在儲存資料時，資料流會以「概要」 (Synopsis) 的方式，存在資料流管理系統 (Data Stream Management System) 中，而當我們在處理使用者查詢時，也只能根據「概要」中的內容，來計算這些查詢的答案。在第四章裡，我們探討的是在資料流的環境中，如何處理使用者所送出的「前K名等級查詢」 (Top-K Queries)。在這個章節中，我們針對此一特殊查詢，提出TOMS資料流管理系統的系統架構。在這個架構中，不論是儲存資料流，或是處理使用者查詢，我們皆設計出精巧的演算法來解決這些方面的議題。而在資料流管理系統中，資料流的概要中的值，通常會被透過「小波轉換」轉 (Wavelet Transform) 的方式，轉變成頻域 (Frequency Domain) 係數而加以儲存，當查詢送出後，使用者所希望得到的結果，卻往往是資料流在時域 (Time Domain) 中的表現。因此，要如何觀察這些頻域的係數，找出兩段資料流彼此在頻域中的相似度，則是我們在第五章中，所要討論的主題。這個章節主要分成兩個部份，第一個部份是資料流的相似度搜尋，而第二部份中，延續第一部份的研究成果，我們進一步探討如何找出「前K條最接近的資料流」 (K Nearest Neighboring Streams)。由於我們設計出來的方法，避免了將頻域係數轉換成時域係數所需消耗的運算負擔，因此，和許多現有的演算法相比，我們的演算法，能以更有效率的方式，計算出相關查詢的答案。關鍵詞：資料庫系統、使用者查詢、行動計算、資料流、資訊廣播、小波轉換	zh_TW
dc.description.abstract	The technology advances in computing powers have enhanced the capabilities for a database management system to process the user queries effectively and efficiently. However, in many emerging applications such as mobile computing environments and streaming environments, the conventional techniques for query optimization may suffer from the dynamics in either user queries or evolving data entities. For these dynamic database systems, how to process query within real time and provide accurate answers becomes a challenging issue. In the mobile computing environments, broadcast-based dissemination is a scalable way to deliver data items to interested users. According to the probability distribution of user queries, the broadcast server will allocate the items into broadcast channels in such a way the average waiting time of mobile users can be minimized. In view of the fact that various items with different sizes are disseminated in modern information service, we explore in Chapter 2 the issue of scheduling heterogeneous items in the data broadcasting environments. Given the broadcast database and the number of channels, we first derive the analytical model of the heterogeneous data broadcasting to obtain the average waiting time of mobile users, and formulate the problem as a grouping problem. In order to solve such problem, we propose a two-phase architecture to perform channel allocation. In addition to the two-phase architecture, we also propose algorithm GA-CDMS according to the concept of hybrid genetic algorithm for comparison purposes. Moreover, under many circumstances, the mobile users will tend to access multiple items within a specific session. Consider a special case in which the mobile users wish to download these items within a sequential order. To minimize the average access time, the information server should schedule these queries according to the sequential relationship among the items. In Chapter 3, we study the scheduling approach in such a sequential data broadcasting environment. Explicitly, we propose a general framework referred to as MULS for an information system. There are two primary stages in MULS: on-line scheduling and optimization procedure. By cooperating algorithm OLS with procedure SCI, the proposed MULS framework is able to generate broadcast programs with flexibility of providing different service qualities under different requirements of effectiveness and efficiency. As for the streaming environments, the rapid evolving speed and unlimited number of items make these data be scanned at most only once. Due to the resource limitation in the data stream environment, it has been reported that answering user queries according to the wavelet synopsis of a stream is an essential ability of a Data Stream Management System (DSMS). We first study in Chapter 4 the problem of maintaining the wavelet coefficients of multiple streams within collective memory so that the predetermined global error metric is minimized. Moreover, we also examine a promising application in the multistream environment, i.e., the top-k queries. We resolve the problem of efficient top-k query processing with minimized global error by developing a general framework. For the purposes of maintaining the wavelet coefficients and processing top-k queries, several well-designed algorithms are utilized to optimize the performance of each primary component of this general framework. In this dissertation, motivated by the fact that the data cells in streaming environments are usually transformed to coefficients in the frequency domain, we also attempt to address an essential problem 'How to obtain the time-domain similarity between two streams from wavelet coefficients in the frequency domain?'. In Chapter 5, we investigate two important types of range-constrained queries in time series streaming environments: the distance queries (which aim at obtaining the Euclidean distance between two streams) and the kNN queries (which aim at discovering k nearest neighbors to a reference stream). To achieve high efficiency in processing these two types of queries, we propose procedure RED and algorithm EKS. Compared to the existing methods in the prior research, the advantageous features of our approaches are in two folds. First, our approaches are capable of processing the queries directly from the wavelet synopses retained in the main memory without using IDWT to reconstruct the data cells. Moreover, our approaches enable the users to query the DSMS within their range of interest.	en
dc.description.provenance	Made available in DSpace on 2021-06-13T02:21:14Z (GMT). No. of bitstreams: 1 ntu-96-F90942056-1.pdf: 1169353 bytes, checksum: f4e1805049521bbc028a2aee0dc0d322 (MD5) Previous issue date: 2007	en
dc.description.tableofcontents	1 Introduction 1 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Overview of the Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2.1 Processing Heterogeneous Queries in Broadcasting Environments . . . . . 2 1.2.2 Processing Sequential Queries in Broadcasting Environments . . . . . . . 3 1.2.3 Processing Top-k Queries in Streaming Environments . . . . . . . . . . . 4 1.2.4 ProcessingkNNQueries inStreamingEnvironments . . . . . . . . . . . . 6 1.3 Organization of the Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2 Processing Heterogeneous Queries in Broadcasting Environments 8 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.2.1 RelatedWorks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.2.2 Analytical Model of Heterogeneous Data Broadcasting . . . . . . . . . . . 13 2.2.3 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.3 Two-phase Broadcast Program Generation . . . . . . . . . . . . . . . . . . . . . . 16 2.3.1 Dimension Reduction Partitioning . . . . . . . . . . . . . . . . . . . . . . 16 2.3.2 Cost-Sensitive Merging . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.3.3 Cost-Diminishing Movement Selection . . . . . . . . . . . . . . . . . . . 21 2.4 Design of Genetic Algorithm in Generating Broadcast Programs . . . . . . . . . . 24 2.4.1 ChromosomeRepresentationandFitnessEvaluation . . . . . . . . . . . . 24 2.4.2 Crossover and Mutation . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.4.3 Architecture of GA-CDMS . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.5.1 Simulation Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 2.5.2 Progressiveness of Two-Phase Channel Allocation . . . . . . . . . . . . . 28 2.5.3 PerformanceofHybridGeneticAlgorithm . . . . . . . . . . . . . . . . . 29 2.5.4 Scalability Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 2.5.5 DiversityAnalysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 2.5.6 SkewnessAnalysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2.5.7 EfficiencyAnalysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 3 Processing Sequential Queries in Broadcasting Environments 37 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 3.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.2.1 RelatedWorks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.2.2 Notation and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.3 Designof theMULSframeworkforMulti-LevelServiceQuality . . . . . . . . . . 46 3.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.3.2 On-Line Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.3.3 Sampling with Controlled Iteration . . . . . . . . . . . . . . . . . . . . . 51 3.3.4 ImplementationIssuesof theMULSframework . . . . . . . . . . . . . . . 57 3.4 Experimental Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.4.1 Simulation Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 3.4.2 Effectiveness Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 3.4.3 EfficiencyAnalysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 3.4.4 Effect of Control Parameters . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.4.5 Adaptiveness of the MULS Framework . . . . . . . . . . . . . . . . . . . 64 3.4.6 Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 4 Processing Top-k Queries in Streaming Environments 67 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 4.2.1 Review of Wavelet Basics . . . . . . . . . . . . . . . . . . . . . . . . . . 73 4.2.2 RelatedWorks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 4.2.3 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 4.3 Monitoring MultipleStreams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 4.3.1 Synopses of Multiple Streams . . . . . . . . . . . . . . . . . . . . . . . . 78 4.3.2 The W-Lists Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 4.4 Processing Top-k Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 4.4.1 AlgorithmBASIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 4.4.2 PSearch and PAWA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.5 Execution Scenario of the TOMS Framework . . . . . . . . . . . . . . . . . . . . 91 4.5.1 Construction of Wavelet Synopses . . . . . . . . . . . . . . . . . . . . . . 91 4.5.2 Execution of algorithm PAWA . . . . . . . . . . . . . . . . . . . . . . . . 93 4.6 Experimental Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 4.6.1 Simulating Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 4.6.2 Global Error of Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . 95 4.6.3 Error of Answering Top-k Queries . . . . . . . . . . . . . . . . . . . . . . 96 4.6.4 Evaluating the Memory AccessCost . . . . . . . . . . . . . . . . . . . . . 98 4.6.5 Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 5 Processing kNN Queries in Streaming Environments 101 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 5.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 5.2.1 RelatedWorks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 5.2.2 Assumptions and Objectives . . . . . . . . . . . . . . . . . . . . . . . . . 106 5.3 Data Structures for Wavelet Coefficients . . . . . . . . . . . . . . . . . . . . . . . 107 5.3.1 Redefinition of Error Tree . . . . . . . . . . . . . . . . . . . . . . . . . . 107 5.3.2 The IGrid Data Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 5.4 Processing Distance Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 5.5 Searching Nearest Neighbors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 5.6 Experimental Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 5.6.1 Simulation Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 5.6.2 Efficiency of the RED Procedure . . . . . . . . . . . . . . . . . . . . . . . 122 5.6.3 Efficiency of the EKS Algorithm . . . . . . . . . . . . . . . . . . . . . . . 124 5.6.4 Applications of the EKS Algorithm . . . . . . . . . . . . . . . . . . . . . 127 5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 6 Conclusions 129
dc.language.iso	en
dc.subject	資料庫系統	zh_TW
dc.subject	使用者查詢	zh_TW
dc.subject	行動計算	zh_TW
dc.subject	資料流	zh_TW
dc.subject	資訊廣播	zh_TW
dc.subject	小波轉換	zh_TW
dc.subject	Wavelet Transform	en
dc.subject	Query Processing	en
dc.subject	Dynamic Databases	en
dc.subject	Mobile Computing	en
dc.subject	Data Streams	en
dc.title	動態資料庫系統之查詢處理技術	zh_TW
dc.title	Query Processing Techniques in Dynamic Database Systems	en
dc.type	Thesis
dc.date.schoolyear	95-1
dc.description.degree	博士
dc.contributor.oralexamcommittee	李強(Chiang Lee),陳良弼(Arbee Chen),李官陵(Guanling Lee),李素瑛(Suh-Yin Lee),陳宜欣(Yi-Shin Chen),呂永和(Yungho Leu),張嘉惠(Chia-Hui Chang)
dc.subject.keyword	資料庫系統,使用者查詢,行動計算,資料流,資訊廣播,小波轉換,	zh_TW
dc.subject.keyword	Query Processing,Dynamic Databases,Mobile Computing,Data Streams,Wavelet Transform,	en
dc.relation.page	137
dc.rights.note	有償授權
dc.date.accepted	2007-01-31
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	電信工程學研究所	zh_TW
顯示於系所單位：	電信工程學研究所

文件中的檔案：

檔案	大小	格式
ntu-96-1.pdf 未授權公開取用	1.14 MB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。