請用此 Handle URI 來引用此文件:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/56119
完整後設資料紀錄
DC 欄位 | 值 | 語言 |
---|---|---|
dc.contributor.advisor | 林守德(Shou-De Lin) | |
dc.contributor.author | JUI-PIN WANG | en |
dc.contributor.author | 王瑞斌 | zh_TW |
dc.date.accessioned | 2021-06-16T05:16:02Z | - |
dc.date.available | 2018-09-23 | |
dc.date.copyright | 2014-09-23 | |
dc.date.issued | 2014 | |
dc.date.submitted | 2014-08-18 | |
dc.identifier.citation | Bibliography
[1] Mi-Yen Yeh, Kun-Lung Wu, Philip S. Yu, and Ming-Syan Chen. LeeWave: level-wise distribution of wavelet coefficients for processing knn queries over distributed streams. Proc. VLDB Endow., 2008. [2] Apostolos N. Papadopoulos and Yannis Manolopoulos. Distributed processing of simi- larity queries. Distributed and Parallel Databases, 2001. [3] Jui-Pin Wang, Yu-Chen Lu, Mi-Yen Yeh, Shou-De Lin, and Phillip B. Gibbons. Communication-efficient distributed multiple reference pattern matching for m2m sys- tems. Proc. of IEEE ICDM, 2013. [4] Sylvia Ratnasamy, Paul Francis, Mark Handley, and Richard Karp Scott Shenker. A scalable content-addressable network. SIGCOMM, 2001. [5] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. SIGCOMM, 2001. [6] Adina Crainiceanu, Prakash Linga, Ashwin Machanavajjhala, Johannes Gehrke, and Jayavel Shanmugasundaram. P-ring: an efficient and robust p2p range index structure. SIGMOD, 2007. [7] Ashwin R. Bharambe, Mukesh Agrawal, and Srinivasan Seshan. Mercury: Supporting scalable multi-attribute range queries. SIGCOMM, 2004. [8] Erik Buchmann and Klemens Bohm. Efficient evaluation of nearest-neighbor queries in content-addressable networks. From Integrated Publication and Information Systems to Virtual Information and Knowledge Environments, 2005. [9] Farnoush Banaei-Kashani and Cyrus Shahabi. Swam: A family of access methods for similarity-search in peer-to-peer data networks. CIKM, 2004. [10] Parisa Haghani, Sebastian Michel, Philippe Cudre-Mauroux, and Karl Aberer. Lsh at large – distributed knn search in high dimensions. International Workshop on Web and Databases, 2008. [11] Zaiwen Wen and Wotao Yin. Optimization with orthogonality constraints. http://optman.blogs.rice.edu/, Aril 2012. [12] Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Bran- don Westover, Qiang Zhu, Jesin Zakaria, and Eamonn Keogh. Searching and mining trillions of time series subsequences under dynamic time warping. In Proc. of ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2012. [13] Hervé Jégou, Matthijs Douze, and Cordelia Schmid. Product quantization for nearest neighbor search. In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011. [14] Nitish Srivastava and Ruslan R. Salakhutdinov. Multimodal learning with deep boltz- mann machines. In Neural Information Processing Systems, 2012. [15] Thierry Bertin-Mahieux, Daniel P.W. Ellis, Brian Whitman, and Paul Lamere. The million song dataset. In Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR 2011), 2011. [16] ThomasLidyandAndreasRauber.Evaluationoffeatureextractorsandpsycho-acoustic transformations for music genre classification. In Proceedings of the Sixth International Conference on Music Information Retrieval (ISMIR 2005), pages 34–41, London, UK, September 11-15 2005. [17] A. Rauber, E. Pampalk, and D. Merkl. The SOM-enhanced JukeBox: Organization and visualization of music collections based on perceptual models. Journal of New Music Research, 2003. [18] A.RauberandM.Frühwirth.Automaticallyanalyzingandorganizingmusicarchives.In Proceedings of the 5. European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2001), Springer Lecture Notes in Computer Science, Darm- stadt, Germany, Sept. 4-8 2001. Springer. http://www.ifs.tuwien.ac.at/ ifs/research/publications.html. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/56119 | - |
dc.description.abstract | With the popularity of mobile devices and Internet of Things, similarity search such as image querying or query by humming among data distributed in different machines has become an increasingly important problem. Moreover, in a distributed environment, the transmission cost is usually much more crucial than the computation cost as generally it consumes more resource to in transmission. Thus, this thesis proposes a framework to conduct a bandwidth-efficient, exact k-nearest neighbor search amount a large number of distributed machines through exploiting the techniques of the orthogonal transformation. On the basis of our previous work, new bounds on the Euclidean distance have been derived to prune impossible instances in the early stages of searching using partial information of the query. Moreover, three additional enhancements are devised to further save the communication cost. The experiment results show that our method significantly outperforms other competitions in bandwidth consumption in large-scale experiments of millions of instances. | en |
dc.description.provenance | Made available in DSpace on 2021-06-16T05:16:02Z (GMT). No. of bitstreams: 1 ntu-103-R01922165-1.pdf: 2970735 bytes, checksum: 9283165c0c8173a59cf2b6f1e10f3565 (MD5) Previous issue date: 2014 | en |
dc.description.tableofcontents | 口試委員會審定書 i
致謝 ii 中文摘要 iii Abstract iv Contents v List of Figures viii List of Tables ix 1 Introduction 1 2 Related Work 4 2.1 ConcurrentProcessing ............................. 4 2.2 ProbabilisticProcessing............................. 4 2.3 LeeWave .................................... 5 2.4 MsWave.............. 2.5 Others............... 3 Methodology 3.1 ProblemSetup........... 3.2 Overview of Our Framework . . . 3.3 FirstPhase.................................... 9 3.3.1 DefinitionofOrtohogonalTransformation. . . . . . . . . . . . . . . 9 3.3.2 PropertyofOrthogonalTransformation . . . . . . . . . . . . . . . . 9 3.4 EnhancetheBoundsbytheOrthogonalTransformation . . . . . . . . . . . . 9 3.4.1 TheGoaloftheFirstPhase....................... 10 3.4.2 DefinitionoftheBounds........................ 10 3.4.3 RelationBetweentheNormsandtheBounds . . . . . . . . . . . . . 11 3.4.4 EquivalentBoundsAfterTransformation . . . . . . . . . . . . . . . 12 3.4.5 Reduction of the Norm with Orthogonal Transformation . . . . . . . 13 3.4.6 OptimizationoverOrthogonalConstraints. . . . . . . . . . . . . . . 13 3.4.7 ReducetheCostofSendingMatrices ................. 14 3.5 SecondPhase .................................. 16 3.5.1 PruningtheCandidateswiththeBounds . . . . . . . . . . . . . . . . 16 3.5.2 DerivationofourBounds........................ 17 3.5.3 CalculationofourBounds ....................... 18 3.5.4 Finding the Threshold in Distributed Machines . . . . . . . . . . . . 19 3.6 DecisionofthePivots.............................. 20 3.6.1 EstimationoftheNumberofResidualMachines . . . . . . . . . . . 20 3.6.2 EstimationoftheTransmissionCost.................. 21 3.6.3 CoordinateDescenttoDecidethePivots. . . . . . . . . . . . . . . . 22 3.7 OverallFramework............................... 23 3.7.1 Importance-SelectingFunction..................... 23 3.7.2 OverallFramework........................... 23 4 Experiment 25 4.1 ExperimentSetup................................ 25 4.2 DataDescription ................................ 25 4.2.1 TimeSeriesData............................ 25 4.2.2 ImageData............................... 26 4.2.3 AudioData............................... 26 4.3 ComparisonAmongallFrameworks...................... 26 4.3.1 FrameworksforComparison...................... 26 4.3.2 ResultsofDifferentFrameworks.................... 27 4.4 Comparison Among our Framework with Different Configurations . . . . . . 31 4.4.1 ConfigurationsforComparison .................... 31 4.4.2 ResultsofDifferentConfigurations .................. 31 4.5 NumberofQueriesforAmortizingtheCostofMatrices. . . . . . . . . . . . 34 4.6 PowerofthePruningProcedure ........................ 36 4.6.1 MethodswithDifferentBounds .................... 36 4.6.2 ResultsofPruning ........................... 36 5 Conclusions 41 Bibliography 43 | |
dc.language.iso | en | |
dc.title | 在分散式環境下以正交轉換技術節省頻寬之最近鄰查詢演算法 | zh_TW |
dc.title | Exact kNN-Search with Orthogonal Transformation in the Distributed Environment | en |
dc.type | Thesis | |
dc.date.schoolyear | 102-2 | |
dc.description.degree | 碩士 | |
dc.contributor.coadvisor | 葉彌妍(Mi-Yen Yeh) | |
dc.contributor.oralexamcommittee | 許永真(Yung-jen Hsu),鄭卜壬(Pu-Jen Cheng),陳銘憲(Ming-Syan Chen),陳倩瑜(Chien-Yu Chen) | |
dc.subject.keyword | 確切相似性搜索,剪枝,傳輸量節省,正交轉換,最優化問題, | zh_TW |
dc.subject.keyword | Exact Similarity Search,Pruning,Bandwidth-efficient,Orthogonal Transformation,Optimization, | en |
dc.relation.page | 45 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2014-08-18 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
顯示於系所單位: | 資訊工程學系 |
文件中的檔案:
檔案 | 大小 | 格式 | |
---|---|---|---|
ntu-103-1.pdf 目前未授權公開取用 | 2.9 MB | Adobe PDF |
系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。