於機率性資料庫中選擇具影響力物件之技術

Yu-Chieh Lin; 林與絜

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60358

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	陳銘憲(Ming-Syan Chen)
dc.contributor.author	Yu-Chieh Lin	en
dc.contributor.author	林與絜	zh_TW
dc.date.accessioned	2021-06-16T10:16:15Z	-
dc.date.available	2018-08-25
dc.date.copyright	2013-08-25
dc.date.issued	2013
dc.date.submitted	2013-08-18
dc.identifier.citation	[1] S. Abiteboul, P. C. Kanellakis, and G. Grahne, ”On the Representation and Querying of Sets of Possible Worlds,” Proc. ACM SIGMOD Int’l Conf. on Management of Data (SIGMOD), 1987. [2] P. K. Agarwal, A. Efrat, S. Sankararaman, and W. Zhang, ”Nearest-Neighbor Searching under Uncertainty,” Proc. 31st ACM Symposium on Principles of Database Systems (PODS), 2012. [3] C. C. Aggarwal, J. Han, J. Wang and P. S. Yu, ”A Framework for Clustering Evolving Data Streams,” Proc. 29th Int’l Conf. on Very Large Data Bases (VLDB), 2003. [4] C. C. Aggarwal and P. S. Yu, ”A Framework for Clustering Uncertain Data Streams,” Proc. 24th IEEE Int’l Conf. on Data Engineering (ICDE), 2008. [5] C.C. Aggarwal and P.S. Yu, ”Outlier Detection with Uncertain Data,” Proc. 8th SIAM Int’l Conf. on Data Mining (SDM), 2008. [6] C. C. Aggarwal and P. S. Yu, ”A Survey of Uncertain Data Algorithms and Applications,” IEEE Trans. on Knowledge and Data Engineering (TKDE), vol. 21, no. 5, pp. 609-623, May 2009. [7] R. Akbarinia, P. Valduriez, and G. Verger, ”Efficient Evaluation of SUM Queries over Probabilistic Data,” IEEE Trans. on Knowledge and Data Engineering (TKDE), vol.25, no. 4, pp.764-775, Apr. 2013. [8] M. Ankerst, M. M. Breunig, H.-P. Kriegel and J. Sander, ”OPTICS: Ordering Points to Identify the Clustering Structure,” Proc. ACM SIGMOD Int’l Conf. on Management of Data (SIGMOD), 1999. [9] M. Azizyan, I. Constandache, and R. Roy Choudhury, ”SurroundSense: Mobile Phone Localization via Ambience Fingerprinting,” Proc. 15th Annual Int’l Conf. on Mobile Computing and Networking (MobiCom), 2009. [10] P. Bahl and V. Padmanabhan, ”RADAR: An In-Building RF-based User Location and Tracking System,” Proc. 19th IEEE Int’l Conf. on Computer Communications (INFOCOM), 2000. [11] G. Beskales, M. A. Soliman, and I. F. Ilyas, ”Efficient Search for the Top-k Probable Nearest Neighbors in Uncertain Databases,” Proc. 34th Int’l Conf. on Very Large Data Bases (VLDB), 2008. [12] J. Bi and T. Zhang, ”Support Vector Machines with Input Data Uncertainty,” Proc. Advances in Neural Information Processing Systems (NIPS), 2004. [13] C. Bohm, F. Fiedler, A. Oswald, C. Plant, and B. Wackersreuther, ”Probabilistic Skyline Queries,” Proc. 18th ACM Conf. on Information and Knowledge Management (CIKM), 2009. [14] K. Chang and S. Hwang, ”Minimal Probing: Supporting Expensive Predicates for Top-k Queries,” Proc. ACM SIGMOD Int’l Conf. on Management of Data (SIGMOD), 2002. [15] J. Chen and R. Cheng, ”Quality-Aware Probing of Uncertain Data with Resource Constraints,” Proc. 20th Int’l Conf. on Scientific and Statistical Database Management (SSDBM), 2008. [16] J. Chen, R. Cheng, M. F. Mokbel, and C.-Y. Chow, ”Scalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data,” Proc. 35th Int’l Conf. on Very Large Data Bases (VLDB), 2009. [17] W. Chen, C. Wang, and Y. Wang, ”Scalable Influence Maximization for Prevalent Viral Marketing in Large-Scale Social Networks,” Proc. 16th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 2010. [18] R. Cheng, L. Chen, J. Chen, and X. Xie, ”Evaluating Probability Threshold k-Nearest-Neighbor Queries over Uncertain Data,” Proc. 12th Int’l Conf. on Extending Database Technology (EDBT), 2009. [19] R. Cheng, J. Chen, and X. Xie, ”Cleaning Uncertain Data with Quality Guarantees,” Proc. 34th Int’l Conf. on Very Large Data Bases (VLDB), 2008. [20] R. Cheng, D.V. Kalashnikov, and S. Prabhakar, ”Evaluating Probabilistic Queries over Imprecise Data,” Proc. ACM SIGMOD Int’l Conf. on Management of Data (SIGMOD), 2003. [21] R. Cheng, E. Lo, X. S. Yang, M.-H. Luk, X. Li, and X. Xie, ”Explore or Exploit? Effective Strategies for Disambiguating Large Databases,” Proc. 36th Int’l Conf. on Very Large Data Bases (VLDB), 2010. [22] R. Cheng, D. V. Kalashnikov, and S. Prabhakar, ”Querying Imprecise Data in Moving Object Environments,” IEEE Trans. on Knowledge and Data Engineering (TKDE), vol. 16, no. 9, pp. 1112-1127, Sep. 2004. [23] R. Cheng, Y. Xia, S. Prabhakar, R. Shah, and J. S. Vitter, ”Efficient Indexing Methods for Probabilistic Threshold Queries over Uncertain Data,” Proc. 30th Int’l Conf. on Very Large Data Bases (VLDB), 2004. [24] C. Chui, B. Kao, and E. Hung, ”Mining Frequent Itemsets from Uncertain Data,” Proc. 11th Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD), 2007. [25] H. Deng, I. King, and M.R. Lyu, ”Formal Models for Expert Finding on DBLP Bibliography Data,” Proc. of 8th IEEE Int’l Conf. on Data Mining (ICDM), 2008. [26] A. Deshpande, C. Guestrin, S. R. Madden, J. M. Hellerstein and W. Hong, ”Model-Driven Data Acquisition in Sensor Networks,” Proc. 30th Int’l Conf. on Very Large Data Bases (VLDB), 2004. [27] J. Dougherty, R. Kohavi, and M. Sahami, ”Supervised and Unsupervised Discretization of Continuous Features,” Proc. 12th Int’l Conf. on Machine Learning (ICML), 1995. [28] W. Elmenreich, ”Fusion of Continuous-valued Sensor Measurements using Confidence Weighted Averaging,” Journal of Vibration and Control (JVC), vol. 13, no. 9-10, pp.1303-1312, Sep. 2007. [29] M. Ester, H.-P. Kriegel, J. Sander and X. Xu, ”A Density Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise,” Proc. 2nd ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 1996. [30] C. Faloutsos, K.S. McCurley, and A. Tomkins, ”Fast Discovery of Connection Subgraphs,” Proc. 10th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 2004. [31] N. Fuhr and T. Rolleke, ”A Probabilistic Relational Algebra for the Integration of Information Retrieval and Database Systems,” ACM Trans. on Information Systems (TOIS), vol. 15, pp. 32-66, 1997. [32] T. Ge and S. B. Zdonik, ”Handling Uncertain Data in Array Database Systems,” Proc. 24th IEEE Int’l Conf. on Data Engineering (ICDE), 2008. [33] A. Goel, S. Guha, and K. Munagala, ”Asking the right questions: Model-driven Optimization using Probes,” Proc. 25th ACM Symposium on Principles of Database Systems (PODS), 2006. [34] A. Goel, S. Guha, and K. Munagala, ”How to Probe for an Extreme Value,” ACM Trans. on Algorithms (TALG), vol. 7, issue 1, Nov. 2010. [35] S. Gunnemann, H. Kremer, and T. Seidl, ”Subspace Clustering for Uncertain Data,” Proc. 10th SIAM Int’l Conf. on Data Mining (SDM), 2010. [36] D. Kempe, J. Kleinberg, and E. Tardos, ”Maximizing the Spread of Influence Through a Social Network,” Proc. 9th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 2003. [37] M. E. Khalefa, M. F. Mokbel, and J. J. Levandoski, ”Skyline Query Processing for Uncertain Data,” Proc. 19th ACM Conf. on Information and Knowledge Management (CIKM), 2010. [38] M. Kimura, K. Saito, and R. Nakano, ”Extracting Influential Nodes for Information Diffusion on a Social Network,” Proc. 22nd AAAI Conf. on Artificial Intelligence (AAAI), 2007. [39] Y. Koren, S.C. North, and C. Volinsky, ”Measuring and Extracting Proximity in Networks,” Proc. 12th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 2006. [40] H.-P. Kriegel, P. Kunath, and M. Renz, ”Probabilistic Nearest-Neighbor Query on Uncertain Objects,” Proc. 12th Int’l Conf. on Database Systems for Advanced Applications (DASFAA), 2007. [41] H.-P. Kriegel and M. Pfeifle, ”Density-Based Clustering of Uncertain Data,” Proc. 11th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 2005. [42] H.-P. Kriegel and M. Pfeifle, ”Hierarchical Density Based Clustering of Uncertain Data,” Proc. of 5th IEEE Int’l Conf. on Data Mining (ICDM), 2005. [43] H.-P. Kriegel and M. Pfeifle, ”Measuring the Quality of Approximated Clusterings,” Proc. BTW’05, 2005. Proc. 11th GI-Symposium Database Systems for Business, Technology and Web (BTW), 2005. [44] T. Lappas, K. Liu, and E. Terzi, ”Finding a Team of Experts in Social Networks,” Proc. 15th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 2009. [45] M. Ley, The DBLP (Digital Bibliography and Library Project) Computer Science Bibliography, http://www.informatik.uni-trier.de/˜ley/db/ [46] J. Li and A. Deshpande, ”Ranking Continuous Probabilistic Datasets,” Proc. 36th Int’l Conf. on Very Large Data Bases (VLDB), 2010. [47] X. Lian and L. Chen, ”Efficient Processing of Probabilistic Reverse Nearest Neighbor Queries over Uncertain Data,” Int’l Journal on Very Karge Data Bases (VLDB Journal), vol. 18, no. 3, pp. 787-808, June 2009. [48] X. Liu, M. Ye, J. Xu, Y. Tian, and W.-C. Lee, ”K-Selection Query over Uncertain Data,” Proc. 15th Int’l Conf. on Database Systems for Advanced Applications (DASFAA), 2010. [49] H. Lu, W. Pan, N. Lane, T. Choudhury, and A. Campbell, ”Soundsense: Scalable Sound Sensing for People-Centric Applications on Mobile Phones,” Proc. 7th Int’l Conf. on Mobile Systems, Applications, and Services (MobiSys), 2009. [50] L. V. S. Lakshmanan, N. Leone, R. Ross, and V. S. Subrahmanian, ”ProbView: A Flexible Probabilistic Database System,” ACM Trans. on Database Systems (TODS), vol. 22, no. 3, pp. 419-469, Sep. 1997. [51] Y. Ma, R. Hankins, and D. Racz, ”iLoc: a Framework for Incremental Locationstate Acquisition and Prediction Based on Mobile Sensors,” Proc. 18th ACM Conf. on Information and Knowledge Management (CIKM), 2009. [52] W. Ngai, B. Kao, C. Chui, R. Cheng, M. Chau and K. Y. Yip, ”Efficient Clustering of Uncertain Data,” Proc. of 6th IEEE Int’l Conf. on Data Mining (ICDM), 2006. [53] C. Olston, J. Jiang, and J. Widom, ”Adaptive Filters for Continuous Queries over Distributed Data Streams,” Proc. ACM SIGMOD Int’l Conf. on Management of Data (SIGMOD), 2003. [54] M. Potamias, F. Bonchi, A. Gionis, and G. Kollios, ”K-Nearest Neighbors in Uncertain Graphs,” Proc. 36th Int’l Conf. on Very Large Data Bases (VLDB), 2010. [55] S. Singh, C. Mayfield, R. Shah, S. Prabhakar, S. E. Hambrusch, J. Neville, and R. Cheng, ”Database Support for Probabilistic Attributes and Tuples,” Proc. 24th IEEE Int’l Conf. on Data Engineering (ICDE), 2008. [56] M.A.Soliman and I.F. Ilyas, ”Ranking with Uncertain Scores,” Proc. 25th IEEE Int’l Conf. on Data Engineering (ICDE), 2009. [57] M. A. Soliman, I. F. Ilyas, and K. C. -C. Chang, ”Top-k Query Processing in Uncertain Databases,” Proc. 23th IEEE Int’l Conf. on Data Engineering (ICDE), 2007. [58] H. Tong, S. Papadimitriou, C. Faloutsos, P.S. Yu and T. Eliassi-Rad, ”BASSET: Scalable Gateway Finder in Large Graphs,” Proc. 14th Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD), 2010. [59] Y. Wang, G. Cong, G. Song, and K. Xie, ”Community-based Greedy Algorithm for Mining Top-K Influential Nodes in Mobile Social Networks,” Proc. 16th ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 2010. [60] S. Wu, J. Tang, and B. Gao, ”Instant Social Graph Search,” Proc. 16th Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD), 2012. [61] B. Yang, H. Lu, and C. S. Jensen, ”Probabilistic Threshold K Nearest Neighbor Queries over Moving Objects in Symbolic Indoor Space,” Proc. 13th Int’l Conf. on Extending Database Technology (EDBT), 2010. [62] S. M. Yuen, Y. Tao, X. Xiao, J. Pei, and D. Zhang, ”Superseding Nearest Neighbor Search on Uncertain Spatial Databases,” IEEE Trans. on Knowledge and Data Engineering (TKDE), vol. 22, no. 7, pp. 1041-1055, July 2010. [63] Q. Zhang and F. Li and K. Yi, ”Finding Frequent Items in Probailistic Data,” Proc. ACM SIGMOD Int’l Conf. on Management of Data (SIGMOD), 2008. [64] Y. Zhang, X. Lin, G. Zhu,W. Zhang, and Q. Lin, ”Efficient Rank Based KNN Query Processing over Uncertain Data,” Proc. 26th IEEE Int’l Conf. on Data Engineering (ICDE), 2010. [65] K. Zheng, Z. Huang, A. Zhou, and X. Zhou, ”Discovering the Most Influential Sites over Uncertain Data: A Rank-Based Approach,” IEEE Trans. on Knowledge and Data Engineering (TKDE), vol. 24, no. 12, pp. 2156-2169, Dec. 2012. [66] Z. Zou, J. Li, H. Gao, and S. Zhang, ”Frequent Subgraph Pattern Mining on Uncertain Graph Data,” Proc. 18th ACM Conf. on Information and Knowledge Management (CIKM), 2009.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60358	-
dc.description.abstract	In this dissertation, we study how to select influential objects in probabilistic databases. For an uncertain dataset with probabilistic attribute values, we would like to tell which objects can best improve query or mining results if we can acquire their exact attribute values. The problem is explored on both clustering and the Probabilistic k-Nearest-Neighbor (k-PNN) query. We carefully define the metrics for evaluating the quality of the results of clustering and k-PNN query, and then we design algorithms to find the solutions according to the metrics correspondingly. For the k-PNN query, we provide optimal solutions of acquisition for nearest-neighbor query (1-PNN), and we propose a scalable algorithm solving the acquisition for k-PNN query with k > 1. Besides, for a social network dataset with edge probabilities, we would like to tell which neighboring nodes of the query node can best help gather specific information if these nodes are asked. We carefully formulate the problem according to the motivated scenario, and the proposed approach considers both the strength and the diversity of a node’s influence. We conduct experiments on various datasets, and the experimental results demonstrate the effectiveness and the efficiency of the proposed approaches.	en
dc.description.provenance	Made available in DSpace on 2021-06-16T10:16:15Z (GMT). No. of bitstreams: 1 ntu-102-F94921035-1.pdf: 1195650 bytes, checksum: d90311482284d1d21fa255b18193c22f (MD5) Previous issue date: 2013	en
dc.description.tableofcontents	1 Introduction 5 1.1 Motivation 5 1.2 Overview 6 1.2.1 Data Acquisition for Uncertain Clustering 6 1.2.2 Data Acquisition for Probabilistic Nearest-Neighbor Query 7 1.2.3 Guide Query in Sociel Networks 7 1.3 Organization 8 2 Data Acquisition for Uncertain Clustering 9 2.1 Introduction 9 2.2 Related Works 11 2.3 Problem Description 12 2.4 Selective Exact Value Acquisition for Clustering 14 2.4.1 Metric for Selection 15 2.4.2 Localized Metric for Selection 16 2.5 Experiments 17 2.5.1 The FDBSCAN Algorithm 17 2.5.2 Quality Evaluation 18 2.5.3 Experimental Setup and Results 19 2.6 Summary 22 3 Data Acquisition for Probabilistic Nearest-Neighbor Query 23 3.1 Introduction 23 3.2 Related Work 26 3.3 Problem Description 27 3.3.1 Problem Definition 28 3.3.2 Candidate Identification 30 3.4 Single Object Acquisition for 1-PNN 33 3.4.1 Problem Analysis 34 3.4.2 Representative New Answer of di,j ≤ d(1)max 37 3.4.3 Representative New Answer of di,j > d(1)max 40 3.4.4 Algorithm Design 42 3.5 Multiple Object Acquisition for 1-PNN 45 3.5.1 s-acquisition for 1-PNN 45 3.5.2 Set Quality Calculation 48 3.5.3 Set Quality Estimation 55 3.6 Acquisition for k-PNN 61 3.6.1 Major Answers and 1-Acquisition 62 3.6.2 Approach for s-Acquisition 65 3.6.3 Discussion 73 3.7 Experiments 75 3.7.1 Experimental Setup 76 3.7.2 Quality of Data Acquisition 76 3.7.3 Efficiency of Data Acquisition 78 3.7.4 Supplementary Experimental Results 81 3.8 Summary 85 4 Guide Query in Social Networks 86 4.1 Introduction 86 4.2 Proposed Framework 88 4.2.1 Problem Definition 88 4.2.2 The Basic Framework 89 4.2.3 The Advanced Framework 93 4.3 Experimental Evaluation 95 4.3.1 Implementation 95 4.3.2 Experimental Results 95 4.4 Related Work 99 4.5 Summary 100 5 Conclusion 101
dc.language.iso	en
dc.subject	機率性資料庫	zh_TW
dc.subject	群集分析	zh_TW
dc.subject	最鄰近k 點搜&#63850	zh_TW
dc.subject	社群網路	zh_TW
dc.subject	probabilistic database	en
dc.subject	social networks	en
dc.subject	nearest-neighbor query	en
dc.subject	clustering	en
dc.title	於機率性資料庫中選擇具影響力物件之技術	zh_TW
dc.title	Techniques for Selecting Influential Objects in Probabilistic Databases	en
dc.type	Thesis
dc.date.schoolyear	101-2
dc.description.degree	博士
dc.contributor.oralexamcommittee	陳良弼(Arbee L.P. Chen),李官陵(Guanling Lee),李旺謙(Wang-Chien Lee),曾新穆(Vincent S. Tseng),楊得年(De-Nian Yang)
dc.subject.keyword	機率性資料庫,群集分析,最鄰近k 點搜&#63850,社群網路,	zh_TW
dc.subject.keyword	probabilistic database,clustering,nearest-neighbor query,social networks,	en
dc.relation.page	108
dc.rights.note	有償授權
dc.date.accepted	2013-08-18
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	電機工程學研究所	zh_TW
顯示於系所單位：	電機工程學系

文件中的檔案：

檔案	大小	格式
ntu-102-1.pdf 未授權公開取用	1.17 MB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。