Privacy Preservation for Distance and Correlation-based Mining Algorithms (距離及相關性資料探勘演算法的隱私權保護)

Chun-Wei Su; 蘇俊維

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8905

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳銘憲
dc.contributor.author	Chun-Wei Su	en
dc.contributor.author	蘇俊維	zh_TW
dc.date.accessioned	2021-05-20T20:03:49Z	-
dc.date.available	2009-08-20
dc.date.available	2021-05-20T20:03:49Z	-
dc.date.copyright	2009-08-20
dc.date.issued	2009
dc.date.submitted	2009-08-18
dc.identifier.citation	[1] E. Achtert, C. Böhm, H.-P. Kriegel, P. Kröger, and A. Zimek. Robust, complete, and efficient correlation clustering. In SDM. SIAM, 2007. [2] C. C. Aggarwal and P. S. Yu. A condensation approach to privacy preserving data mining. In EDBT, pages 183–199, 2004. [3] R. Agrawal and R. Srikant. Privacy-preserving data mining. SIGMOD Rec., 29(2):439–450, 2000. [4] F. Angiulli, S. Basta, and C. Pizzuti. Distance-based detection and prediction of outliers. IEEE Trans. on Knowl. and Data Eng., 18(2):145–160, 2006. [5] C. Böhm, K. Kailing, P. Kröger, and A. Zimek. Computing clusters of correlation connected objects. In SIGMOD Conference, pages 455–466. ACM, 2004. [6] K. Chen and L. Liu. Privacy preserving data classification with rotation perturbation. In ICDM '05: Proceedings of the Fifth IEEE International Conference on Data Mining, pages 589–592, Washington, DC, USA, 2005. IEEE Computer Society. [7] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proc. of 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96), pages 226–231, 1996. [8] M. A. Hall. Correlation-based feature selection for discrete and numeric class machine learning. In Proc. 17th International Conf. on Machine Learning, pages 359–366. Morgan Kaufmann, San Francisco, CA, 2000. [9] A. Hyvärinen. Independent Component Analysis. New York: Wiley, 2001. [10] D. Kifer and J. Gehrke. l-diversity: Privacy beyond k-anonymity. In ICDE, page 24, 2006. [11] E. M. Knorr, R. T. Ng, and V. Tucakov. Distance-based outliers: Algorithms and applications. VLDB Journal: Very Large Data Bases, 8(3–4):237–253, 2000. [12] N. Li and T. Li. t-closeness: Privacy beyond k-anonymity and l-diversity. In Proc. of IEEE 23rd Int'l Conf. on Data Engineering, ICDE, 2007. [13] J. T.-L. Wang, X. Wang, K.-I. Lin, D. Shasha, B. A. Shapiro, and K. Zhang. Evaluating a class of distance-mapping algorithms for data mining and clustering. In Knowledge Discovery and Data Mining, pages 307–311. ACM Press, 1999. [14] D. Lin, E. Bertino, R. Cheng, and S. Prabhakar. Position transformation: a location privacy protection method for moving objects. In SPRINGL '08: Proceedings of the SIGSPATIAL ACM GIS 2008 International Workshop on Security and Privacy in GIS and LBS, pages 62–71, New York, NY, USA, 2008. ACM. [15] K. Liu, H. Kargupta, and J. Ryan. Random projection-based multiplicative data perturbation for privacy preserving distributed data mining. IEEE Trans. on Knowl. and Data Eng., 18(1):92–106, 2006. [16] J. B. MacQueen. Some methods for classification and analysis of multivariate observations. In L. M. Le Cam and J. Neyman, editors, Proc. of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297. University of California Press, 1967. [17] S. Mukherjee, Z. Chen, and A. Gangopadhyay. A privacy-preserving technique for Euclidean distance-based mining algorithms using Fourier-related transforms. The VLDB Journal, 15(4):293–315, 2006. [18] S. Ross. First Course in Probability. Prentice Hall, 2005. [19] L. Sweeney. k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 2002. [20] V. S. Verykios, E. Bertino, I. N. Fovino, L. P. Provenza, Y. Saygin, and Y. Theodoridis. State-of-the-art in privacy preserving data mining. ACM SIGMOD Record, 33(1), 2004.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8905	-
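dc.description.abstract	This thesis designs a transformation that protects data privacy even when the data are sent to a third party for study. Most conventional transformation schemes suffer from two limitations: algorithm dependency and information loss. In this thesis, we propose a novel privacy-preserving scheme that is free of both limitations. We call this transformation algorithm FISIP: first order sum, second order sum, and inner product preservation. In particular, we prove that because FISIP preserves these three properties of the private data (first order sum, second order sum, and inner product), the data, once transformed into public data, can still be used by every algorithm that relies only on these three properties. Since distance and correlation can be derived from these three properties, every algorithm that relies only on distance and correlation remains applicable. FISIP is evaluated in two parts: the first is data usefulness and the second is data robustness. These two goals are inherently difficult to achieve at the same time. Nevertheless, our experimental results show that FISIP satisfies both. In summary, FISIP provides a transformation under which the distances and correlations of the original data before transformation and of the public data after transformation are identical. While the privacy of the data is protected, the mining quality on the transformed (public) data matches that on the original (private) data.	zh_TW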
dc.description.abstract	This paper devises a transformation scheme that protects data privacy when data must be sent to a third party for analysis. Most conventional transformation schemes suffer from two limitations: algorithm dependency and information loss. We propose a novel privacy-preserving scheme without these two limitations. The transformation algorithm is referred to as FISIP: FIrst and Second order sum and Inner product Preservation. Specifically, we prove that because FISIP preserves three basic properties of the private data (first order sum, second order sum, and inner products), any algorithm whose measures can be derived from these three properties can still be applied to the public data transformed by FISIP. Distance and correlation can both be derived from the three properties, so distance-based and correlation-based algorithms remain applicable. FISIP is evaluated in two parts: data usefulness and data robustness. These two goals are intrinsically difficult to achieve at the same time, yet our experimental results show that FISIP attains both. In all, FISIP provides a transformation that preserves the distances and correlations of the original private data after they are transformed into public data. As a result, privacy is protected while the mining quality obtained from the transformed (public) data is the same as that from the original (private) data.	en
dc.description.provenance	Made available in DSpace on 2021-05-20T20:03:49Z (GMT). No. of bitstreams: 1 ntu-98-R96942126-1.pdf: 530895 bytes, checksum: a846768574a4f784d7f180fd4da49718 (MD5) Previous issue date: 2009	en
dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements i; Chinese Abstract ii; ABSTRACT iii; CONTENTS iv; LIST OF FIGURES vi; LIST OF TABLES vii; Chapter 1 Introduction 1; Chapter 2 Preliminaries 4; 2.1 Related Work 4; 2.2 Problem Description 6; Chapter 3 Theoretical Properties of FISIP 8; Chapter 4 Perfect FISIP Transformation 14; 4.1 General Form Realization 14; 4.2 FISIP Matrices for Fast Computation 14; 4.3 Variation of Transformations 15; Chapter 5 Strong FISIP Transformation 16; 5.1 Privacy Enhancement via Matrix Perturbation 16; Chapter 6 Dimension Adaptation 16; 6.1 Up Dimension: from k to k + c 16; 6.2 Down Dimension: from k to k − c 20; Chapter 7 Experimental Results 22; 7.1 FISIP Preservation 22; 7.2 Neighborhood Preservation 23; 7.3 FISIP Preservation 27; 7.4 Neighborhood Preservation 28; 7.5 FISIP Preservation 29; Chapter 8 Conclusion 31; BIBLIOGRAPHY 32
dc.language.iso	en
dc.title	距離及相關性資料探勘演算法的隱私權保護	zh_TW
dc.title	Privacy Preservation for Distance and Correlation-based Mining Algorithms	en
dc.type	Thesis
dc.date.schoolyear	97-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	曾新穆,鄧維光,沈錳坤,葉彌妍
dc.subject.keyword	data mining,privacy preservation,distance,correlation	zh_TW
dc.subject.keyword	data mining,privacy preserving,distance-based,correlation-based	en
dc.relation.page	34
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2009-08-18
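dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Communication Engineering	zh_TW
Appears in Collections:	Graduate Institute of Communication Engineering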
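
The abstract states that distance and correlation can be derived from the first order sum, the second order sum, and the inner product. The sketch below illustrates that claim under stated assumptions: the "first order sum" is taken to be the sum of a vector's entries and the "second order sum" the sum of their squares, which this record does not define explicitly. The function `reflect_keeping_sums` is a hypothetical stand-in (a Householder reflection that fixes the all-ones vector), used only to show that some transformation can keep all three quantities unchanged. It is not the FISIP construction from the thesis, and all function names are illustrative.

```python
import math


def first_order_sum(x):
    # Assumed definition: sum of the entries.
    return sum(x)


def second_order_sum(x):
    # Assumed definition: sum of the squared entries.
    return sum(v * v for v in x)


def inner_product(x, y):
    return sum(a * b for a, b in zip(x, y))


def distance_from_properties(s2_x, s2_y, ip_xy):
    # ||x - y||^2 = sum(x^2) + sum(y^2) - 2 <x, y>
    return math.sqrt(max(s2_x + s2_y - 2.0 * ip_xy, 0.0))


def correlation_from_properties(n, s1_x, s1_y, s2_x, s2_y, ip_xy):
    # Pearson correlation written only in terms of the three properties.
    cov = n * ip_xy - s1_x * s1_y
    var_x = n * s2_x - s1_x ** 2
    var_y = n * s2_y - s1_y ** 2
    return cov / math.sqrt(var_x * var_y)


def measures(x, y):
    # Distance and correlation computed from the three properties alone.
    n = len(x)
    s1_x, s1_y = first_order_sum(x), first_order_sum(y)
    s2_x, s2_y = second_order_sum(x), second_order_sum(y)
    ip = inner_product(x, y)
    return (distance_from_properties(s2_x, s2_y, ip),
            correlation_from_properties(n, s1_x, s1_y, s2_x, s2_y, ip))


def reflect_keeping_sums(x, v):
    # Hypothetical transformation, NOT the thesis's FISIP matrices:
    # a Householder reflection whose normal w is orthogonal to the
    # all-ones vector, so sums, sums of squares, and inner products
    # are all unchanged. v must not be a constant vector.
    n = len(x)
    mean_v = sum(v) / n
    w = [vi - mean_v for vi in v]
    scale = 2.0 * inner_product(w, x) / inner_product(w, w)
    return [xi - scale * wi for xi, wi in zip(x, w)]


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
key = [3.0, 1.0, 4.0, 1.0]  # illustrative secret vector

print(measures(x, y))
print(measures(reflect_keeping_sums(x, key), reflect_keeping_sums(y, key)))
```

Both printed pairs should agree up to floating-point rounding, even though the transformed vectors differ from the originals. That is the sense in which an algorithm relying only on distance or correlation can run on the transformed data.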

Files in This Item:

File	Size	Format
ntu-98-1.pdf	518.45 kB	Adobe PDF	View/Open

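Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.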