請用此 Handle URI 來引用此文件:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80229
完整後設資料紀錄
DC 欄位 | 值 | 語言 |
---|---|---|
dc.contributor.advisor | 郭斯彥(Sy-Yen Kuo) | |
dc.contributor.author | Yao-Wei Pai | en |
dc.contributor.author | 白曜瑋 | zh_TW |
dc.date.accessioned | 2022-11-24T03:02:54Z | - |
dc.date.available | 2021-08-06 | |
dc.date.available | 2022-11-24T03:02:54Z | - |
dc.date.copyright | 2021-08-06 | |
dc.date.issued | 2021 | |
dc.date.submitted | 2021-07-26 | |
dc.identifier.citation | [1] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to data mining. Pearson Education India, 2016. [2] C. M. Bishop, Pattern recognition and machine learning. springer, 2006. [3] T. J. Cleophas and A. H. Zwinderman, Machine learning in medicine-a complete overview. Springer, 2015. [4] D. Jurafsky and J. H. Martin, Speech and Language Processing (2ndEdition). USA: PrenticeHall, Inc., 2009. [5] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern classification and scene analysis. Wiley New York, 1973, vol. 3. [6] E. Fix and J. Hodges Jr, “Discriminatory analysis-nonparametric discrimination: Consistency properties,” CALIFORNIA UNIV BERKELEY, Tech. Rep., 1951. [7] T. Cover and P. Hart, “Nearest neighbor pattern classification,” IEEE transactions on information theory, vol. 13, no. 1, pp. 21–27, 1967. [8] S. Hansell, “Aol removes search data on vast group of web users,” New York Times, vol. 8, no. C4, 2006. [9] A. Narayanan and V. Shmatikov, “Robust deanonymization of large sparse datasets,” in 2008 IEEE Symposium on Security and Privacy (sp 2008). IEEE,2008, pp. 111–125. [10] P. Samarati and L. Sweeney, “Protecting privacy when disclosing information: kanonymity and its enforcement through generalization and suppression,” SRI Computer Science Laboratory, Tech. Rep., 1998. [11] A. Machanavajjhala, D. Kifer, J. Gehrke, and M. Venkitasubramaniam, “ldiversity: Privacy beyond kanonymity,” ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 1, no. 1, pp. 3–es, 2007. [12] N. Li, T. Li, and S. Venkatasubramanian, “tcloseness: Privacy beyond kanonymityand ldiversity,” in 2007 IEEE 23rd International Conference on Data Engineering. IEEE, 2007, pp. 106–115. [13] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of cryptography conference. Springer, 2006, pp. 265–284. [14] C. Dwork, A. Roth et al., “The algorithmic foundations of differential privacy.”Foundations and Trends in Theoretical Computer Science, vol. 9, no. 34, pp. 211–407, 2014. [15] J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical minimax rates,” in 2013 IEEE 54th Annual Symposium on Foundations of ComputerScience. IEEE, 2013, pp. 429–438. [16] Ú. Erlingsson, V. Pihur, and A. Korolova, “Rappor: Randomized aggregatable privacypreserving ordinal response,” in Proceedings of the 2014 ACM SIGSA Cconference on computer and communications security, 2014, pp. 1054–1067. [17] Differential Privacy Team, “Learning with privacy at scale,” Apple, Tech. Rep.,2017. [18] B. Ding, J. Kulkarni, and S. Yekhanin, “Collecting telemetry data privately,” in Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., vol. 30. CurranAssociates, Inc., 2017. [19] M. E. Gursoy, A. Inan, M. E. Nergiz, and Y. Saygin, “Differentially private nearest neighbor classification,” Data Mining and Knowledge Discovery, vol. 31, no. 5, pp.1544–1575, 2017. [20] T. Wang, J. Blocki, N. Li, and S. Jha, “Locally differentially private protocols for frequency estimation,” in 26th {USENIX} Security Symposium ({USENIX} Security17), 2017, pp. 729–745.[21] T. Wang, M. Lopuhaä-Zwakenberg, Z. Li, B. Skoric, and N. Li, “Locally differentially private frequency estimation with consistency,” in 27th Annual Network and Distributed System Security Symposium, NDSS 2020, San Diego, California, USA, February 2326, 2020. The Internet Society, 2020. [22] B. H. Bloom, “Space/time tradeoffs in hash coding with allowable errors,” Communications of the ACM, vol. 13, no. 7, pp. 422–426, 1970. [23] S. L. Warner, “Randomized response: A survey technique for eliminating evasive answer bias,” Journal of the American Statistical Association, vol. 60, no. 309, pp.63–69, 1965. [24] Z. Qin, Y. Yang, T. Yu, I. Khalil, X. Xiao, and K. Ren, “Heavy hitter estimation over setvalued data with local differential privacy,” in Proceedings of the 2016ACM SIGSAC Conference on Computer and Communications Security, ser. CCS’16. New York, NY, USA: Association for Computing Machinery, 2016, p. 192–203. [25] T. Wang, N. Li, and S. Jha, “Locally differentially private frequent itemset mining,” in 2018 IEEE Symposium on Security and Privacy (SP), 2018, pp. 127–143 [26] T. T. Nguyên, X. Xiao, Y. Yang, S. C. Hui, H. Shin, and J. Shin, “Collecting and analyzing data from smart device users with local differential privacy,”arXiv preprint arXiv:1606.05053, 2016.[27] N. Wang, X. Xiao, Y. Yang, J. Zhao, S. C. Hui, H. Shin, J. Shin, and G. Yu, “Collecting and analyzing multidimensional data with local differential privacy,” in 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 2019, pp. 638–649. [28] Q. Ye, H. Hu, X. Meng, and H. Zheng, “Privkv: Key-value data collection with local differential privacy,” in 2019 IEEE Symposium on Security and Privacy (SP), 2019, pp. 317–331. [29] X. Gu, M. Li, Y. Cheng, L. Xiong, and Y. Cao, “{PCKV}: Locally differentially private correlated key-value data collection with optimized utility,” in 29th {USENIX}Security Symposium ({USENIX}Security 20), 2020, pp. 967–984. [30] X. Ren, C. Yu, W. Yu, S. Yang, X. Yang, J. A. McCann, and P. S. Yu, “LoPub: Highdimensional crowdsourced data publication with local differential privacy,” IEEE transactions on Information Forensics and Security, vol. 13, no. 9, pp. 2151–2166, 2018. [31] Z. Li, T. Wang, M. Lopuhaä-Zwakenberg, N. Li, and B. Škoric, “Estimating numerical distributions under local differential privacy,” in Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 2020, pp. 621–635. [32] P. C. M. Arachchige, P. Bertok, I. Khalil, D. Liu, S. Camtepe, and M. Atiquzzaman, “Local differential privacy for deep learning,” IEEE Internet of Things Journal, vol. 7, no. 7, pp. 5827–5842, 2019. [33] Y. Zhu, X. Yu, M. Chandraker, and Y.X. Wang, “Private-knn: Practical differential privacy for computer vision,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11854–11862. [34] P. Schoppmann, L. Vogelsang, A. Gascón, and B. Balle, “Secure and scalable document similarity on distributed databases: Differential privacy to the rescue,” Proceedings on Privacy Enhancing Technologies, vol. 2020, no. 2, pp. 209–229, 2020. [35] N. Li, M. Lyu, D. Su, and W. Yang, “Differential privacy: From theory to practice,” Synthesis Lectures on Information Security, Privacy, Trust, vol. 8, no. 4, pp. 1–138, 2016. [36] F. McSherry and K. Talwar, “Mechanism design via differential privacy,” in 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS’07). IEEE, 2007, pp. 94–103. [37] C. Yin, B. Zhou, Z. Yin, and J. Wang, “Local privacy protection classification based on human centric computing,” Human centric Computing and Information Sciences, vol. 9, no. 1, pp. 1–14, 2019. [38] Q. Xue, Y. Zhu, and J. Wang, “Joint distribution estimation and naïve bayes classification under local differential privacy,” IEEE Transactions on Emerging Topics in Computing, 2019. [39] E. Yilmaz, M. AlRubaie, and J. M. Chang, “Locally differentially private naive bayes classification,” arXiv preprint arXiv:1905.01039, 2019. [40] W. Qardaji, W. Yang, and N. Li, “Differentially private grids for geospatial data,” in 2013 IEEE 29th international conference on data engineering (ICDE). IEEE,2013, pp. 757–768. [41] D. Su, J. Cao, N. Li, E. Bertino, and H. Jin, “Differentially private k-means clustering,” in Proceedings of the sixth ACM conference on data and application security and privacy, 2016, pp. 26–37. [42] N. Metropolis and S. Ulam, “The monte carlo method,” Journal of the American statistical association, vol. 44, no. 247, pp. 335–341, 1949. [43] W. Fan, J. He, M. Guo, P. Li, Z. Han, and R. Wang, “Privacy preserving classification on local differential privacy in data centers,” Journal of Parallel and Distributed Computing, vol. 135, pp. 70–82, 2020. [44] K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft, “When is“nearest neighbor'meaningful?” in International conference on database theory. Springer, 1999, pp.217–235. [45] O. Kramer, Dimensionality reduction with unsupervised nearest neighbors. Springer, 2013. [46] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg et al., “Scikit-learn: Machine learning in python,” the Journal of machine Learning research, vol. 12, pp. 2825–2830, 2011. [47] D. Dua and C. Graff, “UCI machine learning repository,” 2017. [Online]. Available: http://archive.ics.uci.edu/ml [48] J. AlcaláFdez, A. Fernández, J. Luengo, J. Derrac, S. García, L. Sánchez, and F. Herrera, “Keel data mining software tool: data set repository, integration of algorithms and experimental analysis framework.”Journal of MultipleValued Logic Soft Computing, vol. 17, 2011. [49] R. Marimont and M. Shapiro, “Nearest neighbour searches and the curse of dimensionality,” IMA Journal of Applied Mathematics, vol. 24, no. 1, pp. 59–70, 1979. [50] X. Xiong, S. Liu, D. Li, Z. Cai, and X. Niu, “A comprehensive survey on local differential privacy,” Security and Communication Networks, vol. 2020, 2020. [51] G. Cormode, S. Jha, T. Kulkarni, N. Li, D. Srivastava, and T. Wang, “Privacy at scale: Local differential privacy in practice,” in Proceedings of the 2018 International Conference on Management of Data, 2018, pp. 1655–1658. [52] B. Bebensee, “Local differential privacy: a tutorial,” arXiv preprint arXiv:1907.11908, 2019. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80229 | - |
dc.description.abstract | 分類是機器學習(machine learning)中的一個核心問題,而K-最近鄰居分類法(k-nearest neighbors algorithm)因其簡單性而成為分類演算法的絕佳嘗試。但是,將K-最近鄰居分類法應用於敏感資料時,必須考量隱私問題。研究人員開發了本地端差分隱私 (local differential privacy) 框架,這是敏感資料分析的實質標準,以提供一種嚴格的方式來保護用戶的隱私。該框架通過在將資料發送到資料伺服器之前於本地擾亂客戶端的數據來保護用戶的隱私。而在K-最近鄰居分類問題中使用本地端差分隱私的概念,我們可以為用戶提供可靠的隱私保證,同時伺服器端仍然能從的敏感資料中收集有用的統計資訊。 盡我們所知,之前沒有關於本地端差分隱私下的K-最近鄰居分類法的相關研究。因此,在本論文中,我們提出了LDPKNN,一種符合本地端差分隱私的K-最近鄰居分類演算法。我們將K-最近鄰居分類問題重新表述為頻率估計(frequency estimation) 任務,並應用一致性(consistency)的後處理技術來實現在本地端差分隱私下分類。各種評估指標和資料集的實驗結果表明,我們提出的方法優於直觀的解決方案,並在隱私與效用性取得良好的平衡。 | zh_TW |
dc.description.provenance | Made available in DSpace on 2022-11-24T03:02:54Z (GMT). No. of bitstreams: 1 U0001-0907202114592200.pdf: 2379018 bytes, checksum: d6a69ab4e5af8544dae591950d53d3b3 (MD5) Previous issue date: 2021 | en |
dc.description.provenance | Item reinstated by admin ntu (admin@lib.ntu.edu.tw) on 2024-02-22T09:33:38Z Item was in collections: 電機工程學系 (ID: 1c1c5c41-70be-41d2-9d08-6256e0538c45) No. of bitstreams: 1 U0001-0907202114592200.pdf: 2379018 bytes, checksum: d6a69ab4e5af8544dae591950d53d3b3 (MD5) | en |
dc.description.tableofcontents | 口試委員審定書i 誌謝iii 摘要v Abstract vii Contents ix List of Figures xiii List of Tables xv List of Algorithms xvii Chapter 1 Introduction 1 1.1 k-Nearest Neighbors Classification . . . . . . . . . . . . . . . . . . 1 1.2 Privacy at Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Differential Privacy to the Rescue . . . . . . . . . . . . . . . . . . . 2 1.4 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Chapter 2 Related Works 7 2.1 Local Differential Privacy and its Applications . . . . . . . . . . . . 7 2.2 Privacy-Preserving k-Nearest Neighbors Algorithm . . . . . . . . . . 9 Chapter 3 Preliminaries 11 3.1 Nearest Neighbors Classification . . . . . . . . . . . . . . . . . . . . 11 3.1.1 k-Nearest Neighbors Algorithm . . . . . . . . . . . . . . . . 11 3.1.2 Radius Neighbors Algorithm . . . . . . . . . . . . . . . . . . 12 3.2 Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.2.1 Basic Mechanisms for Differential Privacy . . . . . . . . . . 15 3.3 Local Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . 17 3.4 Frequency Oracles . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3.4.1 Direct Encoding . . . . . . . . . . . . . . . . . . . . . . . . 20 3.4.2 Unary Encoding . . . . . . . . . . . . . . . . . . . . . . . . 21 3.4.3 Hash Encoding . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.5 PostProcessing for Consistency . . . . . . . . . . . . . . . . . . . . 27 Chapter 4 Proposed Methods 29 4.1 The Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . 29 4.2 Naive Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 4.2.1 The Laplace Mechanism . . . . . . . . . . . . . . . . . . . . 32 4.2.2 Generalized Randomized Response . . . . . . . . . . . . . . 33 4.3 LDPKNN Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.3.1 Encoding and Perturbation Mechanism . . . . . . . . . . . . 34 4.3.2 PostProcessing for Consistency . . . . . . . . . . . . . . . . 37 4.3.3 Reformation from k-NN to r-N Classifier . . . . . . . . . . . 39 4.3.4 Main Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 42 4.4 Privacy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 Chapter 5 Experiments 47 5.1 Environment and Datasets . . . . . . . . . . . . . . . . . . . . . . . 47 5.2 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 5.3 Number of Neighbors Versus Utility . . . . . . . . . . . . . . . . . . 50 5.4 Privacy Budget Versus Utility . . . . . . . . . . . . . . . . . . . . . 55 Chapter 6 Conclusions and Future Works 59 Bibliography 61 | |
dc.language.iso | en | |
dc.title | 符合本地端差分隱私的最近鄰居分類法 | zh_TW |
dc.title | k-Nearest Neighbors Under Local Differential Privacy | en |
dc.date.schoolyear | 109-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 顏嗣鈞(Hsin-Tsai Liu),雷欽隆(Chih-Yang Tseng),陳英一,游家牧 | |
dc.subject.keyword | 本地端差分隱私,K-最近鄰居分類法,頻率估計,機器學習, | zh_TW |
dc.subject.keyword | local differential privacy,k-nearest neighbors algorithm,frequency estimation,machine learning, | en |
dc.relation.page | 67 | |
dc.identifier.doi | 10.6342/NTU202101363 | |
dc.rights.note | 同意授權(限校園內公開) | |
dc.date.accepted | 2021-07-27 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電機工程學研究所 | zh_TW |
顯示於系所單位: | 電機工程學系 |
文件中的檔案:
檔案 | 大小 | 格式 | |
---|---|---|---|
U0001-0907202114592200.pdf 授權僅限NTU校內IP使用(校園外請利用VPN校外連線服務) | 2.32 MB | Adobe PDF | 檢視/開啟 |
系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。