Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 電機工程學系
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80229
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor郭斯彥(Sy-Yen Kuo)
dc.contributor.authorYao-Wei Paien
dc.contributor.author白曜瑋zh_TW
dc.date.accessioned2022-11-24T03:02:54Z-
dc.date.available2021-08-06
dc.date.available2022-11-24T03:02:54Z-
dc.date.copyright2021-08-06
dc.date.issued2021
dc.date.submitted2021-07-26
dc.identifier.citation[1] P.­-N. Tan, M. Steinbach, and V. Kumar, Introduction to data mining. Pearson Education India, 2016. [2] C. M. Bishop, Pattern recognition and machine learning. springer, 2006. [3] T. J. Cleophas and A. H. Zwinderman, Machine learning in medicine-­a complete overview. Springer, 2015. [4] D. Jurafsky and J. H. Martin, Speech and Language Processing (2ndEdition). USA: Prentice­Hall, Inc., 2009. [5] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern classification and scene analysis. Wiley New York, 1973, vol. 3. [6] E. Fix and J. Hodges Jr, “Discriminatory analysis­-nonparametric discrimination: Consistency properties,” CALIFORNIA UNIV BERKELEY, Tech. Rep., 1951. [7] T. Cover and P. Hart, “Nearest neighbor pattern classification,” IEEE transactions on information theory, vol. 13, no. 1, pp. 21–27, 1967. [8] S. Hansell, “Aol removes search data on vast group of web users,” New York Times, vol. 8, no. C4, 2006. [9] A. Narayanan and V. Shmatikov, “Robust de­anonymization of large sparse datasets,” in 2008 IEEE Symposium on Security and Privacy (sp 2008). IEEE,2008, pp. 111–125. [10] P. Samarati and L. Sweeney, “Protecting privacy when disclosing information: k­anonymity and its enforcement through generalization and suppression,” SRI Com­puter Science Laboratory, Tech. Rep., 1998. [11] A. Machanavajjhala, D. Kifer, J. Gehrke, and M. Venkitasubramaniam, “l­diversity: Privacy beyond k­anonymity,” ACM Transactions on Knowledge Discovery from Data (TKDD), vol. 1, no. 1, pp. 3–es, 2007. [12] N. Li, T. Li, and S. Venkatasubramanian, “t­closeness: Privacy beyond k­anonymityand l­diversity,” in 2007 IEEE 23rd International Conference on Data Engineering. IEEE, 2007, pp. 106–115. [13] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of cryptography conference. Springer, 2006, pp. 265–284. [14] C. Dwork, A. Roth et al., “The algorithmic foundations of differential privacy.”Foundations and Trends in Theoretical Computer Science, vol. 9, no. 3­4, pp. 211–407, 2014. [15] J. C. Duchi, M. I. Jordan, and M. J. Wainwright, “Local privacy and statistical min­imax rates,” in 2013 IEEE 54th Annual Symposium on Foundations of ComputerScience. IEEE, 2013, pp. 429–438. [16] Ú. Erlingsson, V. Pihur, and A. Korolova, “Rappor: Randomized aggregatable privacy­preserving ordinal response,” in Proceedings of the 2014 ACM SIGSA Ccon­ference on computer and communications security, 2014, pp. 1054–1067. [17] Differential Privacy Team, “Learning with privacy at scale,” Apple, Tech. Rep.,2017. [18] B. Ding, J. Kulkarni, and S. Yekhanin, “Collecting telemetry data privately,” in Ad­vances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Ben­gio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., vol. 30. CurranAssociates, Inc., 2017. [19] M. E. Gursoy, A. Inan, M. E. Nergiz, and Y. Saygin, “Differentially private nearest neighbor classification,” Data Mining and Knowledge Discovery, vol. 31, no. 5, pp.1544–1575, 2017. [20] T. Wang, J. Blocki, N. Li, and S. Jha, “Locally differentially private protocols for frequency estimation,” in 26th {USENIX} Security Symposium ({USENIX} Security17), 2017, pp. 729–745.[21] T. Wang, M. Lopuhaä­-Zwakenberg, Z. Li, B. Skoric, and N. Li, “Locally differen­tially private frequency estimation with consistency,” in 27th Annual Network and Distributed System Security Symposium, NDSS 2020, San Diego, California, USA, February 23­26, 2020. The Internet Society, 2020. [22] B. H. Bloom, “Space/time trade­offs in hash coding with allowable errors,” Commu­nications of the ACM, vol. 13, no. 7, pp. 422–426, 1970. [23] S. L. Warner, “Randomized response: A survey technique for eliminating evasive answer bias,” Journal of the American Statistical Association, vol. 60, no. 309, pp.63–69, 1965. [24] Z. Qin, Y. Yang, T. Yu, I. Khalil, X. Xiao, and K. Ren, “Heavy hitter estimation over set­valued data with local differential privacy,” in Proceedings of the 2016ACM SIGSAC Conference on Computer and Communications Security, ser. CCS’16. New York, NY, USA: Association for Computing Machinery, 2016, p. 192–203. [25] T. Wang, N. Li, and S. Jha, “Locally differentially private frequent itemset mining,” in 2018 IEEE Symposium on Security and Privacy (SP), 2018, pp. 127–143 [26] T. T. Nguyên, X. Xiao, Y. Yang, S. C. Hui, H. Shin, and J. Shin, “Collecting and an­alyzing data from smart device users with local differential privacy,”arXiv preprint arXiv:1606.05053, 2016.[27] N. Wang, X. Xiao, Y. Yang, J. Zhao, S. C. Hui, H. Shin, J. Shin, and G. Yu, “Col­lecting and analyzing multidimensional data with local differential privacy,” in 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 2019, pp. 638–649. [28] Q. Ye, H. Hu, X. Meng, and H. Zheng, “Privkv: Key­-value data collection with local differential privacy,” in 2019 IEEE Symposium on Security and Privacy (SP), 2019, pp. 317–331. [29] X. Gu, M. Li, Y. Cheng, L. Xiong, and Y. Cao, “{PCKV}: Locally differentially pri­vate correlated key­-value data collection with optimized utility,” in 29th {USENIX}Security Symposium ({USENIX}Security 20), 2020, pp. 967–984. [30] X. Ren, C. Yu, W. Yu, S. Yang, X. Yang, J. A. McCann, and P. S. Yu, “LoPub: High­dimensional crowdsourced data publication with local differential privacy,” IEEE transactions on Information Forensics and Security, vol. 13, no. 9, pp. 2151–2166, 2018. [31] Z. Li, T. Wang, M. Lopuhaä­-Zwakenberg, N. Li, and B. Škoric, “Estimating numer­ical distributions under local differential privacy,” in Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 2020, pp. 621–635. [32] P. C. M. Arachchige, P. Bertok, I. Khalil, D. Liu, S. Camtepe, and M. Atiquzza­man, “Local differential privacy for deep learning,” IEEE Internet of Things Journal, vol. 7, no. 7, pp. 5827–5842, 2019. [33] Y. Zhu, X. Yu, M. Chandraker, and Y.­X. Wang, “Private-­knn: Practical differen­tial privacy for computer vision,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11854–11862. [34] P. Schoppmann, L. Vogelsang, A. Gascón, and B. Balle, “Secure and scalable doc­ument similarity on distributed databases: Differential privacy to the rescue,” Pro­ceedings on Privacy Enhancing Technologies, vol. 2020, no. 2, pp. 209–229, 2020. [35] N. Li, M. Lyu, D. Su, and W. Yang, “Differential privacy: From theory to practice,” Synthesis Lectures on Information Security, Privacy, Trust, vol. 8, no. 4, pp. 1–138, 2016. [36] F. McSherry and K. Talwar, “Mechanism design via differential privacy,” in 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS’07). IEEE, 2007, pp. 94–103. [37] C. Yin, B. Zhou, Z. Yin, and J. Wang, “Local privacy protection classification based on human­ centric computing,” Human­ centric Computing and Information Sciences, vol. 9, no. 1, pp. 1–14, 2019. [38] Q. Xue, Y. Zhu, and J. Wang, “Joint distribution estimation and naïve bayes classi­fication under local differential privacy,” IEEE Transactions on Emerging Topics in Computing, 2019. [39] E. Yilmaz, M. Al­Rubaie, and J. M. Chang, “Locally differentially private naive bayes classification,” arXiv preprint arXiv:1905.01039, 2019. [40] W. Qardaji, W. Yang, and N. Li, “Differentially private grids for geospatial data,” in 2013 IEEE 29th international conference on data engineering (ICDE). IEEE,2013, pp. 757–768. [41] D. Su, J. Cao, N. Li, E. Bertino, and H. Jin, “Differentially private k­-means cluster­ing,” in Proceedings of the sixth ACM conference on data and application security and privacy, 2016, pp. 26–37. [42] N. Metropolis and S. Ulam, “The monte carlo method,” Journal of the American statistical association, vol. 44, no. 247, pp. 335–341, 1949. [43] W. Fan, J. He, M. Guo, P. Li, Z. Han, and R. Wang, “Privacy preserving classification on local differential privacy in data centers,” Journal of Parallel and Distributed Computing, vol. 135, pp. 70–82, 2020. [44] K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft, “When is“nearest neighbor'meaningful?” in International conference on database theory. Springer, 1999, pp.217–235. [45] O. Kramer, Dimensionality reduction with unsupervised nearest neighbors. Springer, 2013. [46] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blon­del, P. Prettenhofer, R. Weiss, V. Dubourg et al., “Scikit-learn: Machine learning in python,” the Journal of machine Learning research, vol. 12, pp. 2825–2830, 2011. [47] D. Dua and C. Graff, “UCI machine learning repository,” 2017. [Online]. Available: http://archive.ics.uci.edu/ml [48] J. Alcalá­Fdez, A. Fernández, J. Luengo, J. Derrac, S. García, L. Sánchez, and F. Her­rera, “Keel data ­mining software tool: data set repository, integration of algorithms and experimental analysis framework.”Journal of Multiple­Valued Logic Soft Computing, vol. 17, 2011. [49] R. Marimont and M. Shapiro, “Nearest neighbour searches and the curse of dimen­sionality,” IMA Journal of Applied Mathematics, vol. 24, no. 1, pp. 59–70, 1979. [50] X. Xiong, S. Liu, D. Li, Z. Cai, and X. Niu, “A comprehensive survey on local differential privacy,” Security and Communication Networks, vol. 2020, 2020. [51] G. Cormode, S. Jha, T. Kulkarni, N. Li, D. Srivastava, and T. Wang, “Privacy at scale: Local differential privacy in practice,” in Proceedings of the 2018 Interna­tional Conference on Management of Data, 2018, pp. 1655–1658. [52] B. Bebensee, “Local differential privacy: a tutorial,” arXiv preprint arXiv:1907.11908, 2019.
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80229-
dc.description.abstract分類是機器學習(machine learning)中的一個核心問題,而K-最近鄰居分類法(k-nearest neighbors algorithm)因其簡單性而成為分類演算法的絕佳嘗試。但是,將K-最近鄰居分類法應用於敏感資料時,必須考量隱私問題。研究人員開發了本地端差分隱私 (local differential privacy) 框架,這是敏感資料分析的實質標準,以提供一種嚴格的方式來保護用戶的隱私。該框架通過在將資料發送到資料伺服器之前於本地擾亂客戶端的數據來保護用戶的隱私。而在K-最近鄰居分類問題中使用本地端差分隱私的概念,我們可以為用戶提供可靠的隱私保證,同時伺服器端仍然能從的敏感資料中收集有用的統計資訊。 盡我們所知,之前沒有關於本地端差分隱私下的K-最近鄰居分類法的相關研究。因此,在本論文中,我們提出了LDPKNN,一種符合本地端差分隱私的K-最近鄰居分類演算法。我們將K-最近鄰居分類問題重新表述為頻率估計(frequency estimation) 任務,並應用一致性(consistency)的後處理技術來實現在本地端差分隱私下分類。各種評估指標和資料集的實驗結果表明,我們提出的方法優於直觀的解決方案,並在隱私與效用性取得良好的平衡。zh_TW
dc.description.provenanceMade available in DSpace on 2022-11-24T03:02:54Z (GMT). No. of bitstreams: 1
U0001-0907202114592200.pdf: 2379018 bytes, checksum: d6a69ab4e5af8544dae591950d53d3b3 (MD5)
Previous issue date: 2021
en
dc.description.provenanceItem reinstated by admin ntu (admin@lib.ntu.edu.tw) on 2024-02-22T09:33:38Z
Item was in collections:
電機工程學系 (ID: 1c1c5c41-70be-41d2-9d08-6256e0538c45)
No. of bitstreams: 1
U0001-0907202114592200.pdf: 2379018 bytes, checksum: d6a69ab4e5af8544dae591950d53d3b3 (MD5)
en
dc.description.tableofcontents口試委員審定書i 誌謝iii 摘要v Abstract vii Contents ix List of Figures xiii List of Tables xv List of Algorithms xvii Chapter 1 Introduction 1 1.1 k-Nearest Neighbors Classification . . . . . . . . . . . . . . . . . . 1 1.2 Privacy at Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Differential Privacy to the Rescue . . . . . . . . . . . . . . . . . . . 2 1.4 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Chapter 2 Related Works 7 2.1 Local Differential Privacy and its Applications . . . . . . . . . . . . 7 2.2 Privacy-Preserving k-Nearest Neighbors Algorithm . . . . . . . . . . 9 Chapter 3 Preliminaries 11 3.1 Nearest Neighbors Classification . . . . . . . . . . . . . . . . . . . . 11 3.1.1 k-Nearest Neighbors Algorithm . . . . . . . . . . . . . . . . 11 3.1.2 Radius Neighbors Algorithm . . . . . . . . . . . . . . . . . . 12 3.2 Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.2.1 Basic Mechanisms for Differential Privacy . . . . . . . . . . 15 3.3 Local Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . 17 3.4 Frequency Oracles . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3.4.1 Direct Encoding . . . . . . . . . . . . . . . . . . . . . . . . 20 3.4.2 Unary Encoding . . . . . . . . . . . . . . . . . . . . . . . . 21 3.4.3 Hash Encoding . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.5 PostProcessing for Consistency . . . . . . . . . . . . . . . . . . . . 27 Chapter 4 Proposed Methods 29 4.1 The Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . 29 4.2 Naive Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 4.2.1 The Laplace Mechanism . . . . . . . . . . . . . . . . . . . . 32 4.2.2 Generalized Randomized Response . . . . . . . . . . . . . . 33 4.3 LDPKNN Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.3.1 Encoding and Perturbation Mechanism . . . . . . . . . . . . 34 4.3.2 PostProcessing for Consistency . . . . . . . . . . . . . . . . 37 4.3.3 Reformation from k-NN to r-N Classifier . . . . . . . . . . . 39 4.3.4 Main Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 42 4.4 Privacy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 Chapter 5 Experiments 47 5.1 Environment and Datasets . . . . . . . . . . . . . . . . . . . . . . . 47 5.2 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 5.3 Number of Neighbors Versus Utility . . . . . . . . . . . . . . . . . . 50 5.4 Privacy Budget Versus Utility . . . . . . . . . . . . . . . . . . . . . 55 Chapter 6 Conclusions and Future Works 59 Bibliography 61
dc.language.isoen
dc.title符合本地端差分隱私的最近鄰居分類法zh_TW
dc.titlek-Nearest Neighbors Under Local Differential Privacyen
dc.date.schoolyear109-2
dc.description.degree碩士
dc.contributor.oralexamcommittee顏嗣鈞(Hsin-Tsai Liu),雷欽隆(Chih-Yang Tseng),陳英一,游家牧
dc.subject.keyword本地端差分隱私,K-最近鄰居分類法,頻率估計,機器學習,zh_TW
dc.subject.keywordlocal differential privacy,k-nearest neighbors algorithm,frequency estimation,machine learning,en
dc.relation.page67
dc.identifier.doi10.6342/NTU202101363
dc.rights.note同意授權(限校園內公開)
dc.date.accepted2021-07-27
dc.contributor.author-college電機資訊學院zh_TW
dc.contributor.author-dept電機工程學研究所zh_TW
顯示於系所單位:電機工程學系

文件中的檔案:
檔案 大小格式 
U0001-0907202114592200.pdf
授權僅限NTU校內IP使用(校園外請利用VPN校外連線服務)
2.32 MBAdobe PDF檢視/開啟
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved