Learning Privacy-preserving Embeddings for Image Data to Be Published

Chu-Chen Li; 李筑真

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74555

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	林守德(Shou-De Lin)
dc.contributor.author	Chu-Chen Li	en
dc.contributor.author	李筑真	zh_TW
dc.date.accessioned	2021-06-17T08:42:26Z	-
dc.date.available	2021-08-13
dc.date.copyright	2019-08-13
dc.date.issued	2019
dc.date.submitted	2019-08-07
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74555	-
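dc.description.abstract	Deep learning has become a powerful technique in current research and has brought major progress on many problems. Convolutional neural networks and large amounts of image data have driven rapid, vigorous growth in image-processing research. However, applying deep learning to a problem inevitably raises the issue of privacy leakage. Because privacy is a basic human right, privacy leakage is a problem we need to overcome. With large-scale image data in use, leakage of both the original data and the sensitive information hidden in the images demands attention. We therefore advocate releasing privacy-preserving embeddings in place of the original images, especially for images that are hard to interpret with the naked eye, such as X-ray images that carry hidden gender information. Besides keeping users from directly accessing the original images, embeddings also reduce the risk that specific sensitive information leaks through the original images. To achieve this, after experimenting with different methods, we finally adopted the "hybrid" model. Hybrid is a multi-objective learning model whose core technique is feature disentangling, combined with initial weights pretrained by a specific method and adversarial examples as training inputs. The multi-objective network consists of shared bottom layers that serve as a feature extractor and two discriminators that handle the main task and the auxiliary task, respectively. The feature extractor and the auxiliary-task discriminator carry out an adversarial process to optimize the auxiliary task. We ran experiments on the MAFL face dataset and the chest X-ray dataset provided by NIH. The results show that embeddings generated by the hybrid extractor can successfully predict the main task while removing the designated sensitive information. Furthermore, we found that adding differential privacy to the embeddings obtained from the hybrid model yields better performance.	zh_TW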
dc.description.abstract	Deep learning is a powerful technique that has made great progress on a wide range of problems. Convolutional neural networks and massive image datasets have let research on image processing develop rapidly. However, whenever deep learning is used, privacy leakage must be considered as well. With large-scale image data, it is essential to protect both the original data and the sensitive information hidden in it. We therefore propose releasing privacy-preserving embeddings in place of the original image data, especially for images whose sensitive information is invisible to the naked eye, e.g., X-ray images that implicitly encode gender. The embeddings avoid leaking both the original image data and specific sensitive information. After experimenting with several methods, we finally propose the hybrid model. Hybrid is a multitask-learning model for disentangling features that starts from good initial weights obtained by iterative training and takes adversarial examples as inputs. The multitask network consists of shared bottom layers that act as a feature extractor and two discriminators, one for the main task and one for the sensitive task. The feature extractor and the sensitive discriminator engage in an adversarial process to optimize the sensitive loss. Experiments on the facial database MAFL and the medical image database NIH Chest X-ray demonstrate that embeddings generated by the hybrid extractor can predict the main task while the designated sensitive information is wiped out. Moreover, we find that combining the hybrid model with differential privacy leads to better performance.	en
dc.description.provenance	Made available in DSpace on 2021-06-17T08:42:26Z (GMT). No. of bitstreams: 1 ntu-108-R06946006-1.pdf: 5400412 bytes, checksum: fa11be7ba927eb2db95d0dd14d3d9180 (MD5) Previous issue date: 2019	en
dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgments; Abstract; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Background and Related Work; 2.1 Privacy Preserving on Machine Learning; 2.2 Differential Privacy; 2.3 Adversarial Attacks; 2.4 Feature Disentangling; Chapter 3 Privacy Preserving on Data; 3.1 Problem Description; 3.1.1 Privacy-preserving Embeddings Learning; 3.2 Evaluation Methods; Chapter 4 Methodology; 4.1 Random Labels Fitting; 4.2 Iterative Training; 4.3 Feature Disentangling; 4.4 Adversarial Examples; 4.5 Our Proposed Method: Hybrid; Chapter 5 Experiments; 5.1 Dataset; 5.1.1 Multi-Attribute Facial Landmark (MAFL) Dataset; 5.1.2 Random Samples of NIH Chest X-rays Dataset; 5.2 Evaluation Metrics; 5.3 Result Analysis and Discussion; 5.3.1 Experiments on MAFL Dataset; 5.3.2 Experiments on Random Samples of NIH Chest X-rays Dataset; 5.4 Discussion; 5.4.1 Influence of Correlation between Tasks; 5.4.2 Validation of Information Removing; 5.4.3 Comparison on Different Models; 5.4.4 Influence of Model Complexity; 5.5 Embeddings with Differential Privacy; Chapter 6 Conclusions; Chapter 7 Future Work; Bibliography
dc.language.iso	en
dc.subject	隱私保護	zh_TW
dc.subject	深度學習	zh_TW
dc.subject	醫學影像	zh_TW
dc.subject	差分隱私	zh_TW
dc.subject	Deep Learning	en
dc.subject	Medical Images	en
dc.subject	Privacy Preserving	en
dc.subject	Differential Privacy	en
dc.title	學習針對即將發佈的影像資料之隱私保護嵌入	zh_TW
dc.title	Learning Privacy-preserving Embeddings for Image Data to Be Published	en
dc.type	Thesis
dc.date.schoolyear	107-2
dc.description.degree	Master
dc.contributor.coadvisor	葉彌妍(Mi-Yen Yeh)
dc.contributor.oralexamcommittee	陳銘憲(Ming-Syan Chen),林軒田(Hsuan-Tien Lin),蔡銘峰(Ming-Feng Tsai)
dc.subject.keyword	醫學影像,深度學習,隱私保護,差分隱私	zh_TW
dc.subject.keyword	Medical Images,Deep Learning,Privacy Preserving,Differential Privacy	en
dc.relation.page	50
dc.identifier.doi	10.6342/NTU201902724
dc.rights.note	Licensed for a fee (有償授權)
dc.date.accepted	2019-08-07
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資料科學學位學程	zh_TW
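
The abstract above describes a shared feature extractor trained against a sensitive-task discriminator. The sketch below illustrates that adversarial objective on toy data. It is a minimal illustration, not the thesis implementation: the one-unit linear extractor, the logistic heads, the weight `lam`, and the objective "main loss minus `lam` times sensitive loss" are all assumptions chosen for clarity.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def bce(p, label):
    # Binary cross-entropy for one example; eps guards against log(0).
    eps = 1e-12
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))

def extract(x, w):
    # Stand-in for the shared layers: a single linear unit (assumption).
    return sum(wi * xi for wi, xi in zip(w, x))

def head(z, params):
    # Stand-in for a discriminator: logistic regression on the embedding.
    a, b = params
    return sigmoid(a * z + b)

def mean_losses(batch, w, main_head, sens_head):
    main = sens = 0.0
    for x, y, s in batch:
        z = extract(x, w)
        main += bce(head(z, main_head), y)
        sens += bce(head(z, sens_head), s)
    return main / len(batch), sens / len(batch)

def extractor_objective(main_loss, sens_loss, lam=1.0):
    # The extractor wants a low main loss and a HIGH sensitive loss;
    # each discriminator separately minimizes its own loss.
    return main_loss - lam * sens_loss

# Toy batch of (x, main label, sensitive label):
# x[0] carries the main label, x[1] carries the sensitive label.
BATCH = [((1, 1), 1, 1), ((1, -1), 1, 0), ((-1, 1), 0, 1), ((-1, -1), 0, 0)]
HEAD = (1.0, 0.0)

# Extractor that keeps only x[0] versus one that keeps only x[1].
clean = extractor_objective(*mean_losses(BATCH, (4.0, 0.0), HEAD, HEAD))
leaky = extractor_objective(*mean_losses(BATCH, (0.0, 4.0), HEAD, HEAD))
# Best the sensitive head can do on the clean embedding: predict 0.5.
chance = mean_losses(BATCH, (4.0, 0.0), HEAD, (0.0, 0.0))[1]
```

In this toy setup the extractor that ignores `x[1]` scores lower (better) on the objective than the one that exposes it, and the sensitive head is left at chance level on the clean embedding.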
Appears in Collections:	資料科學學位學程 (Data Science Degree Program)
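
The abstract also reports that combining the hybrid embeddings with differential privacy improves results, but this record does not say which mechanism was used. As a hedged illustration only, the sketch below applies the standard Laplace mechanism to a single embedding vector. The clipping bound `c`, the function names, and the choice of the Laplace mechanism are assumptions, not details taken from the thesis.

```python
import random

def clip(vec, c):
    # Bound every coordinate to [-c, c] so the sensitivity is finite.
    return [max(-c, min(c, v)) for v in vec]

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatize(embedding, epsilon, c=1.0, rng=None):
    """Release a noisy copy of `embedding` (illustrative sketch)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    clipped = clip(embedding, c)
    # Two clipped vectors differ by at most 2*c per coordinate,
    # so the L1 sensitivity is 2*c*d for a d-dimensional embedding.
    scale = 2.0 * c * len(clipped) / epsilon
    return [v + laplace_noise(scale, rng) for v in clipped]

# Smaller epsilon means more noise and stronger privacy.
noisy = privatize([0.5, -3.0, 2.0], epsilon=1.0, rng=random.Random(0))
```

The noise scale grows with the embedding dimension, so in practice a lower-dimensional embedding or a tighter clipping bound keeps more utility at the same `epsilon`.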

Files in This Item:

File	Size	Format
ntu-108-1.pdf (not authorized for public access)	5.27 MB	Adobe PDF


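Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.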