Tor Browser Setting Identification via Network Traffic Analysis (利用網路封包分析進行洋蔥網路瀏覽器設定偵測)

Chun-Ming Chang; 張均銘

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86093

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	蕭旭君(Hsu-Chun Hsiao)
dc.contributor.author	Chun-Ming Chang	en
dc.contributor.author	張均銘	zh_TW
dc.date.accessioned	2023-03-19T23:36:31Z	-
dc.date.copyright	2022-09-14
dc.date.issued	2022
dc.date.submitted	2022-09-08
dc.identifier.citation	[1] Configuration editor for Firefox. https://support.mozilla.org/en-US/kb/about-config-editor-firefox. [2] scikit-learn. https://scikit-learn.org/stable/. [3] Selenium page load documentation. https://www.selenium.dev/documentation/en/webdriver/page_loading_strategy/. [4] Attacks on Tor. https://github.com/Attacks-on-Tor/Attacks-on-Tor, February 2020. [5] How to shut Firefox up about updates. https://support.mozilla.org/en-US/questions/1289766, June 2020. [6] Tor metrics. https://metrics.torproject.org/userstats-relay-country.html, June 2020. [7] Tor metrics. https://metrics.torproject.org/networksize.html, June 2022. [8] A. Aksoy and M. H. Gunes. Operating system classification performance of TCP/IP protocol headers. In 2016 IEEE 41st Conference on Local Computer Networks Workshops (LCN Workshops), pages 112–120, 2016. [9] S. Bhat, D. Lu, A. Kwon, and S. Devadas. Var-CNN: A data-efficient website fingerprinting attack based on deep learning. Proceedings on Privacy Enhancing Technologies, 2019:292–310, 10 2019. [10] X. Cai, R. Nithyanand, and R. Johnson. CS-BuFLO: A congestion sensitive website fingerprinting defense. In Proceedings of the 13th Workshop on Privacy in the Electronic Society, WPES ’14, page 121–130, New York, NY, USA, 2014. Association for Computing Machinery. [11] X. Cai, R. Nithyanand, T. Wang, R. Johnson, and I. Goldberg. A systematic approach to developing and evaluating website fingerprinting defenses. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS ’14, page 227–238, New York, NY, USA, 2014. Association for Computing Machinery. [12] X. Cai, X. C. Zhang, B. Joshi, and R. Johnson. Touching from a distance: Website fingerprinting attacks and defenses. In Proceedings of the 2012 ACM Conference on Computer and Communications Security, CCS ’12, page 605–616, New York, NY, USA, 2012. Association for Computing Machinery. [13] C.-M. Chang, H.-C. Hsiao, T. Lynar, and T. Mori. Know your victim: Tor browser setting identification via network traffic analysis. In Companion Proceedings of the Web Conference 2022, WWW ’22, page 201–204, New York, NY, USA, 2022. Association for Computing Machinery. [14] G. Cherubin, R. Jansen, and C. Troncoso. Online website fingerprinting: Evaluating website fingerprinting attacks on Tor in the real world. [15] Cloudflare. Understanding Cloudflare Under Attack mode. https://support.cloudflare.com/hc/en-us/articles/200170076-Understanding-Cloudflare-Under-Attack-mode-advanced-DDOS-protection-July 2022. [16] A. Cuzzocrea, F. Martinelli, F. Mercaldo, and G. Vercelli. Tor traffic analysis and detection via machine learning techniques. In 2017 IEEE International Conference on Big Data (Big Data), pages 4474–4480, 2017. [17] Firefox DevTools. har-export-trigger. https://github.com/firefox-devtools/har-export-trigger, May 2022. [18] R. Dingledine, N. Mathewson, and P. F. Syverson. Tor: The second-generation onion router. In USENIX Security Symposium, 2004. [19] G. Draper-Gil, A. H. Lashkari, M. S. I. Mamun, and A. A. Ghorbani. Characterization of encrypted and VPN traffic using time-related features. In Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP), pages 407–414, 2016. [20] T. Ede, R. Bortolameotti, A. Continella, J. Ren, D. Dubois, M. Lindorfer, D. Choffnes, M. Steen, and A. Peter. FlowPrint: Semi-supervised mobile-app fingerprinting on encrypted network traffic. 01 2020. [21] T. Ede, R. Bortolameotti, A. Continella, J. Ren, D. Dubois, M. Lindorfer, D. Choffnes, M. Steen, and A. Peter. FlowPrint: Semi-supervised mobile-app fingerprinting on encrypted network traffic. 01 2020. [22] N. Msadek, R. Soua, and T. Engel. IoT device fingerprinting: Machine learning based encrypted traffic analysis. In IEEE Wireless Communications and Networking Conference, 2019. [23] J. Gong and T. Wang. Zero-delay lightweight defenses against website fingerprinting. USENIX Association, USA, 2020. [24] J. Hayes and G. Danezis. k-fingerprinting: A robust scalable website fingerprinting technique. In 25th USENIX Security Symposium (USENIX Security 16), pages 1187–1203, Austin, TX, Aug. 2016. USENIX Association. [25] N. P. Hoang, A. A. Niaki, J. Dalek, J. Knockel, P. Lin, B. Marczak, M. Crete-Nishihata, P. Gill, and M. Polychronakis. How great is the Great Firewall? Measuring China’s DNS censorship. In 30th USENIX Security Symposium (USENIX Security 21), pages 3381–3398. USENIX Association, Aug. 2021. [26] R. Jansen, M. Juarez, R. Galvez, T. Elahi, and C. Diaz. Inside job: Applying traffic analysis to measure Tor from within. 01 2018. [27] M. Juarez, S. Afroz, G. Acar, C. Diaz, and R. Greenstadt. A critical evaluation of website fingerprinting attacks. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, 2014. [28] M. Juarez, M. Imani, M. Perry, C. Diaz, and M. Wright. WTF-PAD: Toward an efficient website fingerprinting defense for Tor. 12 2015. [29] A. Kwon, M. AlSabah, D. Lazar, M. Dacier, and S. Devadas. Circuit fingerprinting attacks: Passive deanonymization of Tor hidden services. In Proceedings of the 24th USENIX Conference on Security Symposium, SEC’15, page 287–302, USA, 2015. USENIX Association. [30] S. Li, H. Guo, and N. Hopper. Measuring information leakage in website fingerprinting attacks and defenses. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS ’18, page 1977–1992, New York, NY, USA, 2018. Association for Computing Machinery. [31] M. Schwarz, F. Lackner, and D. Gruss. JavaScript template attacks: Automatically inferring host information for targeted exploits. In Proceedings of Network and Distributed System Security Symposium, 2019. [32] mikeperry. A critique of website traffic fingerprinting attacks. https://blog.torproject.org/critique-website-traffic-fingerprinting-attacks/, November 2013. [33] R. Overdorf, M. Juarez, G. Acar, R. Greenstadt, and C. Diaz. How unique is your .onion? An analysis of the fingerprintability of Tor onion services. 08 2017. [34] A. Panchenko, F. Lanze, A. Zinnen, M. Henze, J. Pennekamp, K. Wehrle, and T. Engel. Website fingerprinting at internet scale. 02 2016. [35] A. Panchenko, L. Niessen, A. Zinnen, and T. Engel. Website fingerprinting in onion routing based anonymization networks. In Proceedings of the 10th Annual ACM Workshop on Privacy in the Electronic Society, WPES ’11, page 103–114, New York, NY, USA, 2011. Association for Computing Machinery. [36] V. L. Pochat, T. V. Goethem, S. Tajalizadehkhoob, M. Korczynski, and W. Joosen. Tranco: A research-oriented top sites ranking hardened against manipulation. In Proceedings 2019 Network and Distributed System Security Symposium. Internet Society, 2019. [37] M. S. Rahman, P. Sirinam, N. Mathews, K. Gangadhara, and M. Wright. Tik-Tok: The utility of packet timing in website fingerprinting attacks. Proceedings on Privacy Enhancing Technologies, 2020:5–24, 07 2020. [38] V. Rimmer, D. Preuveneers, M. Juarez, T. Van Goethem, and W. Joosen. Automated website fingerprinting through deep learning. 02 2018. [39] S. D. Siby, M. Juárez, C. Díaz, N. Vallina-Rodriguez, and C. Troncoso. Encrypted DNS -> privacy? A traffic analysis perspective. ArXiv, abs/1906.09682, 2020. [40] P. Sirinam, M. Imani, M. Juarez, and M. Wright. Deep fingerprinting: Undermining website fingerprinting defenses with deep learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS ’18, page 1928–1943, New York, NY, USA, 2018. Association for Computing Machinery. [41] A. Sivanathan, D. Sherratt, H. H. Gharakheili, A. Radford, C. Wijenayake, A. Vishwanath, and V. Sivaraman. Characterizing and classifying IoT traffic in smart cities and campuses. In 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pages 559–564, 2017. [42] T. Wang, X. Cai, R. Nithyanand, R. Johnson, and I. Goldberg. Effective attacks and provable defenses for website fingerprinting. In Proceedings of the 23rd USENIX Conference on Security Symposium, SEC’14, page 143–157, USA, 2014. USENIX Association. [43] T. Wang and I. Goldberg. Improved website fingerprinting on Tor. In Proceedings of the 12th ACM Workshop on Privacy in the Electronic Society, WPES’13, page 201–212, New York, NY, USA, 2013. Association for Computing Machinery. [44] T. Wang and I. Goldberg. On realistically attacking Tor with website fingerprinting. Proceedings on Privacy Enhancing Technologies, 2016(4):21–36, 2016. [45] T. Wang and I. Goldberg. Walkie-talkie: An efficient defense against passive website fingerprinting attacks. In Proceedings of the 26th USENIX Conference on Security Symposium, SEC’17, page 1375–1390, USA, 2017. USENIX Association. [46] webfp. tor-browser-selenium. https://github.com/webfp/tor-browser-selenium, June 2020.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86093	-
dc.description.abstract	隨著網路封包分析技術的進步，一些學者將封包分析的技術運用在洋蔥網路上並提出兩種新攻擊，分別是網站辨識 (Website Fingerprinting, WF) 跟洋蔥網路匿名服務辨識 (Hidden Service Fingerprinting, HSF)。這兩種攻擊能夠讓攻擊者透過監聽使用者到洋蔥網路之間的封包，利用封包傳遞的統計資料和機器學習的演算法來預測使用者瀏覽的網站或洋蔥網路服務。這兩種攻擊對使用者的隱私匿名性造成威脅。但 WF 和 HSF 目前仍然不構成大威脅，因為諸如瀏覽器的設定，網路狀況，使用者瀏覽習慣，這些原因都會影響到網站的網路封包的統計資料進而影響 WF和 HSF 的準確率。之前的研究 [27] 觀察到洋蔥網路瀏覽器版本會影響到封包的特徵，並提出提升 WF 和 HSF 準確率的條件就是攻擊者必須跟使用者使用相同的瀏覽器版本。基於以上的觀察，我們更進一步研究洋蔥網路瀏覽器的設定對於WF 的影響。我們著重在於瀏覽器版本以及安全性設定這兩項瀏覽器設定進行研究。我們提出了一種基於網路封包分析的方法來辨識使用者的瀏覽器設定的攻擊 (BSF)，並實做了一個分類器透過網路封包特徵來偵測使用者的瀏覽器版本和安全性設定。 BSF 的一個特色是，因為使用者不會很頻繁的更動瀏覽器的設定，因此 BSF 能夠透過多次的偵測來提高單一使用者瀏覽器設定的偵測準確度。在封閉世界的假設中，BSF 分類器能夠在使用者瀏覽七個網站後有高達 99% 機率能夠正確判斷瀏覽器版本，然後在開放世界的假設中則是需要使用者瀏覽 59 個網站後才能夠有 99%準確率。關於安全性設定，在封閉世界假設中，使用者瀏覽 19 個網站後會有 99%機率正確判斷安全性設定，在開放世界假設中，單次的安全性設定偵測只有 60%準確率。最後我們進行封包特徵的分析找到重要的網路封包特徵，和研究洋蔥網路瀏覽器的更新紀錄跟安全性設定對瀏覽器功能影響，並且分析 BSF 預測中，準確率前 10 高跟準確率前 10 低的網站和這些網站的特色。我們發現最高級別的安全設定會把許多瀏覽器送出的請求擋掉，但預設跟第二級別的安全設定中，我們沒辦法從網路封包特徵中看出安全性設定造成的差異。在瀏覽器版本的分析中，我們發現單純從瀏覽器發出的請求類別看不太出不同版本間是否有顯著的差異，但可以從瀏覽網站中平均傳遞跟收到的位元數目看到一些不同版本間微小的差異。根據我們的研究，我們提供洋蔥網路瀏覽器開發者跟一般的網站開發者一些抵抗 BSF 攻擊的建議。第一個方法是採用目前研究中針對 WF 的防禦方法，因為目前 WF 的防禦研究是隱藏原本網站的封包特徵，這樣的方法也適合拿來做為 BSF 攻擊的防禦機制。另一種方案是網站開發者可以採用一些針對大流量攻擊防護的服務，這些服務通常會產生一些具混淆性的封包跟暫時性的跳轉頁面，而額外的混淆性的封包跟跳轉頁面可以拿來隱藏原有網站的封包特徵，因而無法讓 BSF 的分類器蒐集到足夠的封包特徵和資訊做分類。	zh_TW
dc.description.abstract	Advances in Network Traffic Analysis (NTA) techniques have introduced new lines of de-anonymization attacks [4] against the Tor network, including Website Fingerprinting (WF) and Hidden Service Fingerprinting (HSF). These attacks use machine learning algorithms to analyze network traffic and identify which regular websites or Tor hidden services a Tor user visited, thus undermining the privacy protection provided by Tor. However, WF and HSF are far from practical in the real world [27, 34], because a real-world traffic trace is affected not only by the visited website but also by other factors such as browser settings, network conditions, and users’ access patterns. For example, previous work [27] observed that different Tor browser versions might affect the network traffic and argued that one challenge in constructing an effective WF adversary is ensuring that the adversary uses the same browser version as the user. Inspired by this observation, we investigate the impact of browser settings, specifically browser versions and security levels, on WF in the Tor network. After confirming that browser settings have a substantial impact on WF, we present a new NTA branch called Browser Setting Fingerprinting (BSF) and construct classifiers to identify a user’s Tor browser version and security level. Interestingly, unlike WF and HSF, BSF can improve its classification accuracy over time because users do not frequently change their browser settings. Our version classifier achieves over 99% accuracy when the user visits more than seven websites without changing the browser settings under the closed-world assumption, and 59 websites under the open-world assumption. Our security-level classifier also achieves over 99% accuracy when the user visits 19 websites without changing settings under the closed-world assumption. Under the open-world assumption, however, the security-level classifier achieves only 60% accuracy for a single detection. Last, we conduct an in-depth analysis to identify the most informative features, inspect the changelogs of Tor browsers, and investigate the root causes of the most and least accurate classification results. The safest security level significantly influences the number of JavaScript, font, and POST requests, while the standard and safer levels have indistinguishable traffic features. For the Tor browser version, we observe little traffic difference among browser versions in the examined request content types, but the average numbers of bytes sent and received show observable differences between Tor Browser Bundle (TBB) versions. Based on our findings, we provide recommendations for browser developers and web developers to defend against BSF, WF, and HSF in general. The first approach is to adopt existing WF defenses: because WF defenses aim to conceal websites’ traffic features, they may be effective against BSF as well. For the second approach, web developers can adopt a cloud-based DDoS protection service to obfuscate traffic patterns, as it first redirects the visit to a temporary website. This approach is particularly useful if the attacker terminates the website visit before the DDoS protection service actually redirects the visit to the targeted website, because in that case the attacker does not collect the website’s traffic patterns.	en
dc.description.provenance	Made available in DSpace on 2023-03-19T23:36:31Z (GMT). No. of bitstreams: 1 U0001-2908202223582800.pdf: 984839 bytes, checksum: 6eb229875719067cb3be3b7bb228fed3 (MD5) Previous issue date: 2022	en
dc.description.tableofcontents	摘要 (Chinese Abstract); Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1 Contributions; Chapter 2 Background and Related Work; 2.1 Tor; 2.2 Network Traffic Analysis (NTA); 2.3 NTA Targeting Tor; 2.3.1 Website Fingerprinting (WF); 2.3.2 Threat Model and Assumptions; 2.3.3 Traffic Features; 2.3.4 Defense; Chapter 3 Methodology; 3.1 Website List; 3.2 Crawler Implementation; 3.3 Data Collection; 3.4 Feature Selection; 3.5 Model Training; 3.6 Root Cause Analysis (RCA) Dataset; Chapter 4 Evaluation; 4.1 Cross-setting Examination (RQ1); 4.2 Version and Setting Classifier (RQ2); Chapter 5 Analysis (RQ3); 5.1 Feature Analysis; 5.2 Changelog Inspection; 5.3 Security Setting Inspection; 5.4 RCA Dataset Analysis; 5.4.1 Content-type Analysis; 5.4.2 Request and Response Byte Analysis; Chapter 6 Discussion; 6.1 Data Collection Pitfalls in the Tor Network; 6.2 Two-phase Classifier; 6.3 Potential Countermeasures; 6.4 Limitations; 6.5 Future Work; Chapter 7 Conclusion; References
dc.language.iso	en
dc.subject	隱私性	zh_TW
dc.subject	網站特徵追蹤	zh_TW
dc.subject	匿名性	zh_TW
dc.subject	洋蔥網路	zh_TW
dc.subject	網路封包分析	zh_TW
dc.subject	website fingerprinting	en
dc.subject	anonymity	en
dc.subject	privacy	en
dc.subject	network traffic analysis	en
dc.subject	Tor	en
dc.title	利用網路封包分析進行洋蔥網路瀏覽器設定偵測	zh_TW
dc.title	Tor Browser Setting Identification via Network Traffic Analysis	en
dc.type	Thesis
dc.date.schoolyear	110-2
dc.description.degree	Master’s
dc.contributor.oralexamcommittee	Tatsuya Mori, Timothy Lynar, 林忠緯 (Chung-Wei Lin)
dc.subject.keyword	匿名性, 隱私性, 網路封包分析, 洋蔥網路, 網站特徵追蹤	zh_TW
dc.subject.keyword	anonymity, privacy, network traffic analysis, Tor, website fingerprinting	en
dc.relation.page	55
dc.identifier.doi	10.6342/NTU202202954
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2022-09-12
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
dc.date.embargo-lift	2022-09-14	-
Appears in Collections:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
U0001-2908202223582800.pdf	961.76 kB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise specified.