Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89131
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 張瑞益 | zh_TW
dc.contributor.advisor | Ray-I Chang | en
dc.contributor.author | 李孟哲 | zh_TW
dc.contributor.author | Meng-Che Lee | en
dc.date.accessioned | 2023-08-16T17:15:40Z | -
dc.date.available | 2023-11-09 | -
dc.date.copyright | 2023-08-16 | -
dc.date.issued | 2023 | -
dc.date.submitted | 2023-08-09 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89131 | -
dc.description.abstract | The emergence of Internet of Things (IoT) technology has brought many convenient services, and deploying sensor nodes makes it possible to collect many kinds of data. Because the collected data can involve sensitive information, our previous study proposed the Bounded-Error Data (BED) Privacy Protection (BEDPP) system framework to safeguard user data privacy. BEDPP consists of two subsystems: Bounded-Error IoT (BEIoT) and the BED Market on Blockchain (BEDMoB). BEIoT collects data and compresses it with one of our previously proposed bounded-error algorithms, Bounded-Error Run Length Encoding (BERLE) or Bounded-Error Huffman Coding (BEHC). BEDMoB provides users with BED according to the privacy permissions recorded in blockchain smart contracts, which yields a privacy-preserving data sharing service. While building a complete IoT data sharing process, BEDPP also reduces the amount of data each node transmits, which prolongs the service life of IoT devices. However, the framework did not consider, at the data collection stage, how important each feature is to the subsequent machine learning application. This study therefore adds Feature Importance Measurement (FIM) to the data collection stage of BEIoT: feature importance is measured against the expected machine learning objective, and the result serves as the error allocation policy during data collection. This improves compression and also makes BED more usable for machine learning. Compared with the original BEIoT system, which uses equal-proportion error allocation, adding the FIM model improves the compression rates of BERLE and BEHC by 12.55% and 19.97% on average, respectively, and improves the machine learning performance of BED with SVM, k-NN, and K-Means by up to 3.24%, 3.44%, and 26.20%, respectively. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-16T17:15:40Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2023-08-16T17:15:40Z (GMT). No. of bitstreams: 0 | en
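dc.description.tableofcontents | Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Feature Selection Algorithms
2.2 Bounded-Error Compression Algorithms
2.3 Machine Learning Models
Chapter 3 Methodology
3.1 Experimental Procedure and Architecture
3.1.1 Datasets
3.1.2 Dataset Handling for Clustering and Classification Algorithms
3.2 FIM Module
Chapter 4 Experimental Results
4.1 Effect of FIM-Based Error Allocation on Compression Performance
4.1.1 Effect of FIM Results on BERLE Performance
4.1.2 Effect of FIM Results on BEHC Performance
4.1.3 Comparison of BEHC and BERLE Performance with FIM Results
4.2 Effect of FIM-Based Compression on Machine Learning Performance
4.2.1 Effect of FIM-Based BERLE on Machine Learning Performance
4.2.2 Effect of FIM-Based BEHC on Machine Learning Performance
4.2.3 Comparison of FIM-Based BERLE and BEHC in Machine Learning Performance
4.3 General Guidelines for Compression and Machine Learning Performance with FIM Results
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
5.3 BERLEHC
5.3.1 Compression Performance of BERLEHC with FIM Results
5.3.2 Machine Learning Performance of BERLEHC with FIM Results
References | -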
dc.language.iso | zh_TW | -
dc.subject | 壓縮 | zh_TW
dc.subject | 物聯網 | zh_TW
dc.subject | 機器學習 | zh_TW
dc.subject | 特徵重要度 | zh_TW
dc.subject | 有限誤差 | zh_TW
dc.subject | Bounded-Error | en
dc.subject | Machine Learning | en
dc.subject | Compression | en
dc.subject | IoT | en
dc.subject | Feature Importance | en
dc.title | 有限誤差資料特徵重要度對壓縮及機器學習準確度之影響 | zh_TW
dc.title | The Influence of Bounded-Error Data with Feature Importance on Compression and Machine Learning Accuracy | en
dc.type | Thesis | -
dc.date.schoolyear | 111-2 | -
dc.description.degree | Master's | -
dc.contributor.oralexamcommittee | 王家輝;尹邦嚴;丁肇隆 | zh_TW
dc.contributor.oralexamcommittee | Chia-Hui Wang; Peng-Yeng Yin; Chao-Lung Ting | en
dc.subject.keyword | 有限誤差,壓縮,特徵重要度,機器學習,物聯網 | zh_TW
dc.subject.keyword | Bounded-Error, Compression, Feature Importance, Machine Learning, IoT | en
dc.relation.page | 42 | -
dc.identifier.doi | 10.6342/NTU202303010 | -
dc.rights.note | Not authorized | -
dc.date.accepted | 2023-08-10 | -
dc.contributor.author-college | College of Engineering | -
dc.contributor.author-dept | Department of Engineering Science and Ocean Engineering | -
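
The abstract in this record describes two steps: an error budget is divided among features according to their measured importance, and each feature is then compressed so that no reconstructed value exceeds its error bound. The Python sketch below is only an illustration of that idea, not the thesis's FIM module or its BERLE algorithm. The function names (`allocate_error_bounds`, `bounded_error_rle`, `decode_runs`), the inverse-importance allocation rule, and the choice of the run midpoint as the stored value are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def allocate_error_bounds(importances: Sequence[float], total_error: float) -> List[float]:
    """Split `total_error` across features so less important features get more error.

    ASSUMPTION: this inverse-importance rule is hypothetical; the record does not
    state the allocation formula. Importances are assumed to be non-negative.
    """
    n = len(importances)
    if n < 2:
        return [total_error] * n
    total = sum(importances)
    if total <= 0:
        # No usable importance signal: fall back to an equal split.
        return [total_error / n] * n
    weights = [w / total for w in importances]
    # (1 - w) / (n - 1) sums to 1 over all features, so the budget is preserved.
    return [total_error * (1.0 - w) / (n - 1) for w in weights]


def bounded_error_rle(values: Sequence[float], bound: float) -> List[Tuple[float, int]]:
    """Greedy run-length encoding with a per-sample error bound.

    ASSUMPTION: a run is extended while max - min of its samples stays within
    2 * bound, and the run is stored as (midpoint, count). Every sample is then
    within `bound` of the stored midpoint.
    """
    runs: List[Tuple[float, int]] = []
    lo = hi = 0.0
    count = 0
    for v in values:
        if count == 0:
            lo = hi = v
            count = 1
            continue
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi - new_lo <= 2 * bound:
            lo, hi = new_lo, new_hi
            count += 1
        else:
            runs.append(((lo + hi) / 2, count))
            lo = hi = v
            count = 1
    if count:
        runs.append(((lo + hi) / 2, count))
    return runs


def decode_runs(runs: Sequence[Tuple[float, int]]) -> List[float]:
    """Expand (value, count) pairs back into a sample sequence."""
    return [value for value, count in runs for _ in range(count)]


if __name__ == "__main__":
    # Hypothetical importances for three sensor features and a total error budget.
    bounds = allocate_error_bounds([0.5, 0.3, 0.2], total_error=0.3)
    readings = [20.0, 20.1, 20.2, 25.0, 25.1, 30.0]
    runs = bounded_error_rle(readings, bound=0.15)
    print(bounds)
    print(runs)
    print(decode_runs(runs))
```

Under these assumptions, a feature with higher importance receives a tighter bound and therefore produces shorter runs, while a low-importance feature tolerates more error and compresses into fewer runs. The baseline described in the abstract corresponds to giving every feature the same share of the budget.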
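Appears in Collections: Department of Engineering Science and Ocean Engineering

Files in This Item:
File | Size | Format
ntu-111-2.pdf (not authorized for public access) | 1.9 MB | Adobe PDF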


All items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated.
