Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68889

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 王勝德(Sheng-De Wang) | |
| dc.contributor.author | Jones Sai-Wang Wan | en |
| dc.contributor.author | 尹世泓 | zh_TW |
| dc.date.accessioned | 2021-06-17T02:40:31Z | - |
| dc.date.available | 2017-08-24 | |
| dc.date.copyright | 2017-08-24 | |
| dc.date.issued | 2017 | |
| dc.date.submitted | 2017-08-16 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68889 | - |
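| dc.description.abstract | Stream data mining is one of the common data mining methods in today's applications. However, the nature of real-world streaming data, especially concept drift, makes it challenging. To handle concept drift, a drift detection method is necessary when data labels are unavailable. In this paper, we propose a drift detection method based on statistical testing, in which a clustering algorithm groups the data as preprocessing and principal component analysis (PCA) performs feature extraction to reduce the data dimensionality and shorten execution time. Experimental results on synthetic and real-world streaming datasets show that clustering preprocessing improves the performance of drift detection and feature extraction, thereby improving detection performance and shortening execution time. | zh_TW |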
| dc.description.abstract | Stream data mining is a common data mining approach in today's real-world applications. However, it is challenging because of the nature of real-world data streams, especially concept drift. To handle concept drift when data labels are unavailable, a drift detection method is necessary. In this paper, we propose a drift detection method based on statistical testing with clustering as preprocessing, and we reduce the execution time by using principal component analysis (PCA) for feature extraction. Experimental results on synthetic and real-world streaming data show that clustering preprocessing improves drift detection performance, and that feature extraction trades an insignificant loss in detection performance for a large speedup in execution time. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-17T02:40:31Z (GMT). No. of bitstreams: 1 ntu-106-R04921087-1.pdf: 894262 bytes, checksum: fead8d6f3e9af401c8eb3469e1301370 (MD5) Previous issue date: 2017 | en |
| dc.description.tableofcontents | Chapter 1 Introduction; Chapter 2 Related Work (2.1 Passive Solutions; 2.2 Active Solutions); Chapter 3 Proposed Method (3.1 Feature Extraction; 3.2 Clustering; 3.3 Drift Detection); Chapter 4 Experimental Evaluation (4.1 Datasets: 4.1.1 Synthetic Datasets, 4.1.2 Real-World Datasets; 4.2 Experiment 1: Performance of Preprocessing; 4.3 Experiment 2: Performance of Proposed Method; 4.4 Experiment 3: Execution Time of Proposed Method); Chapter 5 Conclusion; References | |
| dc.language.iso | en | |
| dc.subject | 漂移檢測 | zh_TW |
| dc.subject | 概念漂移 | zh_TW |
| dc.subject | unsupervised | en |
| dc.subject | drift detection | en |
| dc.subject | stream data mining | en |
| dc.subject | concept drift | en |
| dc.title | 基於群聚及統計測試的概念漂移檢測方法 | zh_TW |
| dc.title | Concept Drift Detection Based on Pre-Clustering and Statistical Testing | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 105-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 雷欽隆(Chin-Laung Lei),蕭旭君(Hsu-Chun Hsiao),鄧惟中(Wei-Chung Teng) | |
| dc.subject.keyword | 概念漂移,漂移檢測 | zh_TW |
| dc.subject.keyword | concept drift, stream data mining, drift detection, unsupervised | en |
| dc.relation.page | 24 | |
| dc.identifier.doi | 10.6342/NTU201702576 | |
| dc.rights.note | Fee-based license | |
| dc.date.accepted | 2017-08-17 | |
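| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Electrical Engineering | zh_TW |
| Appears in Collections: | Department of Electrical Engineering | |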
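
The approach summarized in the abstract (cluster the data first, then apply a statistical test to detect drift without labels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not name the statistical test, so a two-sample Kolmogorov-Smirnov test with a large-sample critical value stands in for it; the k-means initialization, the Bonferroni correction, and every function and parameter name (`detect_drift`, `min_size`, and so on) are hypothetical; and the PCA feature-extraction step is left out to keep the sketch short.

```python
import math


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance past every value equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical(n, m, alpha):
    """Large-sample critical value for the two-sample KS statistic."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) * math.sqrt((n + m) / (n * m))


def sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-first initialization (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids


def detect_drift(reference, current, k=2, alpha=0.01, min_size=5):
    """Return True if any cluster/feature pair differs between the two windows."""
    centroids = kmeans(reference, k)
    dims = len(reference[0])
    alpha_each = alpha / (k * dims)  # Bonferroni correction over all tests
    for c in range(k):
        ref_c = [p for p in reference if nearest(p, centroids) == c]
        cur_c = [p for p in current if nearest(p, centroids) == c]
        if len(ref_c) < min_size or len(cur_c) < min_size:
            continue  # too few points in this cluster for a meaningful test
        crit = ks_critical(len(ref_c), len(cur_c), alpha_each)
        for d in range(dims):
            stat = ks_statistic([p[d] for p in ref_c], [p[d] for p in cur_c])
            if stat > crit:
                return True
    return False


# Two well-separated synthetic clusters; the second one moves in the drifted window.
grid = [(0.1 * i, 0.1 * j) for i in range(10) for j in range(10)]
reference = grid + [(x + 5.0, y + 5.0) for x, y in grid]
shifted = grid + [(x + 5.0, y + 8.0) for x, y in grid]
print(detect_drift(reference, list(reference)))  # → False
print(detect_drift(reference, shifted))          # → True
```

One limitation of this sketch is that a cluster which empties out entirely in the current window is skipped rather than flagged, so a practical detector would also need to compare cluster sizes between windows.
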
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-106-1.pdf (not authorized for public access) | 873.3 kB | Adobe PDF | |
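Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.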
