Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8443
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 鄭克聲(Ke-Sheng Cheng) | |
dc.contributor.author | Hway-Min Chou | en |
dc.contributor.author | 周卉敏 | zh_TW |
dc.date.accessioned | 2021-05-20T00:54:34Z | - |
dc.date.available | 2021-08-02 | |
dc.date.available | 2021-05-20T00:54:34Z | - |
dc.date.copyright | 2020-08-05 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-08-03 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8443 | - |
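dc.description.abstract | Over the past decade, data quality assurance has received increasing attention in hydrology. Only good-quality rainfall data can ensure reliable results when the data are used for risk analysis and decision management in hydrological applications. Taiwan's Central Weather Bureau manages an automated rain gauge network of more than 600 weather stations that provides real-time rainfall observations every day. Occasionally the rainfall observed at one station is markedly higher or lower than that observed at other nearby stations. Because rainfall at neighboring stations is usually highly correlated, this may indicate that anomalies are present in the observations. To control the quality of rainfall data, these anomalies must be identified. So far, however, there are no clear criteria for identifying them effectively. In this study, we use statistical methods to build an automated anomaly detection system for hourly rainfall. First, we divide the collected hourly rainfall data into four groups according to the rainy seasons of the four common rainfall types in Taiwan. Next, we use K-Means clustering to group the rain gauge stations under study by geographical location and rainfall characteristics. We then perform principal component analysis on each cluster for each rainfall type, compute the first few principal components, and construct an index representing the degree of abnormality of the rainfall data. Once a station's rainfall observation meets our index-based definition of an anomaly, we can immediately identify the station where the anomaly may have occurred. Finally, we built the automated outlier detection system and presented it as an online interactive web page. The aim of this study is to establish a reliable anomaly detection system for hourly rainfall observations, so that stations with possible anomalies can be screened effectively and quality control of hourly rainfall data can be achieved. | zh_TW |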
dc.description.abstract | Data quality assurance has received increasing attention in hydrology over the last decade. Only high-quality data can support reliable data-driven risk analysis and decision-making in hydrological applications. In Taiwan, the Central Weather Bureau manages an automated rain gauge network of over 600 stations to obtain real-time precipitation observations. Occasionally, rainfall observations at one station are markedly higher or lower than those at nearby stations. Because rainfall observations at neighboring stations are often highly correlated, such discrepancies suggest the presence of anomalies. To obtain reliable results from hourly rainfall data, these anomalies should be identified in advance. However, definite criteria for effectively identifying anomalies are lacking. In this study, we established an automated anomaly detection system for precipitation observations. First, we categorized the data into four groups according to the four fundamental storm types in Taiwan (frontal rain, Meiyu, convective storms, and typhoons). Second, we adopted K-means clustering analysis to classify all rain gauge stations of interest by their geographical location and rainfall characteristics. Third, for each cluster, we conducted principal component analysis and used the first few principal components to construct an index representing the extent of anomalies. Once the criteria for this index are determined, identifying anomalies is straightforward. Finally, we implemented the detection system and presented it as an online interactive web page. The result is a dependable anomaly detection system that screens out possible anomalies to support quality control of hourly rainfall data. | en |
dc.description.provenance | Made available in DSpace on 2021-05-20T00:54:34Z (GMT). No. of bitstreams: 1 U0001-1607202017300100.pdf: 3337868 bytes, checksum: 15ddbcbe38fa49de79b55198fe3d77be (MD5) Previous issue date: 2020 | en |
dc.description.tableofcontents | 1. Introduction; 2. Data; 3. Methods; 3-1. Abnormal Situations; 3-2. Cutoff Point; 3-3. K-Means Clustering Analysis; 3-4. PCA; 4. Results; 4-1. Anomalies Discovered Using the Cutoff Point Method; 4-2. Results of the K-Means Clustering Method; 4-3. Possible Anomalies Identified Using PCA; 4-3-1. Detected Anomaly at Station Donghe on March 27, 2020; 4-3-2. Criteria for Anomaly Detection and Nine Categories of Anomalies; 4-3-3. The Automated Anomaly Detection System; Conclusion; References; Appendix; A. Table of 297 Stations; B. Table of Warning Typhoons from 1998 to 2019; C. Identified Hourly Rainfall Anomalies Caused by Malfunction | |
dc.language.iso | en | |
dc.title | 離群值自動檢測系統應用於時雨量資料品管 | zh_TW |
dc.title | An Automated Outlier Detection System for Hourly Rainfall Data Quality Control | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 葉惠中(Hui-Chung Yeh), 溫在弘(Tzai-Hung Wen), 王藝峰(Yi-feng Wang), 黃文政(Wen-Cheng Huang) | |
dc.subject.keyword | Hourly Rainfall, Data Quality Control, Outlier Detection, Principal Component Analysis, K-Means Clustering | zh_TW |
dc.subject.keyword | Hourly Precipitation, Data Quality Control, Anomaly Detection, Principal Component Analysis, K-Means Clustering Analysis | en |
dc.relation.page | 88 | |
dc.identifier.doi | 10.6342/NTU202001579 | |
dc.rights.note | Authorization granted (open access worldwide) | |
dc.date.accepted | 2020-08-04 | |
dc.contributor.author-college | Center for General Education | zh_TW |
dc.contributor.author-dept | Master's Program in Statistics | zh_TW |
Appears in Collections: | Master's Program in Statistics |
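
The abstract describes building an anomaly index from the first few principal components of each station cluster, but this record does not define the index itself. The Python sketch below is therefore only an illustration under stated assumptions: the stations are taken to be already grouped into one cluster (the K-means step is skipped), only the first principal component is used, and the index is assumed to be the per-station residual between an observation and its reconstruction from that component. The function names, the toy `history` values, and the residual-based index are all hypothetical and are not taken from the thesis.

```python
import math

def center(rows):
    """Subtract each station's mean; return the centered rows and the means."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows], means

def covariance(centered):
    """Sample covariance matrix between stations (columns)."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iters=500):
    """Leading eigenvector of cov (unit length), found by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def residuals(obs, means, pc):
    """Per-station gap between an observation and its rank-1 PCA reconstruction.

    Assumed stand-in for the anomaly index; the thesis may define it differently.
    """
    x = [o - m for o, m in zip(obs, means)]
    score = sum(a * b for a, b in zip(x, pc))
    return [a - score * b for a, b in zip(x, pc)]

# Hypothetical hourly rainfall (mm): rows are hours, columns are four
# neighboring stations assumed to belong to the same cluster.
history = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.2, 0.9, 1.1],
    [5.0, 5.5, 4.8, 5.2],
    [10.0, 11.0, 9.5, 10.4],
    [20.0, 21.5, 19.0, 20.6],
    [2.0, 2.1, 1.9, 2.2],
]

centered, means = center(history)
pc = first_component(covariance(centered))

# A new hour in which the third station reports no rain while its neighbors do.
new_obs = [8.0, 8.6, 0.0, 8.3]
res = residuals(new_obs, means, pc)
suspect = max(range(len(res)), key=lambda j: abs(res[j]))
print("residuals:", [round(r, 2) for r in res])
print("most suspicious station index:", suspect)
```

In this toy input the third station reads 0.0 mm while its neighbors report about 8 mm, so its residual should dominate the others. A real system would also need a cutoff on the index, which this record mentions but does not specify.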
Files in This Item:
File | Size | Format | |
---|---|---|---|
U0001-1607202017300100.pdf | 3.26 MB | Adobe PDF | View/Open |
All items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.