An Automated Outlier Detection System for Hourly Rainfall Data Quality Control

Hway-Min Chou; 周卉敏

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8443

Title:	An Automated Outlier Detection System for Hourly Rainfall Data Quality Control (離群值自動檢測系統應用於時雨量資料品管)
Author:	Hway-Min Chou (周卉敏)
Advisor:	Ke-Sheng Cheng (鄭克聲)
Keywords:	Hourly Precipitation, Data Quality Control, Anomaly Detection, Principal Component Analysis, K-Means Clustering Analysis
Year of publication:	2020
Degree:	Master's
Abstract:	Data quality assurance has received increasing attention in hydrology over the past decade. Only high-quality rainfall data can ensure reliable results when the data are used for risk analysis and decision-making in hydrological applications. In Taiwan, the Central Weather Bureau manages an automated rain gauge network of more than 600 stations that provides real-time precipitation observations every day. Occasionally, the rainfall observed at one station is markedly higher or lower than that at nearby stations. Because rainfall observations at neighboring stations are often highly correlated, such a discrepancy suggests that the observations may contain anomalies. To control the quality of the rainfall data, these anomalies must be identified, but so far there has been a lack of definite criteria for identifying them effectively. In this study, we used statistical methods to establish an automated anomaly detection system for hourly rainfall. First, we divided the collected hourly rainfall data into four groups according to the rainy seasons of the four common storm types in Taiwan (frontal rain, Meiyu, convective storms, and typhoons). Second, we used K-means clustering to group the rain gauge stations of interest by their geographical location and rainfall characteristics. Third, for each cluster within each storm type, we conducted principal component analysis, computed the first few principal components, and constructed an index representing the degree of anomaly in the rainfall data. Once a station's observation meets the anomaly criterion defined on this index, the station where the anomaly may have occurred can be identified immediately. Finally, we built the automated outlier detection system and presented it as an online interactive web page. The aim of this study is to provide a dependable anomaly detection system for hourly rainfall observations that effectively screens out stations with possible anomalies, thereby achieving quality control of hourly rainfall data.
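The cluster-then-PCA pipeline described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the data are synthetic, stations are clustered on coordinates only (the thesis also uses rainfall characteristics and splits the data by storm type first), the k-means uses a simple farthest-point initialization, and the anomaly index is taken to be the absolute PCA reconstruction residual, since the record does not state how the index is actually defined. The function names `kmeans`, `fit_pca`, and `anomaly_index` are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iter=50):
    """Lloyd's k-means with farthest-point initialization (an illustrative choice)."""
    centers = [features[0]]
    for _ in range(k - 1):
        # Next center: the point farthest from all centers chosen so far.
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.array(centers, dtype=float)

    def assign(c):
        # Distance from every row to every center, then nearest center per row.
        return np.linalg.norm(features[:, None, :] - c[None, :, :], axis=2).argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center where it is
                centers[j] = features[labels == j].mean(axis=0)
    return assign(centers)


def fit_pca(train, n_components):
    """Fit PCA on standardized reference rainfall of shape (n_hours, n_stations)."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against stations with constant readings
    z = (train - mean) / std
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return mean, std, vt[:n_components]


def anomaly_index(obs, mean, std, basis):
    """Assumed index: absolute residual after projecting onto the retained components."""
    z = (obs - mean) / std
    reconstructed = z @ basis.T @ basis
    return np.abs(z - reconstructed)


# Synthetic demo: two groups of five stations, each driven by its own rain signal.
rng = np.random.default_rng(42)
coords = np.vstack([rng.normal(0.0, 0.3, (5, 2)), rng.normal(10.0, 0.3, (5, 2))])
signal = rng.gamma(2.0, 2.0, size=(500, 2))  # hourly rain signal per group
group = np.repeat([0, 1], 5)
rain = np.clip(signal[:, group] + rng.normal(0.0, 0.1, (500, 10)), 0.0, None)

labels = kmeans(coords, k=2)
members = np.flatnonzero(labels == labels[0])  # stations clustered with station 0
mean, std, basis = fit_pca(rain[:, members], n_components=1)

new_obs = np.full(len(members), 5.0)  # a plausible hour: every station reads 5 mm
new_obs[2] += 20.0                    # inject a spike at one station
scores = anomaly_index(new_obs, mean, std, basis)
suspect = members[scores.argmax()]    # station with the largest residual
```

In this sketch the PCA is fitted on reference data and new observations are scored against it, so that a single extreme reading does not distort the principal components it is compared with. A threshold on `scores` would play the role of the anomaly criterion; how that threshold is chosen is left open here.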
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8443
DOI:	10.6342/NTU202001579
Full-text authorization:	Authorized (open access worldwide)
Appears in collections:	Master's Program in Statistics

Files in this item:

File	Size	Format
U0001-1607202017300100.pdf	3.26 MB	Adobe PDF

