Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99931
| Title: | 利用極限梯度提升模型和多層感知器模型進行PM2.5及其離子成分預測 Using Extreme Gradient Boosting Models and Multilayer Perceptron Models to Predict PM2.5 and Its Ion Composition |
| Authors: | 洪于雅 Yu-Ya Hung |
| Advisor: | 吳章甫 Chang-Fu Wu |
| Keyword: | Extreme Gradient Boosting, Multilayer Perceptron, PM2.5, ionic composition, spatial variation |
| Publication Year : | 2025 |
| Degree: | Master's |
| Abstract: | Exposure to fine particulate matter (PM2.5) has been widely shown to harm human health, and the magnitude of the effect varies with the particles' chemical composition. In health risk assessment, knowing the concentrations of PM2.5 components, especially its ionic species, is therefore crucial for a comprehensive understanding of the potential threat that air pollution poses to human health. Monitoring data for ionic components are difficult to obtain, however, which poses challenges for related research.
This study uses machine learning methods to predict the concentrations of PM2.5 and its major ionic components. Concentrations of PM2.5 and five major ions, potassium (K⁺), sodium (Na⁺), ammonium (NH₄⁺), nitrate (NO₃⁻), and sulfate (SO₄²⁻), were collected from 28 Ministry of Environment air quality monitoring stations in Taiwan between 2019 and 2022. Temporal, land use, meteorological observation, and satellite remote sensing data for each station were integrated to build a complete dataset. Models for PM2.5 and each ion were built with Extreme Gradient Boosting (XGBoost) and Multilayer Perceptron (MLP), constructed and tested according to practical application scenarios, and the predictive performance of the two models was compared. The models were then used to predict and map concentrations across Taiwan, and the predictions were used to calculate each ion's share of PM2.5 and the ratios between ions for use in related research.
Sampling results show significant seasonal variation in PM2.5 and its ionic components, with the lowest concentrations in summer and the highest in winter. Both the XGBoost and MLP models showed a reasonable degree of predictive accuracy and generalization across space and time, with MLP performing slightly better overall. External testing showed the best prediction performance for NO₃⁻ and PM2.5 and poorer performance for K⁺ and Na⁺. Variable importance analysis indicated that both models relied mainly on meteorological features; XGBoost drew second most on satellite features, whereas MLP favored land use features, possibly reflecting the advantage of deep learning architectures in learning higher-level spatial features. The major ions NH₄⁺, NO₃⁻, and SO₄²⁻ accounted for about 40–45% of PM2.5 mass concentration. Maps drawn from the island-wide predictions reflect the spatial distribution of concentrations, but prediction accuracy is lower in mountainous areas because monitoring data are lacking there, so these areas were excluded from the analysis. Machine learning models can effectively predict the concentrations of PM2.5 and its major ionic components and support component analysis and investigation of pollutant sources, with potential for further application in air pollution monitoring and health risk assessment. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99931 |
| DOI: | 10.6342/NTU202504257 |
| Fulltext Rights: | Authorized (campus access only) |
| Embargo Lift Date: | 2028-08-08 |
| Appears in Collections: | 環境與職業健康科學研究所 (Institute of Environmental and Occupational Health Sciences) |
Files in This Item:
| File | Access | Size | Format |
|---|---|---|---|
| ntu-113-2.pdf | Restricted Access | 10.08 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
