Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90355
Title: | 以疑似物分析結合預測模型評估個人於環境汙染物之暴露 Integrating Suspect Screening Analysis with Response Modeling to Assess Personal Exposure to a Wide Range of Chemical Mixtures |
Authors: | 陳禾翰 Ho-Han Chen |
Advisor: | 蔡詩偉 Shih-Wei Tsai |
Keyword: | suspect screening analysis, response modeling, random forest, personal exposure, cross-validation |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | In environmental and occupational health sciences, targeted analysis has traditionally been used to identify specific compounds in collected samples. Real-life exposure, however, usually involves several chemical mixtures at once rather than a single chemical. Suspect screening analysis (SSA) was developed to address this limitation by detecting a broader range of compounds, but its major drawback is that it does not provide quantitative information. In recent years, researchers have therefore begun to develop response modeling methods, based on the physicochemical properties of compounds, to compensate for this drawback.
This study aimed to develop a predictive model based on the physicochemical properties of chemicals while reducing prediction error. A library of 259 chemicals commonly found in the environment (including phthalates, pesticides, fragrances, polycyclic aromatic hydrocarbons, and UV filters) was established. A subset of 53 chemicals from this library was selected for modeling, and these chemicals were randomly assigned to either a training set or a test set. A gas chromatography-mass spectrometry (GC-MS) analytical method was developed to analyze these compounds. Several physicochemical parameters were selected as candidate independent variables, and the gas chromatography-tandem mass spectrometry (GC-MS/MS) responses from triplicate injections served as the dependent variable. Most chemicals in the modeling set exhibited good linearity (R² > 0.98). Criteria for instrument stability were also established.
Multiple linear regression, ridge regression, lasso regression, elastic net regression, and random forest regression were used for modeling, with 5-fold cross-validation for model tuning and evaluation. Ultimately, a random forest regression model (R² = 0.71) was selected and developed, using boiling point, molecular weight, polarizability, vapor pressure, collision cross-section, LogKow, LogKoa, and Henry's law constant as predictors of chemical response. The mean prediction error of the model was 1.19, which, to the best of our knowledge, is better than the results of any similar study reported previously.
The developed method was applied to the analysis of silicone wristbands: a total of 22 chemicals were detected and their concentrations were estimated. For example, camphor was detected in one sample at an estimated concentration of 4.33 μg/g wristband. With the prediction model established in this study, qualitative and quantitative information on suspect chemicals can be provided simultaneously in future analyses of environmental samples. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90355 |
DOI: | 10.6342/NTU202302456 |
Fulltext Rights: | Not authorized |
Appears in Collections: | 環境與職業健康科學研究所 (Institute of Environmental and Occupational Health Sciences) |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf Restricted Access | 1.26 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
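
The abstract describes tuning and evaluating regression models with 5-fold cross-validation. The following is a minimal sketch of that evaluation pattern only, not the thesis's implementation: it assumes a single synthetic predictor and a one-dimensional ridge regression in place of the eight physicochemical predictors and the random forest model, and every function name, the penalty grid, and the data are hypothetical. Only the sample size (53) and the number of folds (5) come from the abstract.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_ridge_1d(xs, ys, alpha):
    """Fit y = a + b*x with an L2 penalty alpha on the slope only."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / (sxx + alpha)
    return y_mean - slope * x_mean, slope


def cross_val_r2(xs, ys, alpha, k=5, seed=0):
    """R^2 computed from pooled out-of-fold predictions."""
    preds = [0.0] * len(xs)
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_ridge_1d([xs[i] for i in train],
                            [ys[i] for i in train], alpha)
        for i in held_out:
            preds[i] = a + b * xs[i]
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption): 53 "chemicals", one predictor.
rng = random.Random(42)
xs = [rng.uniform(0.0, 10.0) for _ in range(53)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

# Hypothetical penalty grid; the best value is chosen by cross-validated R^2.
alphas = [0.0, 1.0, 10.0, 100.0]
scores = {alpha: cross_val_r2(xs, ys, alpha) for alpha in alphas}
best_alpha = max(scores, key=scores.get)
print(best_alpha, round(scores[best_alpha], 3))
```

Pooling the out-of-fold predictions before computing R² means that every chemical is scored by a model that never saw it during fitting, which is the property that makes cross-validation usable for both tuning and evaluation on a small data set.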