To cite this document, please use this Handle URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90333
Title: Identification of genetic variants associated with the 28-day survival in Taiwanese sepsis patients
Author: Pei-Hsuan Wu (吳珮瑄)
Advisor: Tzu-Pin Lu (盧子彬)
Keywords: Sepsis, 28-day mortality, Genome-wide association study, Survival analysis, Polygenic risk score
Publication year: 2023
Degree: Master's
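Abstract (translated from Chinese):
Background:
Sepsis is a syndrome in which, after a pathogen enters the body, the immune system overreacts and cannot regulate itself, leading to organ failure and death. Earlier studies reported that sepsis incidence and mortality in Taiwan rise with age. Clinicians currently rely on the Sequential Organ Failure Assessment score or blood test values to guide treatment strategy; however, heterogeneity among sepsis patients is high, and patients can have a poor prognosis even under the same treatment. Previous genome-wide association studies (GWAS) reported that single-nucleotide polymorphisms (SNPs) located in genes regulating immune function are associated with the risk of death in sepsis patients. Because most of these studies focused on non-Asian populations, the genes reported to be associated with sepsis mortality have not been validated in Asian populations. This study therefore analyzed data from Taiwanese sepsis patients, with three main objectives. The first was to identify loci significantly associated with 28-day survival in sepsis. The second, intended to shorten genetic analysis time in clinical practice, was to estimate a polygenic risk score (PRS) from no more than 100 loci and test its effect on model predictive performance. The third was to use the PRS to divide sepsis patients into high and low mortality-risk groups. This approach is expected to speed up identification of patients at high risk of 28-day death and allow earlier therapeutic intervention.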
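Materials and methods:
From January 2017 to December 2021, 3,847 patients with a first diagnosis of sepsis were collected at Taichung Veterans General Hospital. Cases with missing genetic or clinical data were excluded; after quality control, 373 sepsis patients and 6,865,736 loci were included. The primary endpoint was death within 28 days of sepsis diagnosis. Patients were grouped by survival time: 39 (10.5%) died within 28 days, and 334 (89.5%) had no observed death event. Two genome-wide association analyses were carried out, differing in the form of the dependent variable. First, Cox regression was used to identify genetic variants associated with 28-day survival time. Second, the continuous survival time was dichotomized at 28 days, and logistic regression was used to identify genetic variants associated with 28-day mortality risk. Six previous studies in non-Asian populations were also collected to check whether their reported significant loci could be validated in the Taiwanese population. Next, 80% of the 373 patients (n=299) were randomly sampled as the base dataset for GWAS. Following the C + T method (clumping and p-value thresholding), the number of risk alleles each individual carried at each locus was weighted by the effect size estimated from Cox regression or logistic regression to estimate a polygenic risk score, called PRS-sepsis. A second polygenic risk score, called PRS-sepsis-pre, was estimated from the significant loci that the six non-Asian studies reported as associated with sepsis onset or sepsis mortality. Predictive performance was assessed by five-fold cross-validation. First, a baseline model was built with the PRS variable, adjusting for age, sex, and principal components. Next, clinical features that were significant in univariate Cox regression were added to build the multivariate model. Finally, to reduce the complexity of the multivariate model, least absolute shrinkage and selection operator (Lasso) regression and stepwise regression were used to select suitable clinical variables for the final model. All models that included a PRS were compared with traditional models containing only clinical variables, to evaluate the effect of adding genetic variables on predictive performance. Models built with Cox regression were evaluated by the concordance index; models built with logistic regression were evaluated by the area under the ROC curve (AUC). Patients were also stratified by 28-day mortality risk according to PRS-sepsis, and Kaplan-Meier analysis was used to test whether the high-risk and low-risk groups differed significantly.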
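The PRS described above is a weighted sum: each locus contributes the number of risk alleles a patient carries (0, 1, or 2) multiplied by that locus's estimated effect size. The following is a minimal sketch of that arithmetic only. The function name `polygenic_risk_score`, the three loci, and every number in it are hypothetical illustrations, not values or code from the thesis, and the clumping and thresholding steps that select the loci are not shown.

```python
def polygenic_risk_score(risk_allele_counts, effect_sizes):
    """Return sum(count_j * beta_j) over the selected loci for one patient.

    risk_allele_counts: risk alleles carried at each locus (0, 1, or 2).
    effect_sizes: per-locus effect sizes, e.g. log hazard ratios from a
                  Cox GWAS or log odds ratios from a logistic GWAS.
    """
    if len(risk_allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is required per locus")
    return sum(c * b for c, b in zip(risk_allele_counts, effect_sizes))


# Hypothetical effect sizes for three loci (made-up numbers).
toy_effect_sizes = [0.5, -0.25, 1.0]

# Hypothetical risk-allele counts for three patients.
toy_patients = {
    "patient_a": [0, 1, 2],
    "patient_b": [2, 0, 1],
    "patient_c": [1, 1, 0],
}

toy_scores = {
    name: polygenic_risk_score(counts, toy_effect_sizes)
    for name, counts in toy_patients.items()
}
```

In practice the effect sizes would come from the GWAS on the base dataset and the scores would be computed for held-out patients, so that the same individuals are not used both to estimate and to evaluate the score.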
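Results:
In the GWAS using Cox regression adjusted for age, sex, and principal components, 4 loci were significant (p < 5×10⁻⁸). Previous studies in non-Asian populations had reported 134 loci associated with sepsis onset or sepsis mortality; 64 of them could be tested in this study sample, but none reached statistical significance. C + T selection retained 86 SNPs for estimating PRS-sepsis. PRS-sepsis-pre was estimated from the 64 loci reported in the literature, and its applicability to Taiwanese sepsis patients was examined. The multivariate model that included PRS-sepsis had a significantly higher C-index than the traditional model. Lasso regression and stepwise regression were then used to select potentially important clinical variables from the multivariate model and build the final models. Compared with the traditional model, their C-index rose significantly by 5.85% and 9.3% in the training set (p < 0.05) and by 15.18% and 15.23% in the testing set (p < 0.05). When PRS-sepsis-pre was added together with the clinical variables selected by Lasso regression and stepwise regression, respectively, the C-index of the final models rose by 1.57% and 1.92% in the training set and by 3.04% and 0.33% in the testing set relative to the traditional model, but these differences were not significant (p > 0.05). PRS-sepsis scores were sorted from low to high with a cut-off at the 67th percentile: the 33% of patients with the highest scores (the top one-third) formed the high-score group and the rest formed the low-score group. Survival differed significantly between the two groups (p < 0.0001). In the GWAS using logistic regression adjusted for age, sex, and principal components, no locus reached significance, but 19 loci were suggestively associated (p < 1×10⁻⁵). None of the 134 significant loci from previous non-Asian studies was significantly associated in the Taiwanese sepsis population. However, 59 SNPs overlapped with loci in the Taiwanese sepsis patients, 58 of which were also present in the base dataset, and PRS-sepsis-pre was estimated from these SNPs. C + T selection retained 94 SNPs for estimating PRS-sepsis. When PRS-sepsis was added to the clinical variables retained by Lasso regression and stepwise regression to build the final models, AUC increased by 11.5% and 11.7%, respectively, over the traditional model, but neither increase was statistically significant (p > 0.05). When PRS-sepsis-pre was added to the clinical variables retained by Lasso regression, the AUC of the final model was the same as that of the traditional model; when the clinical variables were selected by stepwise regression, the AUC of the final model was 2.1% higher than that of the traditional model, but not significantly so (p > 0.05).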
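The risk stratification reported above sorts patients by PRS and places the top one-third in the high group. The sketch below shows one way to make that split. The helper `split_by_prs`, the rank-based handling of the cut-off, and the toy scores are assumptions for illustration; the thesis does not state how ties or the exact percentile were computed.

```python
def split_by_prs(scores, low_fraction=0.67):
    """Label each patient 'low' or 'high' by rank of their PRS.

    The lowest `low_fraction` of patients (rounded down) form the low
    group; everyone ranked above them forms the high group. Ties are
    broken by input order, which is an arbitrary choice of this sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_low = int(low_fraction * len(scores))
    groups = ["low"] * len(scores)
    for i in order[n_low:]:
        groups[i] = "high"
    return groups


# Hypothetical PRS values for six patients (made-up numbers).
toy_prs = [0.2, 1.5, -0.3, 0.9, 2.1, 0.0]
toy_groups = split_by_prs(toy_prs)
```

With six patients, the four lowest scores fall in the low group and the two highest in the high group. The group labels would then be passed to a survival comparison such as a Kaplan-Meier analysis with a log-rank test, which is not shown here.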
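Conclusion:
Sepsis is a complex syndrome; patients may present with several symptoms at once, and without timely and appropriate treatment, multiple organ failure can follow and lead to death. Previous studies indicate that clinical features, host-pathogen interactions, treatment, and specific genes contribute to differences in prognosis among sepsis patients. Using GWAS, this study observed 4 loci significantly associated with survival time. The significant loci reported by previous studies in non-Asian populations were also analyzed, but none was significantly associated with survival time in Taiwanese sepsis patients, and the 4 significant loci found in this study population did not overlap with them. According to the 1000 Genomes Project, 3 of these 4 loci have a minor allele frequency of 0 in American, African, European, and South Asian populations, suggesting that these loci may occur only in sepsis patients from Asia. Models that included genetic variables predicted better than traditional models built from clinical variables alone. When patients were divided into high-score and low-score groups by polygenic risk score, the two groups differed significantly in mortality risk. Patients diagnosed with sepsis could therefore be stratified by mortality risk immediately through targeted genetic testing, helping clinicians choose an appropriate treatment strategy and improving patient outcomes.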
Background:
Sepsis occurs when the immune system mounts an extreme response to infection by pathogens and cannot regulate itself normally, leading to organ dysfunction and even death. A previous study indicated that the incidence and mortality of sepsis in Taiwan increase with age. In the clinic, the Sequential Organ Failure Assessment (SOFA) score and/or blood test results are used to guide treatment decisions. However, because sepsis patients are highly heterogeneous, patients may deteriorate even when given the same treatment. A series of genome-wide association studies (GWAS) have identified relationships between single-nucleotide polymorphisms (SNPs) and sepsis mortality. These SNPs were located in genes that regulate immune-related functions. However, most of these studies focused on non-Asian populations, and the reported SNPs have not been validated in Asian sepsis patients. This study therefore used data from Taiwanese sepsis patients and had three main objectives. First, we aimed to identify SNPs significantly associated with 28-day survival in sepsis patients. Second, to shorten the time needed for genetic analysis in clinical settings, we limited the number of SNPs used to calculate the polygenic risk score (PRS) to fewer than 100, and constructed prediction models to evaluate the impact of including the PRS. Third, we stratified patients by PRS into high-risk and low-risk groups for 28-day mortality. This approach has the potential to identify sepsis patients at high risk of 28-day mortality early and to facilitate timely intervention.
Methods and materials:
A cohort of 3,847 patients with a first diagnosis of sepsis at Taichung Veterans General Hospital from January 2017 to December 2021 was included. Patients without genotype data or clinical data were excluded. After quality control, 373 sepsis patients and 6,865,736 SNPs remained. The primary endpoint of the study was death due to sepsis within a 28-day follow-up period. Classified by survival time, 39 patients (10.5%) had an event and the remaining 334 (89.5%) had no observed event. The GWAS was conducted in two ways, depending on the type of the dependent variable. First, the GWAS was performed using Cox regression to identify SNPs associated with survival time. Second, the continuous survival time was converted to a binary variable (1, 0) based on death within the 28-day follow-up period, and logistic regression was used in the GWAS to identify SNPs significantly associated with sepsis mortality. Six studies that reported significant SNPs in non-Asian populations were selected for validation in the Taiwanese sepsis patients. A random sample of 80% of the patients (n=299) served as the base dataset, on which the GWAS was performed to obtain the effect sizes of the SNPs. After SNPs were selected with the C + T (clumping and thresholding) method, each individual's PRS, called PRS-sepsis, was estimated as the number of risk alleles at each variant weighted by its effect size. Separately, SNPs previously identified as significantly associated with sepsis incidence or mortality were combined to create a second polygenic risk score, PRS-sepsis-pre. Five-fold cross-validation was used to verify the predictive power of the models. The baseline model was constructed by adding the PRS and adjusting for age, sex, and principal components (PCs). Subsequently, the multivariate model was built by adding the clinical variables selected through univariate regression to the baseline model. Finally, to reduce the complexity of the multivariate model, least absolute shrinkage and selection operator (Lasso) regression and stepwise regression were used to select the appropriate clinical variables and construct the final model. The performance of every model that incorporated a PRS was compared against that of the traditional models, which included only clinical variables, to assess the improvement gained from genetic predictors. Predictive power was measured by the concordance index (C-index) for models built with Cox regression and by the area under the receiver operating characteristic curve (AUC) for models built with logistic regression. Kaplan-Meier analysis, stratified by PRS-sepsis, was conducted on high-risk and low-risk individuals to evaluate whether 28-day survival differed between the two groups.
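The concordance index used to evaluate the Cox models can be illustrated with a small, dependency-free sketch of Harrell's C. This is a plain pairwise implementation under simple assumptions (pairs with tied survival times are skipped; ties in risk score count as half). The function name `concordance_index` and the toy data are hypothetical, and the thesis may have used a different estimator or software package.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event
    and a strictly shorter follow-up time than patient j. The pair is
    concordant when patient i also has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data: two deaths, two patients censored at day 28.
toy_times = [5, 12, 28, 28]
toy_events = [1, 1, 0, 0]
toy_risk = [2.0, 0.5, 1.0, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

In this toy data there are five comparable pairs and four are ranked correctly, so the C-index is 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.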
Results:
Using Cox regression adjusted for age, sex, and PCs, a total of 4 SNPs were associated with survival time at genome-wide significance (p < 5×10⁻⁸). Previous studies identified 134 significant (p < 1×10⁻⁵) SNPs, 64 of which were observed in Taiwanese sepsis patients. However, none were significantly associated with 28-day survival. The C + T thresholds allowed a total of 86 SNPs to be aggregated for PRS-sepsis construction. In addition, the 64 significant SNPs from previous studies were pooled to create PRS-sepsis-pre, which was tested for applicability to sepsis patients from Taiwan. Including PRS-sepsis in the multivariate model significantly improved the C-index compared to the traditional model. The final models were constructed after applying Lasso regression and stepwise regression to select potentially important clinical variables in the multivariate model. Compared to the traditional model, the C-index increased significantly by 5.85% and 9.3%, respectively, in the training set (p < 0.05) and by 15.18% and 15.23% in the testing set (p < 0.05). Incorporating PRS-sepsis-pre and the clinical variables chosen by Lasso regression and stepwise regression into the final model increased the C-index by 1.57% and 1.92% in the training set and by 3.04% and 0.33% in the testing set. However, none of these final models differed significantly in predictive power from the traditional models (p > 0.05). For each one-standard-deviation increase in PRS-sepsis, the probability of death increased by nearly 4% after accounting for clinical features. With PRS-sepsis values sorted in ascending order, patients in the top 33% (the highest one-third) were classified as the high PRS group, and the remaining patients were classified as the low PRS group. The hazard functions were significantly different between the two groups (p < 0.0001). Although no SNP reached genome-wide significance using logistic regression, 11 SNPs were suggestively associated with sepsis mortality after adjusting for age, sex, and PCs (p < 1×10⁻⁵). None of the 134 SNPs deemed significant in prior studies showed significance in sepsis patients in Taiwan. Of these, only 59 SNPs were observed in Taiwanese sepsis patients, and 58 of them were present in the base dataset and used to generate PRS-sepsis-pre. Applying a C + T threshold, 94 SNPs were selected and aggregated as PRS-sepsis. When PRS-sepsis was added to the model along with the clinical variables retained through Lasso regression and stepwise regression, the AUC of the final model improved by 11.5% and 11.7%, respectively; however, these improvements were not statistically significant compared to the traditional models (p > 0.05). When PRS-sepsis-pre was incorporated into the final model along with the clinical variables selected via Lasso regression, the AUC of the final model remained the same as that of the traditional model. When the clinical variables were instead retained using stepwise regression, the AUC of the final model improved by 2.1% compared to the traditional model, but this improvement was not statistically significant (p > 0.05).
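The AUC figures for the logistic models can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes AUC directly from that pairwise definition. The function name `roc_auc` and the toy labels and predictions are hypothetical, and this is not the software used in the thesis.

```python
def roc_auc(labels, scores):
    """AUC as the share of (death, survivor) pairs ranked correctly.

    labels: 1 for death within 28 days, 0 otherwise.
    scores: predicted risk from any model; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes and predicted risks for five patients.
toy_labels = [1, 0, 1, 0, 0]
toy_predictions = [0.9, 0.4, 0.35, 0.2, 0.35]
toy_auc = roc_auc(toy_labels, toy_predictions)
```

Here 4.5 of the 6 death-survivor pairs are ranked correctly, giving an AUC of 0.75. Comparing two models, such as one with and one without a PRS, would mean computing this quantity for each on the same held-out patients.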
Conclusion:
Sepsis is a syndrome with diverse symptoms. Patients with sepsis can develop organ dysfunction and die if they do not receive appropriate treatment. A prior study reported that heterogeneity in sepsis is related to clinical features, host-pathogen interactions, treatments, and/or genetic predisposition, leading to differences in prognosis among patients. In this retrospective study, we identified 4 SNPs significantly associated with 28-day survival. None of the significant SNPs from previous studies were replicated among sepsis patients in Taiwan, and the four SNPs identified in our data did not overlap with the SNPs reported in previous studies. Moreover, in the 1000 Genomes phase 3 reference panel, the minor allele frequencies of three of the four SNPs were zero in the American, African, European, and South Asian populations. These significant SNPs may therefore appear only in Asian populations. Compared with the traditional model, adding polygenic risk scores improved the predictive power of the models. The high-PRS and low-PRS subgroups differed clearly in the risk of sepsis mortality. In summary, testing for these significant variants when sepsis is diagnosed can help clinicians identify the subgroup at risk and guide therapy, thereby improving outcomes.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90333
DOI: 10.6342/NTU202301905
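Full-text authorization: Authorized (open access worldwide)
Full-text public release date: 2025-09-01
Appears in collections: Institute of Epidemiology and Preventive Medicine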

Files in this item:
File: ntu-111-2.pdf (available online after 2025-09-01)
Size: 3.16 MB
Format: Adobe PDF


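Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.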
