Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23782
Title: | 利用階層模式評估複雜性疾病之脈絡效應與基因效應 Applications of hierarchical modeling on the evaluation of contextual effect and genetic effect for complex disease |
Author: | Shi-Heng Wang 王世亨 |
Advisor: | 陳為堅 |
Keywords: | hierarchical model, complex disease, contextual effect, gene-environment interaction, rare variant, Bayesian analysis |
Year of publication: | 2011 |
Degree: | Doctoral |
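Abstract: | Hierarchical data are very common in research on complex diseases; pedigree studies and matched case-control studies, for example, both produce hierarchical data. A defining feature of such data is that variation arises from more than one source: in addition to variation between individual subjects, there are differences between groups. A hierarchical model can appropriately separate these sources of variation. This thesis uses hierarchical models to handle several types of hierarchical data: (1) study samples in which some subjects come from different areas and some from the same area; (2) a matched case-control study; and (3) a family study with repeated measurements.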
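Some hierarchical data come from studies of how environmental factors affect health, so in addition to individual-level factors, the effect of group-level factors (the contextual effect) must be considered. Individual-level factors include personal characteristics such as sex, age, and socioeconomic status, as well as personal lifestyle habits; group-level effects refer to the influence of certain exposures in the wider environment on individual health. Such group-level factors may affect individual health directly, or may act as modifiers of the association between individual-level factors and health. The first study in this thesis examines the association between a group-level exposure (public recreational facilities in the area of residence) and metabolic syndrome. The sample consisted of 14,658 residents of the Taoyuan area who took part in a health check-up. The group-level environmental factor was public recreational facilities, such as parks or school sports grounds, classified into four categories from low to high according to their density in each township. Multilevel logistic regression models were fitted using MLwiN software version 2.0. After accounting for individual-level explanatory variables, the group-level environmental factor explained a further 7.2% of the variation in metabolic syndrome. Compared with the areas with the lowest density of public recreational facilities, the odds ratios (95% confidence intervals) of meeting the diagnostic criteria for metabolic syndrome in the other three density categories were, from low to high, 0.87 (0.71-1.07), 0.87 (0.68-1.12), and 0.78 (0.61-0.99), and the trend test was statistically significant. This study indicates that residents of areas with a higher density of public recreational facilities have a lower risk of meeting the diagnostic criteria for metabolic syndrome.
Beyond the individual-level and group-level environmental factors examined in the first part, the second part of the thesis examines interactions between factors at different levels, such as the interaction between genes and group-level factors, and their effect on disease. Testing whether such an interaction term is significant usually relies on a likelihood ratio test (LRT) based on large-sample assumptions; however, this approach is not appropriate when the sample size is small, the observations are clustered, or the model has many parameters, and in these settings it is also less likely to detect an interaction. This part therefore applies a Bayesian hierarchical mixed-effects model to data from a matched case-control study to analyze whether the genetic effect differs according to group-level exposure status, with statistical inference based on posterior samples obtained by Markov chain Monte Carlo (MCMC) methods. The results show that the effect of rs1801282 on metabolic syndrome is modified by the density of exercise facilities in the area. The Bayesian hierarchical mixed-effects model used here provides a quantitative assessment of cross-level gene-environment interaction.
The third part considers how to assess the association with schizophrenia when the genetic factor is itself measured with error, as with copy number variation. Because current laboratory methods measure copy number variation with error, the same locus in the same individual often needs to be measured several times. This study collected data on 607 families from across Taiwan, comprising patients from sibships in which the siblings all have schizophrenia and their first-degree relatives. The analysis must therefore account simultaneously for the within-family correlation of the response variable (schizophrenia status) and the correlation induced by repeated measurements of copy number variation. Using a Bayesian hierarchical mixed-effects model, this study first converts the continuous copy number measurements into integer copy numbers, then collapses the copy number variants at different loci to examine their association with the disease. When the copy number variants at three loci were collapsed, 80% of the posterior samples of the genetic effect were greater than zero, indicating that carrying a variant at any of the three loci is associated with a higher likelihood of schizophrenia. This model is helpful for identifying rare disease-causing genes.
In association studies of complex diseases, data are usually collected hierarchically, as in matched case-control or family-based studies. Such nested data contain multiple sources of variability, including similarity among observations within a cluster and among subjects in the same category of a given variable. A hierarchical model can properly account for these sources of variation.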
This thesis considers three types of nested data: (1) individuals from the same area who share a similar environmental effect, (2) a group-level environmental effect that depends on an individual-level genetic factor, and (3) genetic covariates measured repeatedly in a family-based study. The first focus is environmental influence measured at the individual level, such as personal characteristics and lifestyle choices, and at the group level (contextual variables), such as community factors and the degree of local area development. Contextual influence on individual health may be direct, or it may modify the influence of individual-level factors. The first aim is to examine the relationship between a contextual variable, the availability of public facilities for habitual physical activity in the community, and metabolic syndrome in the north of Taiwan, one of the most densely populated countries in the world. Subjects were 14,658 participants aged 40 years or older from 10 districts of Taoyuan County who took part in a health check-up program in 2004–2005. Public facilities for habitual physical activity included school campuses and parks. Multilevel logistic regression models were fitted with MLwiN software version 2.0 to examine effects on metabolic syndrome at both the individual and the contextual level. Adding the contextual variable to the model that already included individual characteristics led to a further 7.2% reduction in the variance. Using facility density level I as the reference, the odds ratios (95% confidence intervals) of metabolic syndrome for levels II, III, and IV were 0.87 (0.71-1.07), 0.87 (0.68-1.12), and 0.78 (0.61-0.99), respectively, and the trend test was statistically significant. Greater availability of free facilities for habitual physical activity in a district was associated with a lower risk of metabolic syndrome among its residents.
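As a rough check on the reported odds ratios, the sketch below recovers the log-odds coefficient, its standard error, and a z statistic from each odds ratio and its 95% confidence interval. It assumes the intervals are Wald intervals, symmetric on the log-odds scale, which the abstract does not state; the function name `wald_from_or` is illustrative and is not part of MLwiN.

```python
import math


def wald_from_or(odds_ratio, lower, upper, z_crit=1.96):
    """Back out (beta, se, z) from an odds ratio and its 95% CI.

    Assumption: the CI was built as exp(beta +/- z_crit * se).
    """
    beta = math.log(odds_ratio)
    # The CI width on the log scale equals 2 * z_crit * se.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return beta, se, beta / se


# Odds ratios and 95% CIs as reported in the abstract (reference: level I).
reported = {
    "II": (0.87, 0.71, 1.07),
    "III": (0.87, 0.68, 1.12),
    "IV": (0.78, 0.61, 0.99),
}

results = {level: wald_from_or(*values) for level, values in reported.items()}

for level, (beta, se, z) in results.items():
    print(f"level {level}: beta={beta:.3f}, se={se:.3f}, z={z:.2f}")
```

Under that assumption, only level IV yields a z statistic beyond the 1.96 threshold, which agrees with its interval (0.61-0.99) being the only one that excludes 1.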
The second part of the thesis explores cross-level gene-environment interaction. To examine the interaction between genes and contextual factors, traditional approaches usually rely on asymptotic tests of whether an interaction exists. Such tests are limited because they cannot quantify how strongly genetic effects differ across clusters or categories of the contextual variable. Here a Bayesian hierarchical mixed-effects model is used to examine whether the genetic effect is modified by residential environment in a matched case-control study of metabolic syndrome. The group-level contextual covariate, availability of exercise facilities, has four categories, from low to high. Based on posterior samples from Markov chain Monte Carlo (MCMC) methods, the interaction between this group-level environmental condition and the individual-level genetic effect is evaluated through differences in allelic effects across the contextual categories. The Bayesian analysis indicates that the effect of rs1801282 on the development of metabolic syndrome is modified by the contextual environmental factor: even with the same genotype, living in a residential area with low availability of exercise facilities may result in higher risk. This Bayesian hierarchical mixed-effects model provides a quantitative assessment of the cross-level interaction between genes and contextual variables. The third part of the thesis focuses on the association between schizophrenia and copy number variation (CNV), which was measured repeatedly. The study sample consisted of 607 families of individuals with schizophrenia and their first-degree relatives. Here correlation arises not only between family members but also among repeated CNV measurements. The Bayesian hierarchical model probabilistically assigns integer copy numbers to the repeatedly measured quantitative values and can test the association for all variants under study simultaneously.
When all three CNVs were collapsed, the posterior probability of an increased risk was 80%, suggesting a higher risk of schizophrenia. This model is useful for researchers who use CNVs to identify novel target genes. |
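The 80% figure can be read as the share of posterior draws of the collapsed genetic effect that exceed zero. A minimal sketch of that summary follows; the draws are simulated from a normal distribution purely for illustration (they are not the thesis's MCMC output), and the name `posterior_prob_positive` is hypothetical.

```python
import random


def posterior_prob_positive(draws):
    """Fraction of posterior draws strictly greater than zero."""
    draws = list(draws)
    if not draws:
        raise ValueError("at least one posterior draw is required")
    return sum(1 for d in draws if d > 0) / len(draws)


# Hypothetical stand-in for MCMC output: 20,000 draws of a log-odds effect.
# The mean 0.84 and standard deviation 1.0 are illustrative choices only.
rng = random.Random(2011)
simulated_draws = [rng.gauss(0.84, 1.0) for _ in range(20000)]

prob = posterior_prob_positive(simulated_draws)
print(f"Estimated P(effect > 0 | data): {prob:.2f}")
```

With real MCMC output, the same function would be applied to the retained draws of the effect parameter after burn-in.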
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23782 |
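Full-text authorization: | Not authorized |
Appears in departments: | Institute of Epidemiology and Preventive Medicine |
Files in this item:
File | Size | Format | |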
---|---|---|---|
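ntu-100-1.pdf Not currently authorized for public access | 1.47 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.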