Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60691
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 洪弘 (Hung Hung), 郭柏秀 (Po-Hsiu Kuo)
dc.contributor.author | Yi-Hsuan Chen | en
dc.contributor.author | 陳逸萱 | zh_TW
dc.date.accessioned | 2021-06-16T10:26:15Z | -
dc.date.available | 2016-09-24
dc.date.copyright | 2013-09-24
dc.date.issued | 2013
dc.date.submitted | 2013-08-15
dc.identifier.citation | 1. Feuk, L.; Carson, A.R.; Scherer, S.W., Structural variation in the human genome. Nat Rev Genet 2006, 7, 85-97.
2. Zhang, F.; Gu, W.; Hurles, M.E.; Lupski, J.R., Copy number variation in human health, disease, and evolution. Annu Rev Genomics Hum Genet 2009, 10, 451-481.
3. Zollner, S.; Teslovich, T.M., Using GWAS data to identify copy number variants contributing to common complex diseases. Statistical Science 2009, 24, 530-546.
4. Macgregor, S.; Visscher, P.M.; Montgomery, G., Analysis of pooled DNA samples on high density arrays without prior knowledge of differential hybridization rates. Nucleic Acids Res 2006, 34, e55.
5. Sham, P.; Bader, J.S.; Craig, I.; O'Donovan, M.; Owen, M., DNA pooling: A tool for large-scale association studies. Nat Rev Genet 2002, 3, 862-871.
6. Lin, C.H.; Huang, M.C.; Li, L.H.; Wu, J.Y.; Chen, Y.T.; Fann, C.S., Genome-wide copy number analysis using copy number inferring tool (CNIT) and DNA pooling. Hum Mutat 2008, 29, 1055-1062.
7. Barnes, C.; Plagnol, V.; Fitzgerald, T.; Redon, R.; Marchini, J.; Clayton, D.; Hurles, M.E., A robust statistical method for case-control association testing with copy number variation. Nature Genetics 2008, 40, 1245-1252.
8. Cardin, N.; Holmes, C.; Donnelly, P.; Marchini, J., Bayesian hierarchical mixture modeling to assign copy number from a targeted CNV array. Genet Epidemiol 2011, 35, 536-548.
9. Zhang, J.; Liang, F., Robust clustering using exponential power mixtures. Biometrics 2010, 66, 1078-1086.
10. Gonzalez, J.R.; Subirana, I.; Escaramis, G.; Peraza, S.; Caceres, A.; Estivill, X.; Armengol, L., Accounting for uncertainty when assessing association between copy number and disease: A latent class model. BMC Bioinformatics 2009, 10, 172.
11. Tseng, G.C.; Wong, W.H., Tight clustering: A resampling-based approach for identifying stable and tight patterns in data. Biometrics 2005, 61, 10-16.
12. Sorzano, C.O.; Bilbao-Castro, J.R.; Shkolnisky, Y.; Alcorlo, M.; Melero, R.; Caffarena-Fernandez, G.; Li, M.; Xu, G.; Marabini, R.; Carazo, J.M., A clustering approach to multireference alignment of single-particle projections in electron microscopy. J Struct Biol 2010, 171, 197-206.
13. Shiu, S.-Y.; Chen, T.-L., Clustering by self-updating process. arXiv:1201.1979 [stat.ME] 2012.
14. Ting-Li Chen, H.H., I-Ping Tu, Pei-Shien Wu, Dai-Ni Hsieh, Wei-Hau Chang, Su-Yun Huang, Gamma-sup: A self-updating clustering algorithm based on minimum gamma-divergence with application to cryo-EM images. arXiv:1205.2034 [stat.ME] 2013.
15. Wang, K.; Li, M.; Hadley, D.; Liu, R.; Glessner, J.; Grant, S.F.; Hakonarson, H.; Bucan, M., PennCNV: An integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res 2007, 17, 1665-1674.
16. Lin, C.H.; Lin, Y.C.; Wu, J.Y.; Pan, W.H.; Chen, Y.T.; Fann, C.S., A genome-wide survey of copy number variations in Han Chinese residing in Taiwan. Genomics 2009, 94, 241-246.
17. Lou, H.; Li, S.; Yang, Y.; Kang, L.; Zhang, X.; Jin, W.; Wu, B.; Jin, L.; Xu, S., A map of copy number variations in Chinese populations. PLoS ONE 2011, 6, e27341.
18. Malhotra, D.; McCarthy, S.; Michaelson, J.J.; Vacic, V.; Burdick, K.E.; Yoon, S.; Cichon, S.; Corvin, A.; Gary, S.; Gershon, E.S., et al., High frequencies of de novo CNVs in bipolar disorder and schizophrenia. Neuron 2011, 72, 951-963.
19. CF, K.; PH, K., Risk and information evaluation of prioritized genes for complex traits: Application to bipolar disorder
20. Wang, J.; Duncan, D.; Shi, Z.; Zhang, B., Web-based gene set analysis toolkit (WebGestalt): Update 2013. Nucleic Acids Res 2013, 41, W77-83.
21. Tu, I.P., An eigenvector variability plot. Statistica Sinica 2009, 19, 1741.
22. Meinshausen, N.; Buhlmann, P., Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2010, 72, 417-473.
23. Han Liu, K.R., Larry Wasserman, Stability approach to regularization selection (StARS) for high dimensional graphical models. arXiv:1006.3316 [stat.ML] 2010.
24. Monti, S.; Tamayo, P.; Mesirov, J.; Golub, T., Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data. Machine Learning 2003, 52, 91-118.
25. Mollah, M.N.; Sultana, N.; Minami, M.; Eguchi, S., Robust extraction of local structures by the minimum beta-divergence method. Neural Netw 2010, 23, 226-238.
26. Priebe, L.; Degenhardt, F.A.; Herms, S.; Haenisch, B.; Mattheisen, M.; Nieratschker, V.; Weingarten, M.; Witt, S.; Breuer, R.; Paul, T., et al., Genome-wide survey implicates the influence of copy number variants (CNVs) in the development of early-onset bipolar disorder. Mol Psychiatry 2012, 17, 421-432.
27. Yang, S.; Wang, K.; Gregory, B.; Berrettini, W.; Wang, L.S.; Hakonarson, H.; Bucan, M., Genomic landscape of a three-generation pedigree segregating affective disorder. PLoS ONE 2009, 4, e4474.
28. Grozeva D, K.G.I.D.; et al., Rare copy number variants: A point of rarity in genetic risk for bipolar disorder and schizophrenia. Archives of General Psychiatry 2010, 67, 318-327.
29. Bergen, S.E.; O'Dushlaine, C.T.; Ripke, S.; Lee, P.H.; Ruderfer, D.M.; Akterin, S.; Moran, J.L.; Chambert, K.D.; Handsaker, R.E.; Backlund, L., et al., Genome-wide association study in a Swedish population yields support for greater CNV and MHC involvement in schizophrenia compared with bipolar disorder. Mol Psychiatry 2012, 17, 880-886.
30. McQuillin, A.; Bass, N.; Anjorin, A.; Lawrence, J.; Kandaswamy, R.; Lydall, G.; Moran, J.; Sklar, P.; Purcell, S.; Gurling, H., Analysis of genetic deletions and duplications in the University College London bipolar disorder case control sample. Eur J Hum Genet 2011, 19, 588-592.
31. Siuly; Li, Y.; Wen, P.P., Clustering technique-based least square support vector machine for EEG signal classification. Comput Methods Programs Biomed 2011, 104, 358-372.
32. Ben-Shachar, S.; Lanpher, B.; German, J.R.; Qasaymeh, M.; Potocki, L.; Nagamani, S.C.; Franco, L.M.; Malphrus, A.; Bottenfield, G.W.; Spence, J.E., et al., Microdeletion 15q13.3: A locus with incomplete penetrance for autism, mental retardation, and psychiatric disorders. J Med Genet 2009, 46, 382-388.
33. Miller, D.T.; Shen, Y.; Weiss, L.A.; Korn, J.; Anselm, I.; Bridgemohan, C.; Cox, G.F.; Dickinson, H.; Gentile, J.; Harris, D.J., et al., Microdeletion/duplication at 15q13.2q13.3 among individuals with features of autism and other neuropsychiatric disorders. J Med Genet 2009, 46, 242-248.
34. Jin, G.; Sun, J.; Liu, W.; Zhang, Z.; Chu, L.W.; Kim, S.T.; Feng, J.; Duggan, D.; Carpten, J.D.; Wiklund, F., et al., Genome-wide copy-number variation analysis identifies common genetic variants at 20p13 associated with aggressiveness of prostate cancer. Carcinogenesis 2011, 32, 1057-1062.
35. Trachoo, O.; Assanatham, M.; Jinawath, N.; Nongnuch, A., Chromosome 20p inverted duplication deletion identified in a Thai female adult with mental retardation, obesity, chronic kidney disease and characteristic facial features. Eur J Med Genet 2013, 56, 319-324.
36. Olsen, L.; Hansen, T.; Djurovic, S.; Haastrup, E.; Albrecthsen, A.; Hoeffding, L.K.; Secher, A.; Gustafsson, O.; Jakobsen, K.D.; Nielsen, F.C., et al., Copy number variations in affective disorders and meta-analysis. Psychiatr Genet 2011, 21, 319-322.
37. Glessner, J.T.; Reilly, M.P.; Kim, C.E.; Takahashi, N.; Albano, A.; Hou, C.; Bradfield, J.P.; Zhang, H.; Sleiman, P.M.A.; Flory, J.H., et al., Strong synaptic transmission impact by copy number variations in schizophrenia. Proceedings of the National Academy of Sciences 2010, 107, 10584-10589.
38. Sampaio, A.S.; Fagerness, J.; Crane, J.; Leboyer, M.; Delorme, R.; Pauls, D.L.; Stewart, S.E., Association between polymorphisms in GRIK2 gene and obsessive-compulsive disorder: A family-based study. CNS Neurosci Ther 2011, 17, 141-147.
39. Kim, S.A.; Kim, J.H.; Park, M.; Cho, I.H.; Yoo, H.J., Family-based association study between GRIK2 polymorphisms and autism spectrum disorders in the Korean trios. Neurosci Res 2007, 58, 332-335.
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60691 | -
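dc.description.abstract | Copy number variation (CNV) is a structural variation in DNA, and in recent years many studies have reported that it is associated with a number of complex diseases. Array-based technology helps us quickly scan signals for large numbers of CNVs, and many newly developed statistical methods attempt to estimate copy numbers from the experimentally measured signal values. The main problem these methods face is that discrete copy number values must be estimated from continuous signal values read from a series of markers; beyond that, we also want to perform association tests to identify relationships between CNVs and disease.
In the first stage of CNV analysis, we usually search genome-wide signals for CNV segments related to the disease. Because CNVs are rare DNA variants with relatively small effects, it is difficult to compare cases with non-cases. In recent years, many studies have adopted genome-wide scans of pooled samples as an analysis strategy to save cost; however, because of the complexity of CNVs, applying this strategy to CNV detection faces even greater challenges. In this study, we aim to develop a procedure that uses pooled samples to identify relationships between CNVs and disease. We establish a series of filtering steps to remove results that are likely false positives and apply the procedure to CNV data on bipolar disorder. We first define the CNV regions of each pool, then select the CNV regions whose distributions differ between the case and control groups, and finally explore the association between CNVs and bipolar disorder by integrating the functions of the genes mapped to these regions and comparing them with previously published studies.
In the second stage of CNV analysis, we can use cluster analysis to estimate copy numbers from the validation signal values obtained from a specific segment. However, because CNV data show weak clustering tendencies and contain outliers, it is difficult to find the correct grouping. γ-SUP is a newly developed method that can handle these problems with CNV data and does not require the number of clusters to be determined in advance. γ-SUP requires choosing a parameter τ that affects the number of clusters, but there is no established basis linking the subjective parameter-selection method suggested by its authors to the quality of the results. In this study, we aim to develop a method for selecting the γ-SUP parameter based on the concept of stability. A stable cluster analysis is one whose clustering results can be reproduced many times, so we use resampling to measure an index of stability. Simulation analyses show that the proposed method can find an appropriate parameter, and we then apply it to CNV data on autism.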
zh_TW
dc.description.abstract | Copy number variation (CNV) is a type of structural variation in DNA segments that has been reported to be associated with a number of complex diseases. Array-based technology enables fast scanning of large numbers of CNVs, and many statistical strategies have been developed to estimate copy numbers from experimental data. The challenge lies in estimating discrete copy numbers from continuous signals called from a set of markers. A further complexity is performing association testing between CNVs and diseases at the same time.
At the first stage of CNV analysis, genome-wide data are searched for CNV regions related to the trait of interest. Because CNVs are rare and have small effect sizes, it is generally difficult to compare their frequency between cases and controls using traditional statistical methods. Recently, DNA pooling strategies have been adopted to save genotyping cost. However, CNV detection is even more challenging with pooled data. The first aim of this study is to develop a series of procedures to detect associations between CNVs and a trait of interest using a pooling strategy. We set a series of criteria to filter out noise in the data and reduce false-positive findings. We applied our procedures to an empirical CNV dataset of bipolar disorder. First, we defined CNV regions for every pool. Second, we selected CNV regions with different patterns between case and control pools. Finally, we integrated our findings with the functions of the mapped genes and the results of previous studies to explore the associations between CNVs and bipolar disorder.
At the second stage of CNV analysis, we apply a clustering procedure to estimate copy numbers from the validated signals of the specified CNV region. When clustering quality is poor and the CNV data contain outliers, identifying the correct clusters is more challenging. The γ-self-updating process (γ-SUP) is a newly developed method that can overcome these problems and does not require the number of clusters to be specified in advance. The performance of γ-SUP relies on the selection of a tuning parameter τ. However, the relationship between the subjective selection rule and the performance of the final clustering output is unclear. The second aim of this study is to develop a procedure for selecting τ in γ-SUP based on the idea of stability. In our method, stability is defined as the reproducibility of the clustering results, and a measure of instability is constructed using a resampling scheme. Simulation studies show that the proposed selection criterion provides an adequate value of τ. Furthermore, we applied this method to an empirical CNV dataset of autism.
en
dc.description.provenance | Made available in DSpace on 2021-06-16T10:26:15Z (GMT). No. of bitstreams: 1
ntu-102-R00849026-1.pdf: 1326386 bytes, checksum: 023896653c53654c25174701eea9ed0c (MD5)
Previous issue date: 2013
en
dc.description.tableofcontents | Acknowledgements
Chinese Abstract
Abstract
Contents
List of Figures
List of Tables
1 Introduction
2 Methods for Strategy I: CNV analysis procedures for pooling data
2.1 Subjects, DNA pooling construction and genotyping
2.2 Criteria for CNV analysis
2.3 Exploring the CNV regions related to bipolar disorder
3 Methods for Strategy II: CNV clustering
3.1 Review of γ-SUP
3.2 Stability
3.3 Procedure for τ selection
4 Simulation
4.1 Simulation with p = 1
4.2 Simulation with p = 2
4.3 Simulation with large p
5 Real data application I: BPD CNV dataset
5.1 Exploring the CNV regions related to BPD
6 Real data application II: Autism CNV dataset
6.1 Pre-procedure and post-procedure for CNV data clustering
6.2 Numeric analysis
7 Discussion
References
dc.language.iso | en
dc.subject | 過濾程序 | zh_TW
dc.subject | 穩定性 | zh_TW
dc.subject | 混合DNA | zh_TW
dc.subject | γ-SUP | zh_TW
dc.subject | 關聯性檢定 | zh_TW
dc.subject | 拷貝數變異 | zh_TW
dc.subject | 集群分析 | zh_TW
dc.subject | stability | en
dc.subject | association testing | en
dc.subject | DNA pooling | en
dc.subject | filtering procedures | en
dc.subject | clustering | en
dc.subject | γ-SUP | en
dc.subject | copy number variation (CNV) | en
dc.title | 拷貝數變異資料關聯性檢定之分析策略 | zh_TW
dc.title | Development of analytic strategies to improve association testing with copy number variation (CNV) data | en
dc.type | Thesis
dc.date.schoolyear | 101-2
dc.description.degree | 碩士 (Master's)
dc.contributor.oralexamcommittee | 蕭朱杏 (Chuhsing Kate Hsiao), 高淑芬 (Shur-Fen Gau)
dc.subject.keyword | 拷貝數變異, 關聯性檢定, 混合DNA, 過濾程序, 集群分析, γ-SUP, 穩定性 | zh_TW
dc.subject.keyword | copy number variation (CNV), association testing, DNA pooling, filtering procedures, clustering, γ-SUP, stability | en
dc.relation.page | 66
dc.rights.note | 有償授權 (paid license)
dc.date.accepted | 2013-08-15
dc.contributor.author-college | 公共衛生學院 (College of Public Health) | zh_TW
dc.contributor.author-dept | 流行病學與預防醫學研究所 (Institute of Epidemiology and Preventive Medicine) | zh_TW
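Appears in Collections: 流行病學與預防醫學研究所 (Institute of Epidemiology and Preventive Medicine)

Files in This Item:
File | Size | Format
ntu-102-1.pdf (not authorized for public access) | 1.3 MB | Adobe PDF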


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
