DNA Copy Number Data Analysis in Human Genomes (DNA 拷貝數在人類基因體中的分析)

Peng-An Chen; 陳芃安

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/37930

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	趙坤茂(Kun-Mao Chao)
dc.contributor.author	Peng-An Chen	en
dc.contributor.author	陳芃安	zh_TW
dc.date.accessioned	2021-06-13T15:51:46Z	-
dc.date.available	2008-07-03
dc.date.copyright	2008-07-03
dc.date.issued	2008
dc.date.submitted	2008-06-25
dc.identifier.citation	[1] C. D. Baldus, S. Liyanarachchi, K. Mrózek, H. Auer, S. M. Tanner, M. Guimond, A. S. Ruppert, N. Mohamed, R. V. Davuluri, M. A. Caligiuri, et al. Acute myeloid leukemia with complex karyotypes and abnormal chromosome 21: amplification discloses overexpression of APP, ETS2, and ERG genes. Proceedings of the National Academy of Sciences, 101:3915–3920, 2004.
[2] M. T. Barrett, A. Scheffer, A. Ben-Dor, N. Sampas, D. Lipson, R. Kincaid, P. Tsang, B. Curry, K. Baird, P. S. Meltzer, Z. Yakhini, L. Bruhn, and S. Laderman. Comparative genomic hybridization using oligonucleotide microarrays and total genomic DNA. Proceedings of the National Academy of Sciences, 101:17765–17770, 2004.
[3] A. Ben-Dor, D. Lipson, A. Tsalenko, M. Reimers, L. O. Baumbusch, M. T. Barrett, J. N. Weinstein, A.-L. Børresen-Dale, and Z. Yakhini. Framework for identifying common aberrations in DNA copy number data. In Proceedings of the 11th Annual International Conference on Research in Computational Molecular Biology. Lecture Notes in Computer Science, volume 4453, pages 122–136, 2007.
[4] J. M. Bennett, D. Catovsky, M.-T. Daniel, G. Flandrin, D. A. G. Galton, H. R. Gralnick, and C. Sultan. Proposals for the classification of the acute leukaemias French-American-British (FAB) Co-operative Group. British Journal of Haematology, 33:451–458, 1976.
[5] T. Bernholt, F. Eisenbrand, and T. Hofmeister. A geometric framework for solving subsequence problems in computational biology efficiently. In Proceedings of the 23rd Annual Symposium on Computational Geometry, pages 310–318, 2007.
[6] T. Bernholt and T. Hofmeister. An algorithm for a generalized maximum subsequence problem. In 7th Latin American Symposium, pages 178–189, 2006.
[7] N. P. Carter. Methods and strategies for analyzing copy number variation using DNA microarrays. Nature Genetics, 39:S16–S21, 2007.
[8] T. M. Chan. More algorithms for all-pairs shortest paths in weighted graphs. In Annual ACM Symposium on Theory of Computing, pages 590–598, 2007.
[9] G. A. Churchill. Fundamentals of experimental design for cDNA microarrays. Nature Genetics, 32:490–495, 2002.
[10] A. C. Davison and D. V. Hinkley. Bootstrap Methods and Their Application. Cambridge University Press, 1997.
[11] S. J. Diskin, T. Eck, J. Greshock, Y. P. Mosse, T. Naylor, J. Christian J. Stoeckert, B. L. Weber, J. M. Maris, and G. R. Grant. STAC: A method for testing the significance of DNA copy number aberrations across multiple array-CGH experiments. Genome Research, 16:1149–1158, 2006.
[12] B. Efron and R. Tibshirani. An Introduction to the Bootstrap. Chapman & Hall/CRC, 1994.
[13] P. H. C. Eilers and R. X. de Menezes. Quantile smoothing of array CGH data. Bioinformatics, 21:1146–1153, 2005.
[14] W. El-Rifai, E. Elonen, M. Larramendy, T. Ruutu, and S. Knuutila. Chromosomal breakpoints and changes in DNA copy number in refractory acute myeloid leukemia. Leukemia, 11:958–963, 1997.
[15] T.-H. Fan, S. Lee, H.-I. Lu, T.-S. Tsou, T.-C. Wang, and A. Yao. An optimal algorithm for maximum-sum segment and its application in bioinformatics. In 8th International Conference on Implementation and Application of Automata, pages 251–257, 2003.
[16] L. Feuk, A. R. Carson, and S. W. Scherer. Structural variation in the human genome. Nature Reviews Genetics, 7:85–97, 2006.
[17] J. L. Freeman, G. H. Perry, L. Feuk, R. Redon, S. A. McCarroll, D. M. Altshuler, H. Aburatani, K. W. Jones, C. Tyler-Smith, M. E. Hurles, N. P. Carter, S. W. Scherer, and C. Lee. Copy number variation: new insights in genome diversity. Genome Research, 16:949–961, 2006.
[18] J. Fridlyand, A. M. Snijders, D. Pinkel, D. G. Albertson, and A. N. Jain. Hidden Markov models approach to the analysis of array CGH data. Journal of Multivariate Analysis, 90:132–153, 2004.
[19] A. Hamosh, A. F. Scott, J. Amberger, C. Bocchini, D. Valle, and V. A. McKusick. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Research, 30:52–55, 2002.
[20] R. V. Hogg and E. A. Tanis. Probability and Statistical Inference. Pearson Prentice Hall, 7th edition, 2006.
[21] L. Hsu, S. G. Self, D. Grove, T. Randolph, K. Wang, J. J. Delrow, L. Loo, and P. Porter. Denoising array-based comparative genomic hybridization data using wavelets. Biostatistics, 6(2):211–226, 2005.
[22] Y.-T. Huang. A Study on Some Optimization Problems Related to SNPs and Haplotypes. PhD thesis, National Taiwan University, 2006.
[23] P. Hupé, N. Stransky, J.-P. Thiery, F. Radvanyi, and E. Barillot. Analysis of array CGH data: from signal ratio to gain and loss of DNA regions. Bioinformatics, 20(18):3413–3422, 2004.
[24] A. J. Iafrate, L. Feuk, M. N. Rivera, M. L. Listewnik, P. K. Donahoe, Y. Qi, S. W. Scherer, and C. Lee. Detection of large-scale variation in the human genome. Nature Genetics, 36:949–951, 2004.
[25] K. Inoue and J. R. Lupski. Molecular mechanisms for genomic disorders. Annual Review of Genomics and Human Genetics, 3:199–242, 2002.
[26] K. Jong, E. Marchiori, A. van der Vaart, B. Ylstra, M. Weiss, and G. Meijer. Chromosomal breakpoint detection in human cancer. In Applications of Evolutionary Computing, Lecture Notes in Computer Science, volume 2611, pages 107–116, 2003.
[27] A. Kallioniemi, O.-P. Kallioniemi, D. Sudar, D. Rutovitz, J. W. Gray, F. Waldman, and D. Pinkel. Comparative genomic hybridization for molecular cytogenetic analysis of solid tumors. Science, 258:818–821, 1992.
[28] D. Komura, F. Shen, S. Ishikawa, K. R. Fitch, W. Chen, J. Zhang, G. Liu, S. Ihara, H. Nakamura, M. E. Hurles, et al. Genome-wide detection of human copy number variations using high-density DNA oligonucleotide arrays. Genome Research, 16:1575–1584, 2006.
[29] W. R. Lai, M. D. Johnson, R. Kucherlapati, and P. J. Park. Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data. Bioinformatics, 21(19):3763–3770, 2005.
[30] Z.-Y. Li, D.-P. Liu, and C.-C. Liang. New insight into the molecular mechanisms of MLL-associated leukemia. Leukemia, 19:183–190, 2005.
[31] H. W. Lilliefors. On the Kolmogorov-Smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association, 62(318):399–402, 1967.
[32] Y.-L. Lin, T. Jiang, and K.-M. Chao. Efficient algorithms for locating the length-constrained heaviest segments with applications to biomolecular sequence analysis. Journal of Computer and System Sciences, 65(3):570–586, 2002.
[33] O. C. Lingjærde, L. O. Baumbusch, K. Liestøl, I. K. Glad, and A.-L. Børresen-Dale. CGH-Explorer: a program for analysis of array-CGH data. Bioinformatics, 21:821–822, 2005.
[34] D. Lipson, Y. Aumann, A. Ben-Dor, N. Linial, and Z. Yakhini. Efficient calculation of interval scores for DNA copy number data analysis. Journal of Computational Biology, 13(2):215–228, 2006.
[35] H.-F. Liu. Constrained Searching and Ordering Problems on Sequences. PhD thesis, National Taiwan University, 2008.
[36] H.-F. Liu, P.-A. Chen, and K.-M. Chao. Algorithms for computing the length-constrained max-score segments with applications to DNA copy number data analysis. In Proceedings of the 18th International Symposium on Algorithms and Computation. Lecture Notes in Computer Science, volume 4835, pages 834–845, 2007.
[37] R. Lucito, J. Healy, J. Alexander, A. Reiner, D. Esposito, M. Chi, L. Rodgers, A. Brady, J. Sebat, J. Troge, et al. Representational oligonucleotide microarray analysis: A high-resolution method to detect genome copy number variation. Genome Research, 13:2291–2305, 2003.
[38] S. A. McCarroll and D. M. Altshuler. Copy-number variation and association studies of human disease. Nature Genetics, 39:S37–S42, 2007.
[39] D. S. Moore and G. P. McCabe. Introduction to the Practice of Statistics. W.H. Freeman & Company, 5th edition, 2006.
[40] K. Mrózek, G. Marcucci, P. Paschka, S. P. Whitman, and C. D. Bloomfield. Clinical relevance of mutations and gene-expression changes in adult acute myeloid leukemia with normal cytogenetics: are we ready for a prognostically prioritized molecular classification? Blood, 109:431–448, 2007.
[41] C. L. Myers, M. J. Dunham, S. Y. Kung, and O. G. Troyanskaya. Accurate detection of aneuploidies in array CGH and gene expression microarray data. Bioinformatics, 20(18):3533–3543, 2004.
[42] A. B. Olshen and E. S. Venkatraman. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics, 5(4):557–572, 2004.
[43] A. Papoulis. Probability, Random Variables, and Stochastic Processes. McGraw Hill, 3rd edition, 1991.
[44] F. Picard, S. Robin, M. Lavielle, C. Vaisse, and J.-J. Daudin. A statistical approach for array CGH data analysis. BMC Bioinformatics, 6:27, 2005.
[45] D. Pinkel, R. Segraves, D. Sudar, S. Clark, I. Poole, D. Kowbel, C. Collins, W.-L. Kuo, C. Chen, Y. Zha, et al. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nature Genetics, 20:207–211, 1998.
[46] J. R. Pollack, C. M. Perou, A. A. Alizadeh, M. B. Eisen, A. Pergamenschikov, C. F. Williams, S. S. Jeffrey, D. Botstein, and P. O. Brown. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nature Genetics, 23(1):41–46, 1999.
[47] J. R. Pollack, T. Sorlie, C. M. Perou, C. A. Rees, S. S. Jeffrey, P. E. Lonning, R. Tibshirani, D. Botstein, A.-L. Borresen-Dale, and P. O. Brown. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proceedings of the National Academy of Sciences, 99(20):12963–12968, 2002.
[48] J. Quackenbush. Microarray data normalization and transformation. Nature Genetics, 32:496–501, 2002.
[49] F. G. Rücker, L. Bullinger, C. Schwaenen, D. B. Lipka, S. Wessendorf, S. Fröhling, M. Bentz, S. Miller, C. Scholl, R. F. Schlenk, et al. Disclosure of candidate genes in acute myeloid leukemia with complex karyotypes using microarray-based molecular characterization. Journal of Clinical Oncology, 24:3887–3894, 2006.
[50] R. Redon, S. Ishikawa, K. R. Fitch, L. Feuk, G. H. Perry, T. D. Andrews, H. Fiegler, M. H. Shapero, A. R. Carson, W. Chen, et al. Global variation in copy number in the human genome. Nature, 444:444–454, 2006.
[51] A. Renneville, C. Roumier, V. Biggio, O. Nibourel, N. Boissel, P. Fenaux, and C. Preudhomme. Cooperating gene mutations in acute myeloid leukemia: a review of the literature. Leukemia, 22:915–931, 2008.
[52] C. Rouveirol, N. Stransky, P. Hupé, P. L. Rosa, E. Viara, E. Barillot, and F. Radvanyi. Computation of recurrent minimal genomic alterations from array-CGH data. Bioinformatics, 22(7):849–856, 2006.
[53] J. Sebat, B. Lakshmi, J. Troge, J. Alexander, J. Young, P. Lundin, S. Månér, H. Massa, M. Walker, M. Chi, et al. Large-scale copy number polymorphism in the human genome. Science, 305:525–528, 2004.
[54] S. P. Shah, W. L. Lam, R. T. Ng, and K. P. Murphy. Modeling recurrent DNA copy number alterations in array CGH data. Bioinformatics, 23:i450–i458, 2007.
[55] S. S. Shapiro and M. B. Wilk. An analysis of variance test for normality (complete samples). Biometrika, 52:591–611, 1965.
[56] A. J. Sharp, D. P. Locke, S. D. McGrath, Z. Cheng, J. A. Bailey, R. U. Vallente, L. M. Pertz, R. A. Clark, S. Schwartz, R. Segraves, V. V. Oseroff, D. G. Albertson, D. Pinkel, and E. E. Eichler. Segmental duplications and copy-number variation in the human genome. The American Journal of Human Genetics, 77(1):78–88, 2005.
[57] S. Solinas-Toldo, S. Lampel, S. Stilgenbauer, J. Nickolenko, A. Benner, H. Döhner, T. Cremer, and P. Lichter. Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes, Chromosomes and Cancer, 20(4):399–407, 1997.
[58] E. Tuzun, A. J. Sharp, J. A. Bailey, R. Kaul, V. A. Morrison, L. M. Pertz, E. Haugen, H. Hayden, D. Albertson, D. Pinkel, et al. Fine-scale structural variation of the human genome. Nature Genetics, 37:727–732, 2005.
[59] J. W. Vardiman, N. L. Harris, and R. D. Brunning. The World Health Organization (WHO) classification of the myeloid neoplasms. Blood, 100:2292–2302, 2002.
[60] E. S. Venkatraman and A. B. Olshen. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics, 23(6):657–663, 2007.
[61] A. R. M. von Bergh, P. M. Wijers, A. J. Groot, S. van Zelderen-Bhola, J. H. F. Falkenburg, P. M. Kluin, and E. Schuuring. Identification of a novel RAS GTPase-activating protein (RASGAP) gene at 9q34 as an MLL fusion partner in a patient with de novo acute myeloid leukemia. Genes, Chromosomes and Cancer, 39:324–334, 2004.
[62] H. Willenbrock and J. Fridlyand. A comparison study: applying segmentation to array CGH data for downstream analyses. Bioinformatics, 21(22):4084–4091, 2005.
[63] A. F. Wright. Nature Encyclopedia of the Human Genome 2, pages 959–968. Nature Publishing Group, London, 2003.
[64] Y. Yamashita, K. Minoura, T. Taya, S.-i. Fujiwara, K. Kurashina, H. Watanabe, Y. L. Choi, M. Soda, H. Hatanaka, M. Enomoto, et al. Analysis of chromosome copy number in leukemic cells by different microarray platforms. Leukemia, 21:1333–1337, 2007.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/37930	-
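dc.description.abstract	Copy number variation (CNV) is a type of structural variation in the human genome and is known to be associated with many genetic diseases. Array comparative genomic hybridization (array CGH) gives biologists high-resolution analysis of DNA copy number. As the resolution of array CGH keeps rising, however, its signal-to-noise ratio gradually decreases. To handle this noise, array CGH experimental data are analyzed further to find the segments of the human genome that show copy number variation. Lipson et al. proposed a statistical framework that precisely locates the boundaries of copy number variations in the human genome and provides a significance measure for each segment. In this framework, the noise in array CGH data is assumed to be normally distributed, yet no study so far supports this assumption. Many other statistical methods rest on the same assumption. In this thesis, we make no assumption about the noise distribution and propose an improved framework, together with a systematic method for selecting its parameters. Under our framework, we use an algorithm proposed by Bernholt et al. to find the segments of the human genome that show copy number variation. However, the algorithm of Bernholt et al. cannot find duplication and deletion events of DNA segments separately. We therefore also propose a linear-time algorithm that finds duplication and deletion events separately. We use the improved framework to analyze array CGH experimental data from acute myeloid leukemia. Our method locates copy number variations more precisely and finds many DNA segments containing genes related to acute myeloid leukemia.	zh_TW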
dc.description.abstract	Copy number variations (CNVs) are a kind of structural variation in the human genome and are associated with many genetic diseases. Array CGH provides biologists with high-resolution analysis of DNA copy number data. As the resolution of array CGH increases, however, the signal-to-noise ratio of recent array CGH data decreases. To handle the noise, the experimental results are further analyzed to locate the copy number variations in the human genome. Lipson et al. [Journal of Computational Biology, 13(2):215-228, 2006] propose a statistical framework that finds the boundaries of copy number variations in the human genome accurately and provides a significance score for each aberration call. The framework assumes that the noise in array CGH data is normally distributed. However, there is no evidence supporting this assumption, and many other statistical approaches rely on it as well. In this thesis, we propose an improved framework that does not make this assumption. We also develop a systematic method for selecting the parameters of our framework. A linear-time algorithm proposed by Bernholt et al. [7th Latin American Symposium, pages 178-189, 2006] is used to find copy number variations under this framework. However, their algorithm cannot find duplication events and deletion events separately. Thus, we propose a linear-time algorithm for this purpose. We demonstrate the power of our methods by applying them to an array CGH dataset from leukemia patients. Our methods locate the CNVs in the array CGH data more accurately and find regions that contain genes related to acute myeloid leukemia.	en
dc.description.provenance	Made available in DSpace on 2021-06-13T15:51:46Z (GMT). No. of bitstreams: 1 ntu-97-R95922022-1.pdf: 3085228 bytes, checksum: 9dbc9aef8e7c9c95527ce28c428811fd (MD5) Previous issue date: 2008	en
dc.description.tableofcontents	1 Introduction; 1.1 Related Works; 1.2 Problem Definitions; 2 The Aberrant Segments Finding Problem; 2.1 Preliminaries; 2.2 Methods; 2.3 Lower Bound Selection; 2.4 Results and Discussion; 2.5 Analysis of DNA Copy Number Data among Multiple Samples; 3 The Amplification Segments Finding Problem; 3.1 Algorithms; 3.2 Time Analysis; 4 Concluding Remarks; 4.1 Future Works; Bibliography
dc.language.iso	en
dc.subject	Acute Myeloid Leukemia (急性骨髓性白血病)	zh_TW
dc.subject	Array Comparative Genomic Hybridization (微陣列基因體比較雜合法)	zh_TW
dc.subject	Copy Number Variation (拷貝數變異)	zh_TW
dc.subject	Acute Myeloid Leukemia	en
dc.subject	Copy Number Variation	en
dc.subject	Array Comparative Genomic Hybridization	en
dc.title	Analysis of DNA Copy Number in the Human Genome (DNA 拷貝數在人類基因體中的分析)	zh_TW
dc.title	DNA Copy Number Data Analysis in Human Genomes	en
dc.type	Thesis
dc.date.schoolyear	96-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	徐熊健,沈錳坤,林耀鈴
dc.subject.keyword	Copy Number Variation (拷貝數變異), Array Comparative Genomic Hybridization (微陣列基因體比較雜合法), Acute Myeloid Leukemia (急性骨髓性白血病)	zh_TW
dc.subject.keyword	Copy Number Variation, Array Comparative Genomic Hybridization, Acute Myeloid Leukemia	en
dc.relation.page	44
dc.rights.note	Licensed for a fee
dc.date.accepted	2008-06-25
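dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering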
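
Illustrative note (not part of the original record): the abstract describes locating duplication and deletion events separately in linear time. The sketch below is only a minimal stand-in for that idea, assuming the input is a list of per-probe log ratios along one chromosome. It uses a plain maximum-sum segment scan, not the thesis's algorithm and not the algorithm of Bernholt et al., and the function names max_sum_segment and gain_and_loss_segments are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, int, int]  # (score, start index, end index inclusive)


def max_sum_segment(values: List[float]) -> Segment:
    """Return the non-empty contiguous segment with the largest sum.

    Single left-to-right pass, so the running time is linear in len(values).
    """
    if not values:
        raise ValueError("values must be non-empty")
    best_score, best_start, best_end = values[0], 0, 0
    run_score, run_start = values[0], 0
    for i in range(1, len(values)):
        if run_score < 0:
            # A negative running sum can only hurt; restart the segment here.
            run_score, run_start = values[i], i
        else:
            run_score += values[i]
        if run_score > best_score:
            best_score, best_start, best_end = run_score, run_start, i
    return best_score, best_start, best_end


def gain_and_loss_segments(log_ratios: List[float]) -> Tuple[Segment, Segment]:
    """Find one candidate gain and one candidate loss segment separately.

    Assumption for illustration: positive log ratios suggest duplication and
    negative ones suggest deletion. The loss segment is found by negating
    the data and reusing the same scan.
    """
    gain = max_sum_segment(log_ratios)
    neg_score, start, end = max_sum_segment([-x for x in log_ratios])
    loss = (-neg_score, start, end)
    return gain, loss
```

For example, gain_and_loss_segments([0.25, -0.5, 1.0, 1.25, 0.75, -0.25, 0.0, -1.0, -1.5, 0.5]) returns ((3.0, 2, 4), (-2.75, 5, 8)): probes 2 to 4 form the strongest gain and probes 5 to 8 the strongest loss. The sketch assigns no significance to either segment and imposes no length constraint, both of which the abstract attributes to the thesis's framework.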

Files in This Item:

File	Size	Format
ntu-97-1.pdf (not authorized for public access)	3.01 MB	Adobe PDF


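Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.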