Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62088
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 趙坤茂 (Kun-Mao Chao)
dc.contributor.author | Pei-Chen Peng | en
dc.contributor.author | 彭姵晨 | zh_TW
dc.date.accessioned | 2021-06-16T13:27:05Z | -
dc.date.available | 2016-07-31
dc.date.copyright | 2013-07-31
dc.date.issued | 2013
dc.date.submitted | 2013-07-23
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62088 | -
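dc.description.abstract | Gene expression and genotype data have grown exponentially in recent years. To make broad use of such data across studies, integrated analysis of genetical genomics data and the identification of significantly expressed genes have become the current trend.
This study proposes a simple workflow for analyzing genetical genomics data: it first integrates gene expression and genotype data, then performs feature selection with the Random Forest algorithm. The workflow is applied to two tasks: inferring gene regulatory networks and predicting complex phenotypes. Fifteen gene regulatory networks, each containing one thousand genes, are inferred, and the results are evaluated by the area under the receiver operating characteristic curve and the area under the precision-recall curve. For complex phenotype prediction, the workflow is used to predict disease resistance in soybean, and the predictions are evaluated with Spearman's rank correlation coefficient. Experimental results show that the workflow outperforms other methods on both simulated and real genetical genomics data. Integrating gene expression and genotype data is the key step in analyzing genetical genomics data. In addition, the Random Forest algorithm is an ideal method for finding significantly expressed genes. | zh_TW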
dc.description.abstract | The amount of gene expression profiling and genotype data has grown exponentially. To apply these data in a wide range of studies, integrated analysis of genetical genomics data has become a trend, and identifying the genes relevant to a specific response is therefore an essential issue. We propose a simple workflow for genetical genomics data analysis that integrates genotype and gene expression data and then applies Random Forest feature selection. The proposed workflow is used in two applications: gene regulatory network inference and complex phenotype prediction. Fifteen different gene networks, each composed of one thousand genes, are reconstructed, and inference performance is measured by the area under the Receiver Operating Characteristic curve and the area under the Precision-Recall curve. In the other application, the disease susceptibility of soybean plants is predicted, and the predictions are evaluated with Spearman's rank correlation coefficient. Results show that our method outperforms other methods on both simulated and real genetical genomics data. Integration of genotype and gene expression data is a pivotal step in genetical genomics data analysis, and Random Forest is an ideal way to find relevant genes for further applications. | en
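Two of the evaluation measures named in the abstract, the area under the ROC curve for a ranked edge list and Spearman's rank correlation for phenotype predictions, can be illustrated with a small sketch. This is not the thesis's code: the function names (`average_ranks`, `spearman`, `auroc`) and the toy inputs are assumptions made for illustration, and the area under the precision-recall curve is left out for brevity.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted, observed):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(observed))


def auroc(scores, labels):
    """AUROC via the Mann-Whitney rank-sum identity.

    scores: confidence for each candidate edge (higher = more likely real).
    labels: 1 if the edge is in the reference network, else 0.
    """
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, label in zip(ranks, labels) if label == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    # Hypothetical toy data: four candidate edges, two of them true.
    print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
    # Hypothetical predicted vs. observed phenotype values.
    print(spearman([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]))
```

Both measures depend only on the ordering of the scores, which suits a workflow whose output is a ranking (of candidate edges or of predicted phenotype values) rather than calibrated probabilities.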
dc.description.provenance | Made available in DSpace on 2021-06-16T13:27:05Z (GMT). No. of bitstreams: 1
ntu-102-R00922010-1.pdf: 1119431 bytes, checksum: 7745983f7a61b52e7e245bfde7f3f338 (MD5)
Previous issue date: 2013 | en
dc.description.tableofcontents | Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract
1 Introduction
  1.1 Feature Selection in Bioinformatics
  1.2 Gene Regulatory Network
  1.3 Complex Phenotype Prediction
  1.4 Workflow of Genetical Genomics Data Analysis and Its Applications
2 Genetical Genomics Data Analysis
  2.1 Genetical Genomics
  2.2 Integration of Genotype and Gene Expression Data
  2.3 Feature Selection by Random Forest
3 Gene Regulatory Network Inference
  3.1 Genetical Genomics Data Collection
  3.2 Method
    3.2.1 Network Inference as a Feature Selection Problem
    3.2.2 Genetical Genomics Data Analysis
    3.2.3 Ranked Edge List
  3.3 Results
    3.3.1 Evaluation of Performance
    3.3.2 Assessment of Each Network Inference
4 Complex Phenotype Prediction
  4.1 Genetical Genomics Data Collection
  4.2 Method
    4.2.1 Genetical Genomics Data Analysis
    4.2.2 Support Vector Regression
  4.3 Results
    4.3.1 Evaluation of Performance
    4.3.2 Assessment of Predictions
5 Concluding Remarks
  5.1 Genetical Genomics Data Analysis Performs Well in Simulated and Real Data
  5.2 Integration of Genotype and Expression Data is Pivotal
  5.3 Random Forest is Suitable for Genetical Genomics Data Feature Selection
Bibliography
dc.language.iso | en
dc.subject | 遺傳基因體 (genetical genomics) | zh_TW
dc.subject | 基因調控網路推導 (gene regulatory network inference) | zh_TW
dc.subject | 複雜表現型預測 (complex phenotype prediction) | zh_TW
dc.subject | Genetical genomics | en
dc.subject | Gene regulatory network reconstruction | en
dc.subject | Complex phenotype prediction | en
dc.title | 以遺傳基因體資料推導基因調控網路及預測複雜表現型 | zh_TW
dc.title | Gene Regulatory Network Inference and Complex Phenotype Prediction from Genetical Genomics Data | en
dc.type | Thesis
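dc.date.schoolyear | 101-2
dc.description.degree | Master's
dc.contributor.oralexamcommittee | 王弘倫 (Hung-Lung Wang), 陳怡靜 (Yi-Ching Chen)
dc.subject.keyword | 遺傳基因體 (genetical genomics), 基因調控網路推導 (gene regulatory network inference), 複雜表現型預測 (complex phenotype prediction) | zh_TW
dc.subject.keyword | Genetical genomics, Gene regulatory network reconstruction, Complex phenotype prediction | en
dc.relation.page | 47
dc.rights.note | Paid license
dc.date.accepted | 2013-07-23
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW
dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW
Appears in collections: Department of Computer Science and Information Engineering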

Files in this item:
File | Size | Format
ntu-102-1.pdf (not authorized for public access) | 1.09 MB | Adobe PDF
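Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.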
