Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/47583
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 黃乾綱(Chien-Kang Huang) | |
dc.contributor.author | Chun-Yi Huang | en |
dc.contributor.author | 黃駿逸 | zh_TW |
dc.date.accessioned | 2021-06-15T06:07:08Z | - |
dc.date.available | 2012-08-18 | |
dc.date.copyright | 2010-08-18 | |
dc.date.issued | 2010 | |
dc.date.submitted | 2010-08-13 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/47583 | - |
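dc.description.abstract | MicroRNAs (miRNAs) are very short non-coding RNAs, about 21 to 23 nucleotides long, that strongly influence gene regulation and organismal development. In recent years researchers have often used ab initio methods and machine learning techniques to predict miRNA precursors (pre-miRNAs), and these methods have reported good performance because they can make predictions without sequence evolution information. However, these earlier prediction algorithms were all evaluated on particular sets of sequences, so it is not known whether their performance is equally good in an actual whole-chromosome scan.
In this thesis we benchmark prediction algorithms on whole-chromosome scanning, aiming to measure their performance on this task and to find a prediction tool suited to it. We propose a systematic sampling method that selects a smaller number of samples from the population such that the distributions of minimum free energy and number of paired bases in the sample resemble those of the population. By estimating a predictor's performance on this small sample, its performance on the population can be inferred. We use this systematic sampling method to build a negative dataset, in order to obtain the performance of various prediction algorithms in an actual whole-chromosome scan. Finally, we selected five prediction algorithms for benchmarking. For discovering miRNAs by whole-chromosome scanning, Triplet-SVM has the best specificity on multi-loop structures, MiPred is better suited to single-loop sequences with higher minimum free energy, and miR-KDE is suited to single-loop sequences with lower minimum free energy. | zh_TW |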
dc.description.abstract | MicroRNAs (miRNAs) are short non-coding RNAs (~21–23 nucleotides) that participate in post-transcriptional regulation of gene expression. Many efforts have been made over the years to discover miRNA precursors (pre-miRNAs). Recently, ab initio approaches have attracted more attention than comparative approaches because they do not rely on sequence alignment and can discover species-specific pre-miRNAs.
Because systematically identifying miRNAs in a genome with existing experimental techniques is difficult, computational methods are a key factor in miRNA discovery. However, the success of ab initio approaches has not been well evaluated or extended to genome-wide miRNA discovery. In this study, a systematic analysis is performed to determine the theoretical sampling rate that makes the evaluation statistically significant. Furthermore, we propose an approach to reduce the negative set and successfully generate a compact set that is smaller than the theoretical size yet still yields accurate performance evaluation. Considering that several negative sets are already in widespread use, this study also proposes a mathematical model that estimates realistic performance from results obtained with biased datasets. Finally, five pre-miRNA predictors are re-evaluated on the proposed benchmarks. The experimental results show that the proposed benchmarks can help researchers understand and compare the realistic performance of alternative methods. | en |
dc.description.provenance | Made available in DSpace on 2021-06-15T06:07:08Z (GMT). No. of bitstreams: 1 ntu-99-R97525076-1.pdf: 2203707 bytes, checksum: 6d347ae0c0de201df669b2c85ea5a67d (MD5) Previous issue date: 2010 | en |
dc.description.tableofcontents | Acknowledgements
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
CHAPTER 1 Introduction
CHAPTER 2 Related Work
2.1 MicroRNA
2.1.1 Characteristics of microRNAs
2.1.2 MicroRNA biogenesis
2.1.3 Mechanism of action of microRNAs
2.1.4 The microRNA database miRBase
2.2 Methods for predicting microRNAs
2.2.1 Analysis methods based on sequence similarity
2.2.2 Ab initio methods that do not rely on sequence evolution information
2.3 Secondary structure prediction algorithms
2.4 Workflow for predicting pre-miRNAs
2.5 Overview of prediction algorithms
2.5.1 Triplet-SVM
2.5.2 miPred
2.5.3 MiPred
2.5.4 miR-KDE
2.5.5 microPred
2.6 Dataset preparation
2.7 Kolmogorov-Smirnov one-sample test
CHAPTER 3 Methods
3.1 Problem definition
3.2 Evaluation criteria
3.3 Generating the negative dataset for whole-chromosome scanning
3.3.1 Systematic method and choice of parameters
3.3.2 Comparative analysis of Pseudo-8494 and whole-chromosome scan sequences
3.3.3 Random sampling
3.3.4 Systematic sampling
3.4 Generating positive samples for the test set
3.5 Training-set settings for the prediction algorithms
CHAPTER 4 Experimental Results and Discussion
4.1 Prediction results on human pre-miRNAs
4.2 Prediction results on whole-chromosome sequences
4.3 Prediction performance of different classifiers in each block
4.4 Comparison with performance on pseudo-8494
CHAPTER 5 Conclusion
5.1 Conclusion
5.2 Future work
References | |
dc.language.iso | zh-TW | |
dc.title | 微型核醣核酸預測工具對基因體實際預測效能之研究 | zh_TW |
dc.title | Towards Realistic Benchmarks for MicroRNA Precursor Discovery Algorithms | en |
dc.type | Thesis | |
dc.date.schoolyear | 98-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 張天豪(Tien-Hao Chang),歐陽彥正(Yan-Jen Oyang),張瑞益(Ray-I Chang) | |
dc.subject.keyword | microRNA, chromosome scan, systematic sampling, prediction algorithm, performance evaluation | zh_TW |
dc.subject.keyword | microRNA, whole genome scan, systematic sampling, predictor, performance benchmark | en |
dc.relation.page | 59 | |
dc.rights.note | Paid license | |
dc.date.accepted | 2010-08-15 | |
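dc.contributor.author-college | College of Engineering | zh_TW |
dc.contributor.author-dept | Graduate Institute of Engineering Science and Ocean Engineering | zh_TW |
Appears in Collections: | Department of Engineering Science and Ocean Engineering |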
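
The sampling idea described in the abstract (drawing a small negative set whose minimum free energy and paired-base distributions resemble those of the full population) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the thesis's actual procedure: the function names `stratified_sample` and `ks_statistic`, the bin widths, the 10% sampling rate, and the synthetic population are all hypothetical. The record lists a one-sample Kolmogorov-Smirnov test; this sketch instead computes a simple two-sample KS statistic by hand to compare sample and population.

```python
import random
from collections import defaultdict


def stratified_sample(population, rate, mfe_bin=5.0, pair_bin=5, seed=0):
    """Draw about `rate` (0 < rate <= 1) of each (MFE bin, paired-base bin) cell.

    `population` is a list of (mfe, paired_bases) tuples. The bin widths are
    illustrative assumptions, not values taken from the thesis.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for mfe, pairs in population:
        cells[(int(mfe // mfe_bin), pairs // pair_bin)].append((mfe, pairs))
    sample = []
    for key in sorted(cells):
        members = cells[key]
        # Proportional allocation, keeping at least one item per non-empty cell.
        k = max(1, round(len(members) * rate))
        sample.extend(rng.sample(members, k))
    return sample


def ks_statistic(xs, ys):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        v = min(xs[i], ys[j])
        while i < len(xs) and xs[i] == v:
            i += 1
        while j < len(ys) and ys[j] == v:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d


# Synthetic stand-in for genome-scan hairpins: (MFE in kcal/mol, paired bases).
rng = random.Random(42)
population = []
for _ in range(5000):
    mfe = rng.gauss(-25.0, 8.0)
    pairs = max(0, round(-0.9 * mfe + rng.gauss(0.0, 3.0)))
    population.append((mfe, pairs))

sample = stratified_sample(population, rate=0.1)
ks_mfe = ks_statistic([m for m, _ in sample], [m for m, _ in population])
ks_pairs = ks_statistic([p for _, p in sample], [p for _, p in population])
print(len(sample), ks_mfe, ks_pairs)
```

A small KS statistic on both features suggests the reduced set follows the population's distributions, which is the property a compact negative set needs if performance measured on it is to be extrapolated to a whole-chromosome scan.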
Files in this item:
File | Size | Format | |
---|---|---|---|
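ntu-99-1.pdf (not currently authorized for public access) | 2.15 MB | Adobe PDF |
Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.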