Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23041
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 莊曜宇(Eric Y. Chuang) | |
dc.contributor.author | Chien-Yueh Lee | en |
dc.contributor.author | 李建樂 | zh_TW |
dc.date.accessioned | 2021-06-08T04:39:02Z | - |
dc.date.copyright | 2009-08-18 | |
dc.date.issued | 2009 | |
dc.date.submitted | 2009-08-14 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23041 | - |
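dc.description.abstract | MicroRNAs are short non-coding RNAs, about 22 nucleotides long on average, that repress protein translation of their target genes in animals. Predicting the target genes of animal microRNAs is nonetheless difficult, because most microRNAs are only imperfectly complementary to their targets. To improve target prediction, we describe a novel microRNA target-gene prediction algorithm. We first build a model of the kind commonly used in earlier prediction work, which computes sequence alignment scores by searching for complementary sequences and uses a thermodynamic approach to calculate the free energy between a microRNA and its target gene. In addition, we introduce a well-known machine learning method, the hidden Markov model (HMM), as the decision tool for the final result of the prediction model. Because of its inherent limitation, however, an HMM cannot take the information of the whole sequence into account. The proposed algorithm therefore introduces an idea that goes beyond the limitation of the conventional HMM: forward and backward HMMs are used together, so that for any element of the sequence, information from both the preceding element and the following element is available at the same time.
This thesis examines the maximum sensitivity, specificity, and overall accuracy achieved by different combinations of these models. We also designed two validation schemes to assess the quality of the predictions: one uses a set of genes predicted by other published algorithms, and the other uses a set of down-regulated genes identified from microarray data. According to the experimental results, the maximum sensitivity, specificity, and overall accuracy of the algorithm are 84.25%, 96.78%, and 96.67%, respectively. In the comparisons with other algorithms and with the down-regulated genes, the overlap rates are 52.42% and 70.37%, respectively. | zh_TW |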
dc.description.abstract | MicroRNAs (miRNAs) are short non-coding RNAs, about 22 nucleotides long, that play important regulatory roles in animals through translational repression. Predicting their targets in animals is nevertheless difficult, because the complementarity between microRNAs and mRNAs is largely imperfect. To improve prediction performance, we propose a novel microRNA target-gene prediction algorithm that combines several conventional prediction models: sequence complementarity searching, which yields alignment scores, and thermodynamic stability approaches, which assign a folding free energy to each microRNA-target interaction. In addition, the algorithm includes a hidden Markov model (HMM), a well-known machine learning approach, to support the prediction decision. Because of its inherent limitation, however, an HMM cannot take the global information of a sequence into account. To overcome this limitation, the proposed algorithm uses forward and backward HMMs simultaneously, so that information about any element of a microRNA-target interaction can be passed to other elements in both directions.
This thesis reports the highest sensitivity, specificity, and overall accuracy obtained with different combinations of the proposed models. It also uses genes predicted by existing prediction algorithms and down-regulated genes from microarray data to validate the proposed algorithm. According to the simulation results, the sensitivity, specificity, and overall accuracy of the complete prediction model are 84.25%, 96.78%, and 96.67%, respectively. The overlap rates between the genes predicted by the proposed algorithm and those from other existing prediction algorithms and from the down-regulated microarray results are 52.42% and 70.37%, respectively. | en |
dc.description.provenance | Made available in DSpace on 2021-06-08T04:39:02Z (GMT). No. of bitstreams: 1 ntu-98-R96945005-1.pdf: 636999 bytes, checksum: 77d58946f0574bb8d5d94097c1306157 (MD5) Previous issue date: 2009 | en |
dc.description.tableofcontents | Chapter 1 Introduction and Background
1.1 Introduction
1.1.1 Motivation
1.1.2 The Specific Aims
1.2 Background
1.2.1 microRNAs
1.2.2 microRNA Biogenesis
1.2.3 Post-transcriptional Repression by microRNAs
1.2.4 The Hidden Markov Model
1.2.5 The Viterbi Algorithm
Chapter 2 Literature Review
2.1 The Sequence Complementary Searching and Thermodynamic Stability Approaches
2.1.1 TargetScan
2.1.2 miRanda
2.1.3 RNAhybrid
2.1.4 DIANA-microT
2.1.5 PicTar
2.2 Machine Learning Approaches
2.2.1 miTarget
2.2.2 MiRTif
2.2.3 GenMiR++
Chapter 3 Materials and Methods
3.1 Sequence Retrieval
3.1.1 microRNA Sequences
3.1.2 3’ UTR Sequences
3.2 Experimentally Validated microRNA Target Sets
3.2.1 TarBase
3.2.2 miRecords
3.3 The Proposed System Flow
3.3.1 Conversion of Identical Gene Symbols
3.3.2 Seed Region Pairing and Segmentation
3.3.3 Improved Position-weighted Global Alignment Algorithm
3.3.4 Calculation of ΔG Values
3.3.5 Evaluation of Thresholds and the α, β Parameters
3.3.6 Profile Hidden Markov Model
3.3.6.1 Construction of Predictive HMMs
3.3.6.2 Expectation Values and HMMER Bit Scores
3.3.6.3 Building Predictive Models
3.3.7 Calculation of Accuracy and the Confusion Matrix
3.4 Implementation Environment
Chapter 4 Results
4.1 Datasets Selection
4.2 Build Predictive Models
4.2.1 Sequence Complementary Searching and Thermodynamic Stability Approaches
4.2.2 Only the Forward or Backward HMM
4.2.3 Only the Forward or Backward HMM with Sequence Complementary Searching and Thermodynamic Stability Approaches
4.2.4 The Forward and Backward HMMs with Sequence Complementary Searching and Thermodynamic Stability Approaches
4.3 Comparing the Overlap Genes with Existing Prediction Algorithms
4.4 Supporting Evidence from Microarray Data
Chapter 5 Discussions
Chapter 6 Conclusion
References | |
dc.language.iso | en | |
dc.title | 使用隱藏馬可夫模型預測microRNA之目標基因 | zh_TW |
dc.title | Prediction of microRNA Target Genes Using a Hidden Markov Model | en |
dc.type | Thesis | |
dc.date.schoolyear | 97-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 賴亮全(Liang-Chuan Lai),蔡孟勳(Mong-Hsun Tsai),高成炎(Cheng-Yan Kao) | |
dc.subject.keyword | microRNA, hidden Markov model, target gene prediction, sequence alignment | zh_TW |
dc.subject.keyword | microRNA, miRNA, hidden Markov model, HMM, microRNA target, target gene prediction, sequence alignment | en |
dc.relation.page | 65 | |
dc.rights.note | Not authorized | |
dc.date.accepted | 2009-08-14 | |
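dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | zh_TW |
dc.contributor.author-dept | Graduate Institute of Biomedical Electronics and Bioinformatics (生醫電子與資訊學研究所) | zh_TW |
Appears in Collections: | Graduate Institute of Biomedical Electronics and Bioinformatics (生醫電子與資訊學研究所) |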
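
The abstract reports sensitivity, specificity, and overall accuracy for the prediction models. As a minimal sketch of how these three figures follow from a confusion matrix, the Python snippet below uses the standard definitions. The class name `ConfusionMatrix` and all counts are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts of predicted microRNA-target pairs against a validated set."""

    tp: int  # validated targets predicted as targets
    fn: int  # validated targets that were missed
    tn: int  # non-targets correctly rejected
    fp: int  # non-targets predicted as targets

    def sensitivity(self) -> float:
        # Fraction of true targets that were recovered.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of non-targets that were correctly rejected.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Fraction of all pairs that were classified correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# Hypothetical counts, for illustration only (not results from the thesis).
cm = ConfusionMatrix(tp=80, fn=20, tn=950, fp=50)
print(f"sensitivity = {cm.sensitivity():.2%}")
print(f"specificity = {cm.specificity():.2%}")
print(f"accuracy    = {cm.accuracy():.2%}")
```

When negatives far outnumber positives, as in these made-up counts, overall accuracy sits much closer to specificity than to sensitivity.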
Files in This Item:
File | Size | Format | |
---|---|---|---|
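ntu-98-1.pdf (not currently authorized for public access) | 622.07 kB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.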