Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51972

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 黃乾綱(Chien-Kang Huang) | |
| dc.contributor.author | Hsi Chen | en |
| dc.contributor.author | 陳熙 | zh_TW |
| dc.date.accessioned | 2021-06-15T14:00:29Z | - |
| dc.date.available | 2017-08-21 | |
| dc.date.copyright | 2015-08-21 | |
| dc.date.issued | 2015 | |
| dc.date.submitted | 2015-08-20 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51972 | - |
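| dc.description.abstract | In biomedical research, finding relevant scientific literature and mapping researchers' previously accumulated private biomedical experimental data to that literature is an important task. This study proposes a relevance ranking algorithm that computes a relevance score between literature abstracts and gene identifiers and ranks the results by that score. The gene identifiers are then used to link to the private biomedical experimental data, retrieving the corresponding experimental values and presenting them as charts in the search results interface. The aim is to let users see their private experimental data while searching the literature, so that they can learn how the data relate to the retrieved documents and go on to discover researchable topics or more valuable information. | zh_TW |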
| dc.description.abstract | In biomedical research, mapping researchers' proprietary experimental data to the public research literature is an important task. This study presents a relevance ranking algorithm that calculates a relevance score between literature abstracts and locus names and sorts the results accordingly. The locus names are then linked to the researchers' proprietary experimental databases, and web techniques are used to plot charts of the relevant proprietary data automatically. Through this process, researchers can efficiently see how their proprietary data relate to public papers, which can also help them identify further research topics. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T14:00:29Z (GMT). No. of bitstreams: 1 ntu-104-R02525067-1.pdf: 4156301 bytes, checksum: 5e76a603b8279aa445aada35751b7611 (MD5) Previous issue date: 2015 | en |
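| dc.description.tableofcontents | Acknowledgements; Abstract (Chinese); Abstract (English); Contents; List of Figures; List of Tables; Chapter 1 Introduction: 1.1 Research Background; 1.2 Research Motivation; 1.3 Research Process; Chapter 2 Related Work: 2.1 Named Entity Recognition and Normalization; 2.1.1 Named Entity Recognition; 2.1.2 Named Entity Normalization; 2.2 Public Databases; 2.2.1 PubMed Literature Database; 2.2.2 TAIR Genome Database; 2.3 Proprietary Experimental Databases; 2.3.1 Next-Generation Sequencing; 2.3.2 Gene Expression Levels; 2.4 Linking the Retrieval System to Proprietary Experimental Databases; 2.4.1 Module 1: Literature Retrieval; 2.4.2 Module 2: Protein Name Recognition; 2.4.3 Module 3: Name Normalization; 2.4.4 Module 4: Linking to the Proprietary Database; Chapter 3 Problem Discussion and Algorithm Design: 3.1 Problem Discussion; 3.1.1 Locus Name Not Queried; 3.1.2 Symbol and Full Name Mixed in One Query; 3.1.3 Symbol Appearing Within a Full Name; 3.1.4 Partial Matches of a Full Name; 3.2 Data Model; 3.2.1 PubMed Literature and Named Entities; 3.2.2 Named Entities and Gene Identifiers; 3.2.3 Gene Identifiers and Experimental Data; 3.2.4 Data Model; 3.3 Design of the Relevance Algorithm; 3.3.1 Preprocessing of Named Entities; 3.3.2 Processing Rules and Scores for Locus Names; 3.3.3 Processing Rules and Scores for Symbols; 3.3.4 Processing Rules and Scores for Full Names; 3.3.5 Discussion of the Scoring Formula; Chapter 4 System Design and Experiments: 4.1 System Design; 4.1.1 Data Collection; 4.1.2 Data Indexing and Search System; 4.1.3 Named Entity Recognition Tool; 4.1.4 Proprietary Experimental Database; 4.2 Experimental Data; 4.2.1 Selection of Experimental Data; 4.2.2 Weight Calculation; 4.2.3 Distribution of the Experimental Data; 4.3 Validation Method and Results; 4.3.1 Validation Method; 4.3.2 Accuracy of the Relevance Algorithm; 4.3.3 Discussion of Improvements in Real Cases; Chapter 5 Summary: 5.1 Conclusions; 5.2 Future Work; References | |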
| dc.language.iso | zh-TW | |
| dc.subject | 文獻搜尋 | zh_TW |
| dc.subject | 關聯度排序 | zh_TW |
| dc.subject | 生物資訊 | zh_TW |
| dc.subject | Relevance ranking | en |
| dc.subject | Bioinformatics | en |
| dc.subject | Literature search | en |
| dc.title | 文獻與私有生醫實驗資料連結之關聯度排序演算法 | zh_TW |
| dc.title | Relevance Ranking Algorithm for Linking Literatures and Proprietary Experimental Biomedical Data | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 103-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 劉力瑜(Li-yu Liu),林詩舜(Shih-Shun Lin),張恆華(Heng-Hua Chang) | |
| dc.subject.keyword | 關聯度排序, 生物資訊, 文獻搜尋 | zh_TW |
| dc.subject.keyword | Relevance ranking, Bioinformatics, Literature search | en |
| dc.relation.page | 56 | |
| dc.rights.note | Paid license | |
| dc.date.accepted | 2015-08-20 | |
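| dc.contributor.author-college | College of Engineering | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Engineering Science and Ocean Engineering | zh_TW |
| Appears in Collections: | Department of Engineering Science and Ocean Engineering | |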

Files in this item:

| File | Size | Format | |
|---|---|---|---|
| ntu-104-1.pdf (not authorized for public access) | 4.06 MB | Adobe PDF | |

Unless their copyright terms are specifically stated, all items in the system are protected by copyright, with all rights reserved.
