Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/25706

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 翁昭旼 (Jau-Min Wong), 蔣以仁 (I-Jen Chiang) | |
| dc.contributor.author | You-Sheng Li | en |
| dc.contributor.author | 李祐陞 | zh_TW |
| dc.date.accessioned | 2021-06-08T06:25:50Z | - |
| dc.date.copyright | 2006-07-31 | |
| dc.date.issued | 2006 | |
| dc.date.submitted | 2006-07-27 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/25706 | - |
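| dc.description.abstract | The co-occurrence of heterogeneous data in documents gives rise to complex structures, and explaining the relationships among them has long been a problem that many researchers want to solve. With the arrival of the Internet era in particular, most people are strongly drawn to the convenience and speed of the network, and the Internet has gradually become the main channel for finding and sharing information. As a result, the volume of electronic text has surged, with literature, web pages, news, and corporate documents all growing exponentially, so managing these large document collections effectively has become an important issue.
The main purpose of this thesis is to develop an automatic biomedical literature clustering system that gathers knowledge on similar domain topics from scattered literature, helping users understand its knowledge structure quickly and effectively when facing a huge body of medical literature. In this thesis we use association rules to implement the Clique Percolation Method and Simplex concepts, and finally compare them with Literature Clustering Search on two document classification benchmarks, Reuters-21578 and OHSUMED, evaluating the differences in Precision, Recall, Normalized Mutual Information, and Pairwise Testing. | zh_TW |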
| dc.description.abstract | The co-occurrence of items in data induces complex structures, and many researchers have tried to uncover and explain them. However, heterogeneity makes the data hard to analyze. With the arrival of the Internet era, most people have been drawn to the convenience and speed of the Internet, which has gradually become the main channel for searching for and sharing information. This has brought a large increase in electronic text: the number of publications, web pages, news reports, and business documents is growing exponentially. How to manage this large volume of text effectively has therefore become a crucial issue.
This thesis aims to develop automatic biomedical literature clustering systems and compare them. The goal is to group scattered documents automatically by topic and field, so that users facing a large body of medical literature can grasp its knowledge structure and content quickly and effectively. We use association rules to implement the clique percolation method and the simplex concept. We then compare these approaches with Literature Clustering Search on two text categorization benchmarks, Reuters-21578 and OHSUMED, examining the differences in precision, recall, normalized mutual information, and pairwise testing. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-08T06:25:50Z (GMT). No. of bitstreams: 1 ntu-95-R93548054-1.pdf: 1737025 bytes, checksum: 22071650af77908ccb7ca9087a4c2db8 (MD5) Previous issue date: 2006 | en |
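| dc.description.tableofcontents | Chinese Abstract I
English Abstract II
Acknowledgements III
Table of Contents IV
List of Tables VI
List of Figures VIII
Chapter 1 Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 2
1.3 Research Materials and Sources 3
1.4 Thesis Organization 4
Chapter 2 Related Work 6
2.1 Small-World Model 6
2.2 Association Analysis 8
2.3 Literature Reference Graph 9
2.4 Cluster Analysis 11
2.4.1 Document Similarity 11
2.4.2 Partitional Clustering Algorithms 12
2.4.3 Hierarchical Clustering Algorithms 13
2.4.4 Self-Organizing Map 14
2.4.5 Clustered Search 14
2.5 Topological Structure 16
Chapter 3 Basic Definitions and Methods 17
3.1 Document Extraction 17
3.1.1 Parser 18
3.1.2 Keyword Extraction 19
3.1.3 Keyword Weight Calculation 20
3.1.4 Association Rules 21
3.2 Simplex Topological Structure 23
3.3 Graph Theory 25
Chapter 4 Visual Clustering System 30
4.1 CPM Community Map 30
4.2 Simplex Community Map 32
Chapter 5 Evaluation and Discussion 33
5.1 Classification Benchmarks 33
5.1.1 Reuters-21578 33
5.1.2 OHSUMED 34
5.2 Evaluation Methods 36
5.3 Experimental Design 39
5.4 Evaluation Results 39
5.5 Time Evaluation 55
5.6 Discussion 56
Chapter 6 Conclusion 58
References 59 | |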
| dc.language.iso | zh-TW | |
| dc.subject | 社群 | zh_TW |
| dc.subject | 文字探勘 | zh_TW |
| dc.subject | 群聚分析 | zh_TW |
| dc.subject | Cluster Analysis | en |
| dc.subject | Text mining | en |
| dc.subject | Community | en |
| dc.title | 生醫文獻自動化分群系統與評估 | zh_TW |
| dc.title | Automatic Biomedical Literature Clustering System and Evaluations | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 94-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 陳中明 | |
| dc.subject.keyword | 群聚分析, 社群, 文字探勘 | zh_TW |
| dc.subject.keyword | Cluster Analysis, Community, Text mining | en |
| dc.relation.page | 62 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2006-07-28 | |
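| dc.contributor.author-college | 工學院 (College of Engineering) | zh_TW |
| dc.contributor.author-dept | 醫學工程學研究所 (Graduate Institute of Biomedical Engineering) | zh_TW |
| Appears in collections: | 醫學工程學研究所 (Graduate Institute of Biomedical Engineering) | |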
Files in this item:
| File | Size | Format | |
|---|---|---|---|
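| ntu-95-1.pdf (not authorized for public access) | 1.7 MB | Adobe PDF | |
Unless their copyright terms are specifically stated, all items in the system are protected by copyright, with all rights reserved.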
