Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59327

Full metadata record
| DC field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 翁昭旼 | |
| dc.contributor.author | Jie-Ying Lin | en |
| dc.contributor.author | 林杰穎 | zh_TW |
| dc.date.accessioned | 2021-06-16T09:20:35Z | - |
| dc.date.available | 2017-07-21 | |
| dc.date.copyright | 2017-07-21 | |
| dc.date.issued | 2017 | |
| dc.date.submitted | 2017-06-30 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59327 | - |
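| dc.description.abstract | The main purpose of this research is to give researchers, before they begin a study on a medical big-data database, a tool that performs automated knowledge discovery on that database, supported by visualization tools that help them quickly understand the knowledge it contains. Research that uses medical big data usually addresses a single topic and uses only a small portion of the data, and so does not exploit the advantages of big data. Taking the case-control study as an example, this research uses statistical methods from such studies, such as the log-rank test, to analyze a large volume of medical visit records automatically.
After the automated analysis completes, the results are presented on a web page through several visualization tools, so that researchers can browse and inspect the knowledge in this medical big data. Taking the relationship between hepatitis C and interferon as an example, the research mines, for nearly 1,300 patients who received the drug and 14,000 patients who did not, the case-control results for every disease they were later diagnosed with, along with the corresponding Kaplan-Meier curves and log-rank test p-values. Visualization tools such as the Sankey diagram and treemap give researchers a complete knowledge-browsing system, allowing them to understand the knowledge contained in the data more deeply before starting their research. Because historical medical big data may require many preprocessing steps, distributed computing plays an important role in mining it. Future work includes applying other medical statistical methods and data mining algorithms to this automated analysis system, and building a knowledge database that consolidates the results for secondary analysis and other uses. | zh_TW |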
| dc.description.abstract | This research aims to provide an automated knowledge discovery tool that lets researchers mine a medical big-data database before they begin a study on it, together with a visualization interface through which they can easily browse the dataset's features and the knowledge mined in the previous step. Researchers who use medical big data usually focus on a narrow topic and use only a small subset of the data, so they do not take advantage of big data analysis. Taking the case-control study design as an example, this research applies statistical methods such as the log-rank test to analyze a large volume of medical records automatically.
After the automated analysis completes, the system presents the results on a web client through several visualization tools, providing a simple interface for validating and browsing the results and potential knowledge. Using the relationship between hepatitis C and interferon as an example, it produces case-control results for every related disease among 1,300 hepatitis C patients treated with interferon and 14,000 hepatitis C patients not treated with interferon, including the p-values from the log-rank test and the Kaplan-Meier curves. Visualizations such as the Sankey diagram and treemap give researchers a complete knowledge-browsing system, so they can better understand this large dataset before working with it in depth. Because historical medical big data may require extensive preprocessing before analysis, distributed computing plays an important role in mining this kind of dataset. Future work includes applying other statistical methods and data mining algorithms within the automated analysis process, and building an intermediate knowledge database for further analysis and future integration. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-16T09:20:35Z (GMT). No. of bitstreams: 1 ntu-106-R04548044-1.pdf: 8539619 bytes, checksum: 0a699bfd67062197e2d59c9ab03bae0d (MD5) Previous issue date: 2017 | en |
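| dc.description.tableofcontents | Acknowledgements; Abstract (Chinese); Abstract; Table of Contents; List of Figures; List of Tables
Chapter 1 Introduction; 1-1 Research Background and Motivation; 1-2 Research Objectives; 1-3 Research Process; Chapter 2 Research Materials and Literature Review; 2-1 Case-Control Studies; 2-2 National Health Insurance Research Database; Chapter 3 Research Methods; 3-1 Research Process; 3-2 Knowledge Discovery; Chapter 4 System Design; 4-1 System Environment; 4-2 NoSQL; 4-3 MongoDB; 4-4 D3 (Data-Driven Documents); 4-5 Cache System; 4-6 System Architecture; Chapter 5 Implementation; 5-1 Data Preprocessing; 5-1-1 National Health Insurance Research Database; 5-2 Sharding; 5-2-1 Purpose of Sharding; 5-2-2 MongoDB Sharding Mechanism; 5-3 MapReduce; 5-4 Data Presentation; 5-4-1 Sankey Diagram; 5-4-2 Disease Node Panel; 5-4-3 Healthcare Institution Difference Chart; 5-4-4 Treemap; Chapter 6 System Demonstration; 6-1 System Overview; 6-2 Home Page; 6-3 Results Presentation; 6-3-1 Sankey Diagram; 6-3-2 Disease Node Panel; 6-3-3 Healthcare Institution Transition Chart; 6-3-4 Treemap; Chapter 7 Conclusions and Future Work; References | |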
| dc.language.iso | zh-TW | |
| dc.subject | Case-control study | zh_TW |
| dc.subject | Data visualization | zh_TW |
| dc.subject | Medical big data | zh_TW |
| dc.subject | Medical data mining | zh_TW |
| dc.subject | Medical knowledge discovery | zh_TW |
| dc.subject | Medical knowledge discovery | en |
| dc.subject | Case control study | en |
| dc.subject | Data mining | en |
| dc.subject | Data visualization | en |
| dc.subject | Medical big data | en |
| dc.title | 自動化巨量醫學資料知識探勘 | zh_TW |
| dc.title | Automated knowledge discovery in medical big data | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 105-2 | |
| dc.description.degree | Master's | |
| dc.contributor.coadvisor | 蔣以仁 | |
| dc.contributor.oralexamcommittee | 張淑惠 | |
| dc.subject.keyword | Medical big data, Medical knowledge discovery, Case-control study, Data visualization, Medical data mining | zh_TW |
| dc.subject.keyword | Medical big data, Medical knowledge discovery, Data mining, Data visualization, Case control study | en |
| dc.relation.page | 47 | |
| dc.identifier.doi | 10.6342/NTU201701220 | |
| dc.rights.note | Paid authorization | |
| dc.date.accepted | 2017-07-03 | |
| dc.contributor.author-college | College of Engineering | zh_TW |
| dc.contributor.author-dept | Institute of Biomedical Engineering | zh_TW |
| Appears in collections: | Institute of Biomedical Engineering | |
Files in this item:
| File | Size | Format | |
|---|---|---|---|
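| ntu-106-1.pdf (not authorized for public access) | 8.34 MB | Adobe PDF |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.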
