Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3859
Title: | 巨量資料視覺化模型建構之探討大腸癌盛行率與飲用水質關係 Big Data Visualization System Design and Research of Interaction between Colorectal Cancer and Drinking Water |
Author: | Hao Wang 王浩 |
Advisor: | 翁昭旼 (Jau-Min Wong) |
Co-advisor: | 蔣以仁 (I-Jen Chiang) |
Keywords: | National Health Insurance Research Database, NoSQL Database, Disease Mapping, MongoDB, Big Data, Data Visualization |
Year of publication: | 2016 |
Degree: | Master's |
Abstract: | Data visualization is the core of this thesis and the most direct way for users to communicate with data. A complete disease mapping system was built with the D3.js library; interactive events and animation let users perceive the characteristics of the data. The Disease Map shows the spatial distribution and trend of a disease across Taiwan. The Disease Trend chart shows the time dimension for a selected county or city, displaying the prevalence, number of reported cases, average age, and other measures for each year, and the system also provides an interface for comparing counties and cities side by side. Environmental databases (a tap water quality database and a reservoir water quality database) are loaded dynamically using asynchronous requests. In the Hybrid Bubble Chart, the movement and resizing of the bubbles over time show the disease trend and let users observe whether a given environmental attribute appears to be related to disease occurrence.
The system uses the one-million-person sampled, per-person longitudinal file of the National Health Insurance Research Database (NHIRD) from the National Health Research Institutes as its disease data. To handle this volume of data, MongoDB, a distributed document-oriented NoSQL database, was chosen for storage. MapReduce is used to improve query performance on the distributed database and to run more complex computations: medical records that do not match the query conditions are discarded, and the data, originally organized per person, are regrouped by area code. A cache system avoids frequent access to the database server so that system resources are used efficiently. The system is built on an MVC architecture, which keeps it modular and extensible, and it loads the appropriate prediction module or function module according to the disease code (ICD-9) the user queries. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3859 |
DOI: | 10.6342/NTU201601304 |
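Full-text authorization: | Authorized (open to the public worldwide) |
Appears in collections: | Institute of Biomedical Engineering (醫學工程學研究所) |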
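
The map-reduce step described in the abstract (discard visits that do not match the queried ICD-9 code, regroup per-person records by area code, and cache the result) can be illustrated with a minimal sketch. This is not the thesis's code: it uses plain Python rather than MongoDB's map-reduce, and the record fields, sample visits, and population figures are all hypothetical placeholders rather than the NHIRD schema or real data.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical visit records; field names are illustrative, not the NHIRD schema.
VISITS = (
    {"person_id": "p1", "area_code": "A", "year": 2005, "icd9": "153.9"},
    {"person_id": "p1", "area_code": "A", "year": 2005, "icd9": "153.9"},
    {"person_id": "p2", "area_code": "A", "year": 2005, "icd9": "401.9"},
    {"person_id": "p3", "area_code": "B", "year": 2005, "icd9": "154.1"},
    {"person_id": "p4", "area_code": "B", "year": 2006, "icd9": "153.2"},
)

# Hypothetical insured population per (area code, year), used as the denominator.
POPULATION = {("A", 2005): 1000, ("B", 2005): 500, ("B", 2006): 500}


def map_visit(visit, icd9_prefixes):
    """Map step: emit ((area, year), person_id) for matching visits only."""
    if visit["icd9"].startswith(icd9_prefixes):
        yield (visit["area_code"], visit["year"]), visit["person_id"]


def reduce_patients(person_ids):
    """Reduce step: count distinct patients so repeat visits count once."""
    return len(set(person_ids))


@lru_cache(maxsize=128)
def prevalence_by_area(icd9_prefixes):
    """Return {(area, year): cases per 1,000 people}, cached per query."""
    grouped = defaultdict(list)
    for visit in VISITS:
        for key, person_id in map_visit(visit, icd9_prefixes):
            grouped[key].append(person_id)
    return {
        key: 1000 * reduce_patients(ids) / POPULATION[key]
        for key, ids in grouped.items()
    }


# ICD-9 153.x (colon) and 154.x (rectum, rectosigmoid junction, anus).
result = prevalence_by_area(("153", "154"))
print(result)
```

Repeating the same query returns the cached dictionary without rescanning the records, which mirrors the abstract's goal of avoiding frequent database access; a real deployment would also need a cache invalidation policy, which this sketch omits.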
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-105-1.pdf | 2.41 MB | Adobe PDF | View/Open |
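Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.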