Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66813
Full metadata record
DC field: value [language]
dc.contributor.advisor: 張原豪
dc.contributor.author: Chen-Han Lu [en]
dc.contributor.author: 呂承翰 [zh_TW]
dc.date.accessioned: 2021-06-17T01:08:46Z
dc.date.available: 2025-02-17
dc.date.copyright: 2020-02-17
dc.date.issued: 2020
dc.date.submitted: 2020-01-22
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66813
dc.description.abstract: In recent years, financial technology (FinTech) has introduced new forms of financial services and has had a broad impact on the traditional financial industry. The insurance sector, as part of the financial industry, has responded with InsurTech, which includes online platform business models and data analysis through machine learning. This study applies machine learning methods to a data set commonly encountered in the insurance sector, collision reports for vehicles (including motorcycles and cars), to predict the severity of injuries after a crash. Fatal crashes make up only a small share of the data set, yet they are the cases that matter most, because a fatal crash costs the insurer a large amount in claims. The study therefore aims to improve the prediction of fatal crashes. To evaluate the predictions, it uses Precision and Recall in place of Accuracy, focusing on how well fatal crashes are identified. Finally, it discusses applications of this research in insurance services. [en]
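The abstract's reason for preferring Precision and Recall over Accuracy can be illustrated with a small sketch. The counts below (1,000 crashes, 10 of them fatal) and both sets of predictions are hypothetical numbers chosen for illustration; they are not taken from the thesis, and the helper names are assumptions of this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    # Convention used here: 0.0 when no positive prediction was made.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical data: 1 = fatal crash, 0 = non-fatal crash.
y_true = [1] * 10 + [0] * 990

# Baseline that never predicts a fatal crash.
always_negative = [0] * 1000

# Hypothetical model: finds 6 of the 10 fatal crashes, raises 14 false alarms.
model_pred = [1] * 6 + [0] * 4 + [1] * 14 + [0] * 976

if __name__ == "__main__":
    for name, pred in [("baseline", always_negative), ("model", model_pred)]:
        print(name, accuracy(y_true, pred), precision(y_true, pred), recall(y_true, pred))
```

In this made-up setting the baseline has the higher accuracy (0.99 against 0.982) although it identifies no fatal crash at all, whereas recall (0.0 against 0.6) and precision (0.0 against 0.3) show which predictor is actually useful for the rare class.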
dc.description.provenance: Made available in DSpace on 2021-06-17T01:08:46Z (GMT). No. of bitstreams: 1. ntu-109-R06946008-1.pdf: 1364657 bytes, checksum: 4b095c53651194c84a892a45a43fda78 (MD5). Previous issue date: 2020 [en]
dc.description.tableofcontents:
Oral Examination Committee Certification
Acknowledgments
ABSTRACT
Chinese Abstract
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Imbalanced Data Set
2.2 Performance Measure
2.3 Sampling
2.3.1 Importance Sampling
2.3.2 Undersampling
2.3.3 Oversampling
2.4 Unknown Attribute Values
2.5 One-Hot Encoding
Chapter 3 Methodology
3.1 Decision Tree
3.2 Random Forest
3.3 Deep Learning
3.3.1 Convolutional Neural Network
3.3.2 Convolutional Neural Network with Sampling
3.4 Regression Analysis
3.4.1 Linear Regression
3.4.2 Logistic Regression
Chapter 4 Results
4.1 Results – Random Forest
4.2 Results – Deep Learning
4.3 Results – Logistic Regression
Chapter 5 Conclusion
Chapter 6 Future Work
6.1 Traditional insurance policies will be replaced
6.2 Cryptocurrencies will be used to pay claims
REFERENCE
dc.language.iso: en
dc.subject: deep learning [en]
dc.subject: random forest [en]
dc.subject: imbalanced data set [en]
dc.subject: FinTech [en]
dc.subject: InsurTech [en]
dc.subject: logistic regression [en]
dc.title: 以機器學習方法解決保險理賠數據集不平衡之問題 [zh_TW]
dc.title: Machine Learning Solutions for Imbalanced Data Set of Insurance Claims [en]
dc.type: Thesis
dc.date.schoolyear: 108-1
dc.description.degree: Master's
dc.contributor.coadvisor: 崔茂培
dc.contributor.oralexamcommittee: 韓傳祥, 鄭文皇, 繆維中
dc.subject.keyword: FinTech, InsurTech, imbalanced data set, random forest, deep learning, logistic regression [en]
dc.relation.page: 33
dc.identifier.doi: 10.6342/NTU201904448
dc.rights.note: Paid authorization
dc.date.accepted: 2020-01-22
dc.contributor.author-college: College of Electrical Engineering and Computer Science
dc.contributor.author-dept: Data Science Degree Program
Appears in collections: Data Science Degree Program

Files in this item:
File: ntu-109-1.pdf (not authorized for public access); Size: 1.33 MB; Format: Adobe PDF


Unless their copyright terms are specifically indicated, all items in this system are protected by copyright, with all rights reserved.
