Job Scam Detection: An Application of Ensemble Learning

劉禹岑; Yu-Tsen Liu

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87773

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	林建甫	zh_TW
dc.contributor.advisor	Chien-Fu Lin	en
dc.contributor.author	劉禹岑	zh_TW
dc.contributor.author	Yu-Tsen Liu	en
dc.date.accessioned	2023-07-19T16:24:55Z	-
dc.date.available	2023-11-09	-
dc.date.copyright	2023-07-19	-
dc.date.issued	2023	-
dc.date.submitted	2023-03-21	-
dc.identifier.citation	Abril, D. (2022), “Fake job postings are stealing applicants’ money and identities,” URL: https://www.washingtonpost.com/technology/2022/12/22/job-posting-scam-tips/. Alghamdi, B. & Alharby, F. (2019), “An Intelligent Model for Online Recruitment Fraud Detection,” Journal of Information Security 10, 155–176. Anita, C., Nagarajan, P., Sairam, G., Ganesh, P. & Deepakkumar, G. (2021), “Fake Job Detection and Analysis Using Machine Learning and Deep Learning Algorithms,” Revista Gestão Inovação e Tecnologias 11, 642–650. Athey, S. (2019), “The Impact of Machine Learning on Economics,” The Economics of Artificial Intelligence pp. 507–547. Athey, S. & Imbens, G. W. (2019), “Machine Learning Methods That Economists Should Know About,” Annual Review of Economics 11(1), 685–725. Bandyopadhyay, S. & Dutta, S. (2020), “Fake Job Recruitment Detection Using Machine Learning Approach,” International Journal of Engineering Trends and Technology 68, 48–53. Chawla, N., Bowyer, K., Hall, L. & Kegelmeyer, W. (2002), “SMOTE: Synthetic Minority Over-sampling Technique,” J. Artif. Intell. Res. (JAIR) 16, 321–357. Chen, T.-Q. & Guestrin, C. (2016), “XGBoost: A Scalable Tree Boosting System,” CoRR abs/1603.02754. Dorogush, A. V., Gulin, A., Gusev, G., Kazeev, N., Prokhorenkova, L. O. & Vorobev, A. (2017), “Fighting biases with dynamic boosting,” CoRR abs/1706.09516. Faggian, A. (2014), “Job Search Theory,” Handbook of Regional Science pp. 59–73. FTC (2020), “Job scams. Federal Trade Commission Consumer Information,” URL: https://consumer.ftc.gov/articles/job-scams. Lal, S., Jiaswal, R., Sardana, N., Verma, A., Kaur, A. & Mourya, R. (2019), “ORFDetector: Ensemble Learning Based Online Recruitment Fraud Detection,” Twelfth International Conference on Contemporary Computing (IC3) pp. 1–5. Mahbub, S. & Pardede, E. (2018), “Using Contextual Features for Online Recruitment Fraud Detection,” Designing Digitalization (ISD2018 Proceedings). Lund, Sweden: Lund University. 
McCall, J. (1970), “Economics of Information and Job Search,” The Quarterly Journal of Economics 84, 113–126. Mehboob, A. & Malik, M. S. (2020), “Smart Fraud Detection Framework for Job Recruitments,” Arabian Journal for Science and Engineering 46. Nasser, I. M., Alzaanin, A. H. & Maghari, A. Y. (2021), “Online Recruitment Fraud Detection using ANN,” 2021 Palestinian International Conference on Information and Communication Technology (PICICT) pp. 13–17. Naudé, M., Adebayo, K. & Nanda, R. (2022), “A machine learning approach to detecting fraudulent job types,” AI & SOCIETY. SEC (2022), “Public Alert: Unregistered Soliciting Entities (PAUSE),” URL: https://www.sec.gov/enforce/public-alerts. Vidros, S., Kolias, C. & Kambourakis, G. (2016), “Online recruitment services: Another playground for fraudsters,” Computer Fraud & Security 2016, 8–13. Vidros, S., Kolias, C., Kambourakis, G. & Akoglu, L. (2017), “Automatic Detection of Online Recruitment Frauds: Characteristics, Methods, and a Public Dataset,” Future Internet 9, 6. Vo, M., Vo, A., Nguyen, T., Sharma, R. & Le, T. (2021), “Dealing with the Class Imbalance Problem in the Detection of Fake Job Descriptions”, Computers, Materials and Continua 68, 521–535.	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87773	-
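dc.description.abstract	Job scams include setting up fake job openings to collect personal information or bank account passwords, or to obtain money. This thesis examines the characteristics of fake job postings and whether ensemble learning can significantly improve predictive performance. It applies six machine learning classification models, together with observed features such as job text descriptions, numerical variables, and dummy variables, to predict whether a sample posting is a genuine or a fake job opening. The empirical results show that the ensemble model combining three sub-models (logistic regression, K-nearest neighbors, and random forest) performs best. In addition, feature importance calculations identify the average expected salary, the length of the job and company descriptions, and whether the posting contains a company logo as key indicators for predicting job scams.	zh_TW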
dc.description.abstract	Job scams are fraudulent job advertisements that aim to steal personal information, banking details, or money from unsuspecting job seekers. This thesis examines the key characteristics of fake job postings and tests whether ensemble learning methods significantly improve the performance of machine learning models in identifying job scams. We apply six machine learning algorithms to predict fraudulent job postings using both textual and numerical variables. The results show that the ensemble model combining logistic regression, a KNeighbors classifier, and a random forest classifier performs best. Furthermore, we use a framework based on Gini impurity to identify the ten most important features in the random forest classifier, including average salary, company profile length, and whether the job posting has a company logo.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-07-19T16:24:55Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2023-07-19T16:24:55Z (GMT). No. of bitstreams: 0	en
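dc.description.tableofcontents	Acknowledgements i Abstract (Chinese) ii Abstract iii Contents iv List of Figures vi List of Tables viii Chapter 1 Introduction 1 Chapter 2 Literature Review 4 2.1 Predicting Job Scams 5 2.2 Machine Learning and Data Balancing 7 Chapter 3 Empirical Models 9 3.1 Ensemble Learning 9 3.2 Logistic Regression 11 3.3 K-Nearest Neighbors (KNeighbors Classifier) 13 3.4 Random Forest Classifier 14 3.5 Extreme Gradient Boosting (XGBoost Classifier) 15 3.6 CatBoost Classifier 18 Chapter 4 Data Description and Processing 19 4.1 Data Description and Visualization 19 4.2 Text Mining 33 4.3 Applying Dummy Variables 35 Chapter 5 Empirical Results 44 5.1 Model Metrics 44 5.2 Model Training Results 47 5.2.1 Part 1: Text Features 49 5.2.2 Part 2: Numerical Features 51 5.2.3 Part 3: Combined Features 53 Chapter 6 Conclusion 56 References 58 Appendix: Model Scores 61	-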
dc.language.iso	zh_TW	-
dc.subject	特徵工程	zh_TW
dc.subject	機器學習	zh_TW
dc.subject	集成學習	zh_TW
dc.subject	詐欺預測	zh_TW
dc.subject	資料平衡	zh_TW
dc.subject	Feature Engineering	en
dc.subject	Ensemble Learning	en
dc.subject	Scam Detection	en
dc.subject	Machine Learning	en
dc.subject	Data Balance	en
dc.title	求職詐欺預測：應用集成學習	zh_TW
dc.title	Job Scam Detection: An Application of Ensemble Learning	en
dc.type	Thesis	-
dc.date.schoolyear	111-2	-
dc.description.degree	Master's	-
dc.contributor.coadvisor	樊家忠	zh_TW
dc.contributor.coadvisor	Elliott Fan	en
dc.contributor.oralexamcommittee	吳中書;翁永和	zh_TW
dc.contributor.oralexamcommittee	Chung-Shu Wu;Yungho Weng	en
dc.subject.keyword	機器學習,集成學習,詐欺預測,特徵工程,資料平衡	zh_TW
dc.subject.keyword	Machine Learning, Ensemble Learning, Scam Detection, Feature Engineering, Data Balance	en
dc.relation.page	63	-
dc.identifier.doi	10.6342/NTU202300672	-
dc.rights.note	Not authorized	-
dc.date.accepted	2023-03-21	-
dc.contributor.author-college	College of Social Sciences	-
dc.contributor.author-dept	Department of Economics	-
Appears in Collections:	Department of Economics
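
The abstract names two techniques: an ensemble that combines three sub-models, and a Gini-impurity-based ranking of features in the random forest. The following is a minimal, standard-library-only sketch of the two underlying calculations, not the thesis's implementation. The record does not say how the sub-models are combined, so the soft-voting rule (averaging predicted probabilities) is an assumption, and the function names, toy labels, and probabilities are all hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k ** 2) over the class shares p_k in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Drop in Gini impurity when `parent` is split into `left` and `right`.

    Tree-based feature importance accumulates decreases like this one
    over every split that uses a given feature.
    """
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


def soft_vote(probabilities, threshold=0.5):
    """Average the fraud probabilities of several sub-models (assumed rule)."""
    mean_prob = sum(probabilities) / len(probabilities)
    return mean_prob, int(mean_prob >= threshold)


# Hypothetical toy labels: 1 = fraudulent posting, 0 = genuine posting.
parent = [1, 1, 1, 0, 0, 0, 0, 0]
with_logo = [0, 0, 0, 0, 1]   # postings that show a company logo
without_logo = [1, 1, 0]      # postings without a logo

print("impurity decrease from the logo split:",
      impurity_decrease(parent, with_logo, without_logo))

# Hypothetical fraud probabilities from logistic regression, KNN, random forest.
mean_prob, label = soft_vote([0.9, 0.4, 0.7])
print("ensemble probability:", mean_prob, "predicted label:", label)
```

A larger impurity decrease means the split separates fraudulent from genuine postings more cleanly, which is the sense in which a feature such as the presence of a company logo can rank as important.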

Files in This Item:

File	Size	Format
ntu-111-2.pdf (not authorized for public access)	1.93 MB	Adobe PDF

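All items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.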