Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89098

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 許永真 | zh_TW |
| dc.contributor.advisor | Yung-Jen Hsu | en |
| dc.contributor.author | 鄒咏霖 | zh_TW |
| dc.contributor.author | Yung-Lin Tsou | en |
| dc.date.accessioned | 2023-08-16T17:07:36Z | - |
| dc.date.available | 2023-11-09 | - |
| dc.date.copyright | 2023-08-16 | - |
| dc.date.issued | 2023 | - |
| dc.date.submitted | 2023-08-08 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89098 | - |
| dc.description.abstract | 隨著網際網路技術的進步,假信息的議題日益嚴重。在眾多處理假信息的方法中,事實查核被視為最值得信任的手段。然而,傳統的事實查核通過人力驗證信息真偽的過程可能耗時過長,無法趕上信息產生的速度,因此事實查核的自動化成為了一個極具價值的目標。自動化事實查核通常分為四個階段,包括宣稱檢測、已驗證的宣稱檢索、證據檢索、與宣稱驗證。本研究著重於解決推文中的宣稱檢測問題。宣稱檢測的目標是確定哪部分內容是宣稱,而一個宣稱可被定義為「宣稱某事是事實」,驗證這些宣稱的真偽即為事實查核的過程。在這項研究中,我們首先提出了過去在推文中進行宣稱檢測時兩個未被充分處理的問題:推文的混亂格式與任務定義,並提出了相應的方法來解決這些問題。此外,我們也透過實驗驗證我們提出的方法的有效性,並總結了實驗中的啟示與發現。 | zh_TW |
| dc.description.abstract | With the advancement of Internet technology, the issue of misinformation has become increasingly serious. Among the many approaches to dealing with misinformation, fact-checking is considered the most trustworthy. However, the traditional fact-checking process, which verifies the authenticity of information manually, may be too time-consuming to keep up with the speed at which information is generated, so the automation of fact-checking has become a highly valuable goal. Automated fact-checking is generally divided into four stages: claim detection, verified claim retrieval, evidence retrieval, and claim verification. This study focuses on claim detection in tweets. The goal of claim detection is to determine which part of the content is a claim, where a claim can be defined as "an assertion of something as a fact"; verifying the truth of these claims is the fact-checking process. In this research, we first identify two issues that have not been adequately addressed in claim detection for tweets, namely the noisy format of tweets and the task definition, and propose corresponding methods to address them. We then validate the effectiveness of the proposed methods through experiments and summarize the insights and findings. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-16T17:07:36Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2023-08-16T17:07:36Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgments i Abstract ii 摘要 iii List of Figures vii List of Tables x Chapter 1 Introduction 1 Chapter 2 Related Work 7 2.1 Style-Based Misinformation Detection 7 2.2 Automatic Fact Checking 8 2.3 Claim Detection 8 2.4 Language Model (LM) 9 2.5 Semantic Role Labeling (SRL) 10 2.6 Bidirectional Encoder Representations from Transformers (BERT) 11 2.7 Generative Pre-trained Transformer (GPT) 12 2.8 Text Style Transfer 13 Chapter 3 Task Definition, Dataset & Current Approach 14 3.1 Task Definition 14 3.2 Article-Level Dataset 15 3.3 Sentence-Level Dataset 16 3.4 Current Approach 18 Chapter 4 Research Questions 22 4.1 Tweet’s Format 22 4.2 Improper Task Definition 24 Chapter 5 Methodology 26 5.1 Reducing Noise in the Formatting of Tweets 27 5.2 Reducing the Inconsistency between the Level of Input and Claim 29 5.3 Combine All 33 Chapter 6 Experiments 38 6.1 Experimental Setting & Evaluation Metric 38 6.2 Elimination of Special Symbols and URLs 39 6.3 Different Prompt when Rewriting Tweets 43 6.4 Split Tweets into Sentence-Level 48 6.5 Train with Sentence-Level Dataset 52 6.6 Combined All 55 Chapter 7 Conclusion 59 Bibliography 61 | - |
| dc.language.iso | en | - |
| dc.subject | 宣稱檢測 | zh_TW |
| dc.subject | 自動化事實查核 | zh_TW |
| dc.subject | 自然語言處理 | zh_TW |
| dc.subject | Automatic Fact-checking | en |
| dc.subject | Claim Detection | en |
| dc.subject | Natural Language Processing | en |
| dc.title | 推文中的宣稱檢測 | zh_TW |
| dc.title | Claim Detection in Tweets | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 111-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 蔡宗翰;郭彥伶;古倫維;李育杰 | zh_TW |
| dc.contributor.oralexamcommittee | Tzong-Han Tsai;Yen-Ling Kuo;Lun-Wei Ku;Yuh-Jye Lee | en |
| dc.subject.keyword | 自動化事實查核,宣稱檢測,自然語言處理, | zh_TW |
| dc.subject.keyword | Automatic Fact-checking,Claim Detection,Natural Language Processing, | en |
| dc.relation.page | 65 | - |
| dc.identifier.doi | 10.6342/NTU202302992 | - |
| dc.rights.note | 同意授權(全球公開) | - |
| dc.date.accepted | 2023-08-09 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
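
The abstract and the chapter headings (5.1 "Reducing Noise in the Formatting of Tweets", 6.2 "Elimination of Special Symbols and URLs", 6.4 "Split Tweets into Sentence-Level") describe preprocessing tweets before sentence-level claim detection. The snippet below is a minimal illustrative sketch only, not the thesis's actual implementation: it shows one common way to strip URLs, user mentions, and hashtag symbols from a tweet and split the result into sentence-level candidates for a downstream claim classifier. The function names, regular expressions, and example tweet are all hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#")          # drop the '#' but keep the word itself
EXTRA_SPACE_RE = re.compile(r"\s+")

def clean_tweet(text: str) -> str:
    """Remove URLs, user mentions, and hashtag symbols, then normalize whitespace."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    text = HASHTAG_RE.sub("", text)
    return EXTRA_SPACE_RE.sub(" ", text).strip()

def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter; a real system might use spaCy or NLTK instead."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

if __name__ == "__main__":
    tweet = "BREAKING: Vitamin C cures COVID-19!!! Read more at https://example.com #health @newsbot"
    for sentence in split_sentences(clean_tweet(tweet)):
        print(sentence)  # each sentence would be passed to a binary claim classifier
```

In a full pipeline of the kind the abstract outlines, each cleaned sentence would then be scored by a binary classifier (for example, a fine-tuned BERT-style model) that labels it as claim or non-claim before the later fact-checking stages.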
Appears in Collections: 資訊工程學系
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-111-2.pdf | 2.39 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
