Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96588

Full metadata record
| Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 雷欽隆 | zh_TW |
| dc.contributor.advisor | Chin-Laung Lei | en |
| dc.contributor.author | 彭議霆 | zh_TW |
| dc.contributor.author | Yi-Ting Peng | en |
| dc.date.accessioned | 2025-02-19T16:39:25Z | - |
| dc.date.available | 2025-02-20 | - |
| dc.date.copyright | 2025-02-19 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-01-15 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96588 | - |
| dc.description.abstract | 隨著人工智慧技術的迅速發展,各專業領域對其應用不斷推陳出新,涵蓋機器學習、深度學習及大型語言模型(Large Language Models, LLMs)等多個層面。在電腦計算資源顯著提升的推動下,人工智慧技術呈現出驚人的發展速度。本研究針對深度學習技術與大型語言模型在法律審判與判決書生成中的應用進行探討,旨在驗證其可行性並評估實際效能。
近年來,隨著國民法官制度的正式實施,審判工作從過去僅由專業法官主導轉變為結合一般民眾與法官的共同參與。這一制度的引入雖增強了審判的多元性與公信力,但也帶來了新的挑戰:每位參與者對犯罪罪名及刑期的理解可能因主觀認知差異而導致審判結果不一致。因此,如何利用人工智慧技術輔助審判,降低主觀偏差對結果的影響,成為一項值得深入研究的課題。 本研究提出一種基於預訓練語言模型 BERT(Bidirectional Encoder Representations from Transformers)的司法審判輔助方法。BERT 模型在實務應用中面臨輸入字元長度限制的挑戰,根據本研究所蒐集之判決書文字長度,有高達85.43%判決書文字皆超過BERT模型之限制,本研究提出改良BERT模型方法,稱為「chunking BERT」,該方法將超長文字進行分塊處理,透過上下文關聯設計達到延展輸入長度,且保持模型在預測上的準確性。 本研究以「傷害罪」及「公共危險罪」為例,透過chunking BERT,可準確依據犯罪事實描述達到預測犯罪罪名及審判刑期,使用判決書做為資料集進行語言模型的微調階段,將犯罪情狀與事實描述套用在微調後的語言模型,對於犯罪嫌疑人可能觸犯的罪名與刑期預測,並提供審判者做為參考,避免每個人對於審判結果期待與認知差異過大影響審判結果。研究結果顯示,採用chunking BERT後,預測犯罪罪名的精確度將從原本的97.53%提升為98.95%。犯罪刑期部分,傷害罪案件的精確度由原本的68.82%提升為72.37%;公共危險罪案件的精確度從原本的73.03%提升為80.93%。 此外,由於司法案件數量龐大,司法人員除審判工作,還需花費大量精力撰寫判決書,加重工作壓力,為了減輕司法人員負擔,本研究利用生成式語言模型Llama(Large Language Model Meta AI)進行判決書生成。以「傷害罪」為例,利用大語言模型取出判決書關鍵字,再透過大語言模型生成判決書,本研究以Llama為基礎,透過監督式微調(Supervised Fine-Tuned)及RLHF-PPO(Reinforcement Learning from Human Feedback with Proximal Policy Optimization),生成具高效性與準確性的法律判決書。本研究使用ROUGE做為評估指標,驗證生成模型的效果,以Chinese-Alpaca-2-7B為例,使用監督式微調與RLHF-PPO的ROUGE值都有所提升。 綜上,本研究旨在探討人工智慧與大型語言模型技術在法律領域中的應用可行性與實際成效。透過此類技術的輔助,期望能有效減少審判過程中因人為主觀因素所造成的影響,並藉由自動生成判決書的功能,降低司法人員工作負擔,從而進一步提升審判效率與品質。 | zh_TW |
| dc.description.abstract | With the rapid development of artificial intelligence (AI), various professional fields have continually introduced innovative applications covering multiple aspects, such as machine learning, deep learning, and large language models (LLMs). Driven by significant advancements in computing resources, AI technology has been evolving at an astonishing pace. This study examines the application of deep learning and LLMs in legal trials and judgment generation, aiming to validate their feasibility and assess their actual performance.
With the formal implementation of the lay judge system in recent years, the judicial process has shifted from being led solely by professional judges to a collaborative effort involving lay citizens and judges. While this system enhances the diversity and credibility of trials, it also presents new challenges: differences in subjective understanding among participants regarding criminal charges and sentencing may lead to inconsistent trial outcomes. Consequently, leveraging AI to assist trials and reduce the influence of subjective bias on their outcomes has become a subject warranting in-depth research. This study proposes a judicial decision support method based on the pre-trained language model BERT (Bidirectional Encoder Representations from Transformers). In practical applications, BERT faces challenges posed by input length limitations. Among the judgment texts collected in this study, as many as 85.43% exceed the input length limit of the BERT model. To address this issue, we introduce an improved BERT method called “chunking BERT.” This approach segments overly long texts into smaller chunks and leverages contextual associations to extend the effective input length while preserving the model's predictive accuracy. Using “injury” and “public endangerment” as examples, this study demonstrates how the chunking BERT approach can accurately predict criminal charges and sentencing based on descriptions of criminal facts. During the fine-tuning stage, judgment documents serve as the dataset for the language model. Applying the circumstances and factual descriptions of crimes to the fine-tuned language model makes it possible to predict the charges and sentencing that a defendant may face, providing a reference for decision-makers and mitigating the impact of differing individual expectations and perceptions on trial outcomes. The results show that after adopting chunking BERT, the accuracy of predicting criminal charges improves from 97.53% to 98.95%. 
Regarding sentencing predictions, accuracy for injury cases increases from 68.82% to 72.37%, while accuracy for public endangerment cases increases from 73.03% to 80.93%. In addition, owing to the large number of judicial cases, judicial personnel must, beyond their trial-related duties, spend a considerable amount of time drafting judgments, which increases their workload. To reduce this burden, this study employs the generative language model Llama (Large Language Model Meta AI) to generate judgments. Using “injury” as an example, an LLM is first used to extract the key information from a judgment, and the same model is then used to generate the judgment text. Building upon Llama, this study adopts supervised fine-tuning (SFT) and RLHF-PPO (Reinforcement Learning from Human Feedback with Proximal Policy Optimization) to produce legal judgments that are both efficient and accurate. This study uses ROUGE as an evaluation metric to validate the effectiveness of the generative models. Taking Chinese-Alpaca-2-7B as an example, ROUGE scores improved through both SFT and RLHF-PPO training. In summary, this study explores the feasibility and practical effectiveness of applying AI and LLMs in the legal domain. By leveraging these technologies, we aim to significantly reduce the impact of human subjectivity in the trial process. Additionally, through automatic judgment generation, we seek to reduce the workload of judicial personnel, thereby further improving the efficiency and quality of trials. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-19T16:39:25Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-02-19T16:39:25Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements iii
摘要 v
Abstract vii
Contents xi
List of Figures xv
List of Tables xix
Chapter 1 Introduction 1
Chapter 2 Preliminaries 9
2.1 Background Knowledge 9
2.1.1 LLMs (Large Language Models) 9
2.1.2 LoRA (Low-Rank Adaptation) 11
2.1.3 Citizen Judge 12
2.2 Problem Description 13
Chapter 3 Related Work 17
3.1 Legal Artificial Intelligence (Legal AI) 17
3.2 Legal Judgment Prediction (LJP) 19
3.3 Language Model in the Legal Domain 22
3.3.1 Generative Language Models in the Legal Domain 22
3.3.2 Discriminative Language Models in the Legal Domain 28
3.4 Document Drafting 31
Chapter 4 Using Discriminative Language Models in the Legal Domain 33
4.1 Problem Definition 33
4.2 Materials 34
4.3 Proposed Method 40
4.3.1 Performance Metrics 40
4.3.1.1 Loss Function 40
4.3.1.2 Accuracy, Precision, Recall, and F1 42
4.3.2 Flowchart 42
4.3.3 Model Architecture 42
4.4 Experiments 48
4.4.1 Criminal Charge Prediction 48
4.4.2 Sentence Prediction 49
4.5 Result 53
4.6 Summary 58
Chapter 5 Using Generative Language Models in the Legal Domain 61
5.1 Problem Definition 61
5.2 Materials 62
5.3 Proposed Method 63
5.3.1 Performance Metrics 65
5.3.2 Process of Extracting Keywords 67
5.3.3 Reward Model 67
5.4 Results 68
5.4.1 SFT (Supervised Fine-Tuning) 69
5.4.2 RLHF-PPO (Reinforcement Learning with Human Feedback using Proximal Policy Optimization) 71
5.5 Summary 82
Chapter 6 Conclusions and Future Works 85
6.1 Conclusions 85
6.2 Future Works 86
References 89
Appendix A — The example of generative judgment after RLHF-PPO 103 | - |
| dc.language.iso | en | - |
| dc.subject | 生成式人工智慧 | zh_TW |
| dc.subject | 人工智慧 | zh_TW |
| dc.subject | 法律判決預測 | zh_TW |
| dc.subject | 自然語言處理 | zh_TW |
| dc.subject | 大型語言模型 | zh_TW |
| dc.subject | Artificial Intelligence | en |
| dc.subject | Natural Language Processing | en |
| dc.subject | Large Language Models | en |
| dc.subject | Legal Judgment Prediction | en |
| dc.subject | Generative Artificial Intelligence | en |
| dc.title | 基於大型語言模型在法律判決預測之分析與應用 | zh_TW |
| dc.title | Analysis and Application of Large Language Models in Legal Judgment Prediction | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-1 | - |
| dc.description.degree | 博士 | - |
| dc.contributor.oralexamcommittee | 顏嗣鈞;王勝德;郭斯彥;盧淑萍;許之凡 | zh_TW |
| dc.contributor.oralexamcommittee | Hsu-chun Yen;Sheng-De Wang;Sy-Yen Kuo;Shu-Ping Lu;Chih-Fan Hsu | en |
| dc.subject.keyword | 大型語言模型,自然語言處理,法律判決預測,生成式人工智慧,人工智慧, | zh_TW |
| dc.subject.keyword | Large Language Models,Natural Language Processing,Legal Judgment Prediction,Generative Artificial Intelligence,Artificial Intelligence, | en |
| dc.relation.page | 107 | - |
| dc.identifier.doi | 10.6342/NTU202500111 | - |
| dc.rights.note | 同意授權(限校園內公開) | - |
| dc.date.accepted | 2025-01-15 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 電機工程學系 | - |
| dc.date.embargo-lift | 2025-02-20 | - |
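The abstract describes “chunking BERT” as splitting over-limit judgment texts into chunks while preserving context across boundaries. A minimal sketch of that windowing idea, assuming illustrative `max_len`/`stride` values (the thesis's actual hyperparameters and aggregation scheme are not given in this record):

```python
# Sketch of chunking a long token sequence into overlapping windows so each
# window fits a BERT-style 512-token input limit. The overlap (stride) keeps
# shared context across chunk boundaries; per-chunk predictions would then be
# aggregated (e.g., averaged) downstream. Values are illustrative only.
def chunk_tokens(tokens, max_len=512, stride=128):
    """Return overlapping windows of at most `max_len` tokens."""
    if len(tokens) <= max_len:
        return [tokens]
    chunks = []
    step = max_len - stride  # advance leaves `stride` tokens of overlap
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # last window already reaches the end of the text
    return chunks

# Example: a 1000-token document becomes three overlapping windows.
doc = list(range(1000))
print([len(c) for c in chunks] if (chunks := chunk_tokens(doc)) else [])
# → [512, 512, 232]
```

With a real tokenizer, the same effect is commonly obtained via windowed tokenization; this sketch only shows the index arithmetic.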
| Appears in Collections: | 電機工程學系 |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-1.pdf (access limited to NTU IP range) | 15.97 MB | Adobe PDF |
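The record states that ROUGE was used to evaluate the generated judgments. A toy ROUGE-1 F1 computation, for illustration only (real evaluations use a dedicated ROUGE implementation with proper tokenization and the ROUGE-2/ROUGE-L variants as well):

```python
from collections import Counter

# Illustrative ROUGE-1 F1: harmonic mean of unigram precision and recall
# between a candidate (generated) text and a reference text.
def rouge1_f1(candidate, reference):
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# 5 of 6 unigrams overlap in each direction, so P = R = F1 = 5/6.
print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 3))
# → 0.833
```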
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
