Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/20199
Full metadata record
DC field: value [language]
dc.contributor.advisor: 項潔 (Jieh Hsiang)
dc.contributor.author: CHIH-YANG HUANG [en]
dc.contributor.author: 黃志揚 [zh_TW]
dc.date.accessioned: 2021-06-08T02:42:04Z
dc.date.copyright: 2021-03-11
dc.date.issued: 2021
dc.date.submitted: 2021-01-18
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/20199
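dc.description.abstract: Law is a complete set of formalized, unambiguous statutes, and contemporary society operates according to its norms. When disputes or conflicts of interest arise in social activity, people turn to judges to rule according to the law, and the outcome is a written court judgment; law can thus be said to reflect human activity and the state of society today. Courts have issued judgments for many years, and the accumulated body of decisions is so voluminous that it already exceeds what can be processed manually, so computer-assisted processing has become the prevailing trend. This study focuses on how the legal cases a company has gone through affect whether the company continues to exist. Businesses frequently face litigation of all kinds, especially when they are performing poorly. The lawsuits that follow, such as debt default, infringement, and asset stripping, are a severe test of a company's survival, and court judgments often offer a glimpse of a company's operating problems. After a ruling, the court issues a written judgment as a record of the litigation. A judgment contains the parties, the cause of action, the reasoning, and the final decision, which makes it a highly information-rich document. Past legal research based on artificial intelligence has mostly focused on classifying judgment types and predicting sentence lengths, and has paid little attention to the operating risk of businesses. In view of this, this thesis applies natural language processing techniques to the legal domain to analyze court judgments. Using judgment data from past years combined with companies' operating status, we attempt to build a model that finds the relationship between company failure risk and court judgments, so that researchers can quickly grasp a company's risk profile. In addition, we try to identify meaningful words in the full text of judgments and analyze the relationship between companies' operating risk and these terms. This study applies commonly used natural language processing techniques to legal texts and compares several machine learning models, in the hope of offering a different research approach to the analysis of company operations and law. [translated from zh_TW]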
dc.description.abstract: Law is a system of specific, formal, and unambiguous codes that regulates present-day society. When a conflict or dispute arises, people often turn to the courts to have it settled under the law, and the outcome is recorded in a written judgment (verdict). Judgments therefore offer a quick view of the state and activity of society. The number of judgments is now so large that manual search and analysis are impractical, and computer-assisted processing has become the prevailing approach. This research focuses on detecting company failure risk from a company's past judgments. A company can be sued for many reasons, especially when it is poorly managed. It may face debt default, infringement, or asset stripping, to name a few, and such issues are critical to its operation. Past judgments involving a company therefore give a glimpse of its operating status. A judgment is the record issued by the court after trial. It is an information-rich document that comprises the litigants, the cause of action, the reasoning, and the final decision. Most previous research on judgment analysis using artificial intelligence focuses on judgment classification and sentence prediction, and very little addresses company failure risk. In light of this, this thesis applies natural language processing techniques to court judgments and combines a company's past judgments with its business status to assess its failure risk, so that the risk can be understood more quickly.
Next, we identify meaningful terms in the judgments and examine the relationship between a company's risk and these terms. We apply common natural language processing techniques to the judgment documents and compare several machine learning models, in the hope of offering a new research approach. [en]
dc.description.provenance: Made available in DSpace on 2021-06-08T02:42:04Z (GMT). No. of bitstreams: 1
U0001-2009202020205100.pdf: 7797551 bytes, checksum: d5c71dcade2f056f9dfd5663e7b4417e (MD5)
Previous issue date: 2021 [en]
dc.description.tableofcontents:
Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract
1 Introduction
1.1 Background and Motivation
1.2 Thesis Organization
2 Related Work
2.1 Bag-of-Words Model
2.1.1 One-Hot Encoding
2.1.2 Counter Encoding
2.1.3 TF–IDF
2.1.4 BM25
2.2 Linear Regression
2.3 Logistic Regression
2.4 Artificial Neural Networks
2.5 Long Short-Term Memory
2.6 Applications of Artificial Intelligence in Law
2.7 Applications of Artificial Intelligence to Company Data
3 Research Data
3.1 Court Judgments
3.2 Basic Company Information
4 Risk Factor Analysis of Terms in the Full Text of Court Judgments
4.1 Identifying and Labeling Party Information in Court Judgments
4.2 Legal Terms and Word Segmentation of Judgments
4.3 Document Feature Vectors
4.4 Logistic Regression Training
4.5 Logistic Regression Testing
4.5.1 Random Testing
4.5.2 Testing on Specific Companies
4.6 Term Risk Ranking
5 Company Failure Risk Assessment Model
5.1 Input Data
5.2 Analysis of Basic Company Information
5.3 Experimental Design
5.4 Data Encoding
5.4.1 Feature Vectors of Judgment Cause-of-Action Sequences
5.4.2 Basic Company Information
5.4.3 Legal Terms
5.5 Experimental Results
5.5.1 Feature Vectors of Judgment Cause-of-Action Sequences
5.5.2 Feature Vectors of Judgment Cause-of-Action Sequences and Basic Company Information
5.5.3 Feature Vectors of Judgment Cause-of-Action Sequences, Basic Company Information, and Legal Terms
5.5.4 Model Validation
5.5.5 Model Comparison
6 System Presentation and Company Case Studies
6.1 System Presentation
6.1.1 Company Name Search
6.1.2 Company Information Display
6.2 Company Case Studies
7 Discussion and Future Work
7.1 Research Results and Contributions
7.2 Research Limitations
7.3 Opportunities and Challenges
7.4 Future Work
References
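dc.language.iso: zh-TW
dc.title: 基於法律判決書之公司倒閉風險評估 [zh_TW]
dc.title: Company Failure Risk Assessment Based on the Court’s Judgments [en]
dc.type: Thesis
dc.date.schoolyear: 109-1
dc.description.degree: Master's
dc.contributor.author-orcid: 0000-0002-4057-2746
dc.contributor.advisor-orcid: 項潔 (0000-0002-2649-4331)
dc.contributor.oralexamcommittee: 蔡宗翰 (Richard Tzong-Han Tsai), 謝育平 (Yuh-Pyng Shieh)
dc.subject.keyword: natural language processing, legal documents, risk analysis, linear classifier, long short-term memory [translated from zh_TW]
dc.subject.keyword: natural language processing, judgment document, risk analysis, linear classification, long short-term memory [en]
dc.relation.page: 68
dc.identifier.doi: 10.6342/NTU202004222
dc.rights.note: Not authorized
dc.date.accepted: 2021-01-19
dc.contributor.author-college: College of Electrical Engineering and Computer Science [translated from zh_TW]
dc.contributor.author-dept: Graduate Institute of Computer Science and Information Engineering [translated from zh_TW]
Appears in collections: Department of Computer Science and Information Engineering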

Files in this item:
File: U0001-2009202020205100.pdf (not authorized for public access)
Size: 7.61 MB
Format: Adobe PDF
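Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.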
