Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21387
Title: | 深度財務意見探勘 Fine-grained Financial Opinion Mining |
Author: | Chung-Chi Chen 陳重吉 |
Advisor: | 陳信希 (Hsin-Hsi Chen) |
Keywords: | Opinion Mining, Natural Language Processing, Financial Opinion, Financial Opinion Mining |
Year of publication: | 2021 |
Degree: | Doctoral |
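Abstract: | Opinion mining has long been one of the most popular topics in natural language processing. In financial opinion mining, however, most studies stop at sentiment analysis and do not examine the details of financial opinions any further. In this study, we investigate the key points of financial opinions in depth and focus on mining investors' opinions. In addition to surveying current related work, we list 12 important items for examining investor opinions and, based on the characteristics of investor opinions, design novel tasks, annotate expert datasets, and build neural network models for fine-grained investor opinion mining. Information in financial markets can be divided into four kinds: official information released by companies, event information reported in the news, analysis results published by analysts, and crowd discussion on financial social platforms. This study conducts several fine-grained opinion mining studies on financial social media, covering market sentiment analysis, opinion rationale, numeral understanding, numeral attachment, and opinion quality, and extends the results to the other three kinds of formal documents; besides improving the accuracy of opinion mining, it also performs cross-document inference and analysis. To explore the content of financial opinions in depth, this study releases five expert-annotated datasets and two corpora. Based on statistical analysis and the experimental results of machine learning models, this study finds that (1) market sentiment differs from general sentiment, and the two should be judged and analyzed separately; (2) numerical information makes up a higher proportion of financial text than of other text, and understanding it improves a model's ability to understand financial text; (3) encoding numerical information in a special way improves a model's understanding of it; (4) by comparing against the analysis rationales in analyst reports, we can identify high-quality crowd opinions; and (5) the crowd opinions extracted by the proposed method are comparable to those of professional analysts in terms of both profit and risk. This study also develops its results into demonstration systems that provide real-time information for investors, and extends the results of fine-grained opinion mining to applications in the financial domain, including a personalized recommendation system for financial social media, an improvement of the traditional keyword-based economic policy uncertainty index, and more accurate extraction of Non-GAAP items from corporate earnings conference calls. Opinion mining has long been one of the most popular topics in natural language processing. However, most previous work on financial opinion mining focuses only on sentiment analysis and does not look closely at the fine-grained information embedded in a financial opinion. In this thesis, we provide a research agenda for fine-grained financial opinion mining and list twelve components of a financial opinion. We probe several novel challenges in fine-grained financial opinion mining, provide the related expert-annotated datasets, and offer insights for constructing neural network models for these tasks. In the financial market, information sources fall into four kinds: (1) official information published by companies, (2) events reported in news articles, (3) analysts' reports, and (4) discussion on financial social media platforms. We first explore issues in financial social media data, including market sentiment analysis, aspect analysis, numeral understanding, numeral attachment, and opinion quality, and then extend the results to the other information sources. 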
In addition, we provide experiments and comparisons on cross-source inference. To probe financial opinion mining, we propose five datasets and two corpora: FinNum, NumAttach, NumClaim, MultiNum, Numeracy-600K, NTUSD-Fin, and Fin-SoMe. Based on our experimental results and statistics, we report the following findings. First, market sentiment differs from general sentiment, so the writer's general sentiment and the writer's market sentiment need to be considered separately. Second, the proportion of numerals in financial documents is higher than in documents from other domains, and understanding the meaning of those numerals helps in understanding financial documents. Third, numeral information needs to be encoded independently, and doing so improves the performance of neural network models on numeral understanding tasks. Fourth, we can mine high-quality opinions from the crowd by comparing their rationales with those of professional analysts. Last, in terms of both profitability and risk control, the crowd opinions extracted by the proposed methods are comparable to, or even better than, professional analysts' opinions. We further build demonstration systems for real-world use and apply our results to problems in the financial domain. For example, we propose a personalized recommendation system for users of financial social media platforms. We also improve the explainability and predictive power of the economic policy uncertainty index and propose a method for extracting Non-GAAP metrics from earnings conference calls. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21387 |
DOI: | 10.6342/NTU202100175 |
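Full-text authorization: | Not authorized |
Appears in collections: | Department of Computer Science and Information Engineering |
Files in this item:
File | Size | Format | |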
---|---|---|---|
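U0001-2601202113044100.pdf (currently not authorized for public access) | 32.07 MB | Adobe PDF |
Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.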