NTU Theses and Dissertations Repository: College of Management, Department of Information Management
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83920
Title: Item Extraction for SEC 10-Q Reports (美國 SEC 10-Q 季報項目擷取)
Authors: Huan-Hsun Yen (顏煥勳)
Advisor: Hsin-Min Lu (盧信銘)
Keywords: 10-Q Report, Natural Language Processing, Item Extraction, CRF, Bi-LSTM, Attention, End-to-end
Publication Year: 2022
Degree: Master's
Abstract: With the rapid development of data analysis and the rapid growth of financial data, a growing number of researchers are focusing on text analysis of financial reports. The annual 10-K and quarterly 10-Q reports required by the United States Securities and Exchange Commission (SEC) have long been popular sources for financial text analysis. However, certain items within these reports are often studied and discussed separately. Because the quality of the extracted items affects subsequent results and analysis, accurately extracting the specific items researchers need becomes an important problem. In this study, we propose an attention-based BiLSTM-CRF model to extract items from 10-Q reports. First, we developed human annotation rules and an annotated 10-Q item extraction dataset for model training. Second, we built the attention-based BiLSTM-CRF model. It is an end-to-end model that requires neither hand-crafted features nor task-specific knowledge: it takes all the tokens of a 10-Q document as input and outputs a predicted tag for each line in sequence. Our experimental results show that the model performs well both when extracting all items and when extracting selected items.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83920
DOI: 10.6342/NTU202201181
Fulltext Rights: Not authorized
Appears in Collections: Department of Information Management

Files in This Item:
File                                              Size      Format
U0001-2806202217222000.pdf (Restricted Access)    15.04 MB  Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
