NTU Theses and Dissertations Repository › 管理學院 › 資訊管理學系
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97880

Full metadata record (DC Field: Value (Language))
dc.contributor.advisor: 盧信銘 (zh_TW)
dc.contributor.advisor: Hsin-Min Lu (en)
dc.contributor.author: 李宜蓁 (zh_TW)
dc.contributor.author: Yi-Jhen Li (en)
dc.date.accessioned: 2025-07-21T16:06:57Z
dc.date.available: 2025-07-22
dc.date.copyright: 2025-07-21
dc.date.issued: 2025
dc.date.submitted: 2025-07-17
dc.identifier.citation:
Al-Okaily, M., Alkayed, H., & Al-Okaily, A. (2024). Does XBRL adoption increase financial information transparency in digital disclosure environment? Insights from emerging markets. International Journal of Information Management Data Insights, 4(1), 100228. https://doi.org/10.1016/j.jjimei.2024.100228
Amin, K., Eshleman, J. D., & Feng, C. (2018). The Effect of the SEC's XBRL Mandate on Audit Report Lags. Accounting Horizons, 32(1), 1-27.
Bartley, J., Chen, A. Y. S., & Taylor, E. Z. (2011). A Comparison of XBRL Filings to Corporate 10-Ks—Evidence from the Voluntary Filing Program. Accounting Horizons, 25(2), 227-245. https://doi.org/10.2308/acch-10028
Basoglu, K. A., & White, C. E. (2015). Inline XBRL versus XBRL for SEC reporting. Journal of Emerging Technologies in Accounting, 12(1), 189-199.
Blankespoor, E. (2019). The Impact of Information Processing Costs on Firm Disclosure Choice: Evidence from the XBRL Mandate. Journal of Accounting Research, 57(4), 919-967. https://doi.org/10.1111/1475-679X.12268
Caruana, R. (1997). Multitask Learning. Machine Learning, 28(1), 41-75. https://doi.org/10.1023/A:1007379606734
Chang, H. W., Kaszak, S., Kipp, P., & Robertson, J. C. (2021). The Effect of iXBRL Formatted Financial Statements on the Effectiveness of Managers' Decisions When Making Inter-Firm Comparisons. Journal of Information Systems, 35(2), 149-177.
Chouhan, V., & Goswami, S. (2015). XBRL acceptance in India: A behavioral study. American Journal of Trade and Policy, 2(2), 71-78.
Chowdhuri, R., Yoon, V. Y., Redmond, R. T., & Etudo, U. O. (2014). Ontology based integration of XBRL filings for financial decision making. Decision Support Systems, 68, 64-76. https://doi.org/10.1016/j.dss.2014.09.004
Chung, H. W., Hou, L., Longpre, S., Zoph, B., Tay, Y., Fedus, W., Li, Y., Wang, X., Dehghani, M., & Brahma, S. (2024). Scaling instruction-finetuned language models. Journal of Machine Learning Research, 25(70), 1-53.
Cui, Y., Jia, M., Lin, T.-Y., Song, Y., & Belongie, S. (2019). Class-Balanced Loss Based on Effective Number of Samples. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019, June). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In J. Burstein, C. Doran, & T. Solorio, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) Minneapolis, Minnesota.
Du, H., Vasarhelyi, M. A., & Zheng, X. (2013). XBRL Mandate: Thousands of Filing Errors and So What? Journal of Information Systems, 27(1), 61-78. https://doi.org/10.2308/isys-50399
ESMA. (n.d.). Electronic Reporting. European Securities and Markets Authority. https://www.esma.europa.eu/issuer-disclosure/electronic-reporting
Fourny, G. (2023). The XBRL Book: Simple, Precise, Technical.
FSA. (2008). FSA launches new electronic corporate disclosure system (EDINET). Financial Services Agency. https://www.fsa.go.jp/en/news/2008/20080317.html
FSA. (2013). Publication of the next generation EDINET taxonomy. Financial Services Agency. https://www.fsa.go.jp/search/20130821.html
Gui, L., Leng, J., Zhou, J., Xu, R., & He, Y. (2022). Multi Task Mutual Learning for Joint Sentiment Classification and Topic Detection. IEEE Transactions on Knowledge and Data Engineering, 34(4), 1915-1927. https://doi.org/10.1109/TKDE.2020.2999489
Gupta, S., Chandel, A., & Bhalla, L. (2023). XBRL adoption and information asymmetry: Evidence from the Indian capital market. Indian Journal of Finance, 17(7), 25-36.
Han, S., Kang, H., Jin, B., Liu, X.-Y., & Yang, S. Y. (2024). XBRL Agent: Leveraging Large Language Models for Financial Report Analysis. Proceedings of the 5th ACM International Conference on AI in Finance, Brooklyn, NY, USA. https://doi.org/10.1145/3677052.3698614
HMRC. (2020). XBRL guide for businesses. HM Revenue & Customs. https://www.gov.uk/government/publications/xbrl-guide-for-uk-businesses/xbrl-guide-for-uk-businesses
Hodge, F. D., Kennedy, J. J., & Maines, L. A. (2004). Does Search-Facilitating Technology Improve the Transparency of Financial Reporting? The Accounting Review, 79(3), 687-703.
Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2022). LoRA: Low-Rank Adaptation of Large Language Models. International Conference on Learning Representations, 1(2), 3.
Huang, A. H., Wang, H., & Yang, Y. (2023). FinBERT: A large language model for extracting information from financial text. Contemporary Accounting Research, 40(2), 806-841.
Janvrin, D. J., & No, W. G. (2012). XBRL Implementation: A Field Investigation to Identify Research Opportunities. Journal of Information Systems, 26(1), 169-197. https://doi.org/10.2308/isys-10252
Kapil, V., & Martin, D. (2024). Digital Financial Reporting and AI: Is Automatic XBRL Tagging Feasible Using AI and LLM systems. UBPartner. https://www.ubpartner.com/digital-financial-reporting-and-ai/
Khatuya, S., Mukherjee, R., Ghosh, A., Hegde, M., Dasgupta, K., Ganguly, N., Ghosh, S., & Goyal, P. (2024, June). Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling. In K. Duh, H. Gomez, & S. Bethard, Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers) Mexico City, Mexico.
Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2020). Focal Loss for Dense Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(2), 318-327. https://doi.org/10.1109/TPAMI.2018.2858826
Liu, C., Luo, X., Sia, C. L., O'Farrell, G., & Teo, H. H. (2014). The impact of XBRL adoption in PR China. Decision Support Systems, 59, 242-249. https://doi.org/10.1016/j.dss.2013.12.003
Liu, C., Luo, X., & Wang, F. L. (2017). An empirical investigation on the impact of XBRL adoption on information asymmetry: Evidence from Europe. Decision Support Systems, 93, 42-50. https://doi.org/10.1016/j.dss.2016.09.004
Liu, C., Wang, T., & Yao, L. J. (2014). XBRL’s impact on analyst forecast behavior: An empirical study. Journal of Accounting and Public Policy, 33(1), 69-82. https://doi.org/10.1016/j.jaccpubpol.2013.10.004
Loukas, L., Fergadiotis, M., Chalkidis, I., Spyropoulou, E., Malakasiotis, P., Androutsopoulos, I., & Paliouras, G. (2022, May). FiNER: Financial Numeric Entity Recognition for XBRL Tagging. In S. Muresan, P. Nakov, & A. Villavicencio, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) Dublin, Ireland.
Luo, X., Wang, T., Yang, L., Zhao, X., & Zhang, Y. (2023). Initial evidence on the market impact of the iXBRL adoption. Accounting Horizons, 37(1), 143-171.
Mukhoti, J., Kulharia, V., Sanyal, A., Golodetz, S., Torr, P., & Dokania, P. (2020). Calibrating deep neural networks using focal loss. Advances in neural information processing systems, 33, 15288-15299.
Ni, J., Hernandez Abrego, G., Constant, N., Ma, J., Hall, K., Cer, D., & Yang, Y. (2022, May). Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models. In S. Muresan, P. Nakov, & A. Villavicencio, Findings of the Association for Computational Linguistics: ACL 2022 Dublin, Ireland.
OpenAI, Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., & Anadkat, S. (2023). GPT-4 technical report. arXiv preprint arXiv:2303.08774.
Papineni, K., Roukos, S., Ward, T., & Zhu, W.-J. (2002, July). Bleu: a Method for Automatic Evaluation of Machine Translation. In P. Isabelle, E. Charniak, & D. Lin, Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics Philadelphia, Pennsylvania, USA.
Ramanan, R. (2024a). How well do AI models like GPT-4 understand XBRL Data? XBRL International Inc. https://www.xbrl.org/how-well-do-ai-models-like-gpt-4-understand-xbrl-data/
Ramanan, R. (2024b). Narrative disclosure analysis with GPT-4. XBRL International Inc. https://www.xbrl.org/narrative-disclosure-analysis-with-gpt-4/
Ramanan, R. (2025). Leveraging LLMs for Smarter Taxonomy Interactions. XBRL International Inc. https://www.xbrl.org/leveraging-llms-for-smarter-taxonomy-interactions/
Ruder, S. (2017). An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098.
Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681. https://doi.org/10.1109/78.650093
SEC. (2008). Interactive Data to Improve Financial Reporting. Securities and Exchange Commission. https://www.sec.gov/files/rules/final/2009/33-9002.pdf
SEC. (2018). SEC Adopts Inline XBRL for Tagged Data. Securities and Exchange Commission. https://www.sec.gov/newsroom/press-releases/2018-117
SEC. (2025). U.S. GAAP – XBRL Custom Tags Trend for 2022 - 2024. SEC. https://www.sec.gov/data-research/structured-data/gaap-xbrl-custom-tags
Sharma, S., Khatuya, S., Hegde, M., Shaikh, A., Dasgupta, K., Goyal, P., & Ganguly, N. (2023, July). Financial Numeric Extreme Labelling: A dataset and benchmarking. In A. Rogers, J. Boyd-Graber, & N. Okazaki, Findings of the Association for Computational Linguistics: ACL 2023 Toronto, Canada.
Taiwan Stock Exchange Corporation. (n.d.). The eXtensible Business Reporting Language (XBRL). Taiwan Stock Exchange Corporation. https://www.twse.com.tw/zh/listed/xbrl.html
Tawiah, V., & Borgi, H. (2022). Impact of XBRL adoption on financial reporting quality: a global evidence. Accounting Research Journal, 35(6), 815-833.
XBRL International. (n.d.). Release History: XBRL - XBRL Specification. XBRL International. https://specifications.xbrl.org/release-history-base-spec-xbrl-2.1.html
XBRL International Inc. (2013). XBRL 2.1. XBRL International Inc. https://specifications.xbrl.org/work-product-index-group-base-spec-base-spec.html
XBRL International Inc. (2024). AI and XBRL: automatic tagging? XBRL International Inc. https://www.xbrl.org/news/ai-and-xbrl-automatic-tagging/
XBRL International Inc. (2013). Inline XBRL 1.1. XBRL International Inc. https://specifications.xbrl.org/work-product-index-inline-xbrl-inline-xbrl-1.1.html
XBRL International Inc. (2019). iXBRL Tagging Features. XBRL International Inc. https://www.xbrl.org/guidance/ixbrl-tagging-features/
XBRL US. (2020). XBRL Taxonomy Development Handbook. XBRL US. https://xbrl.us/xbrl-reference/tdh/
Yang, Y., Uy, M. C. S., & Huang, A. (2020). FinBERT: A pretrained language model for financial communications.
Yoon, H., Zo, H., & Ciganek, A. P. (2011). Does XBRL adoption reduce information asymmetry? Journal of Business Research, 64(2), 157-163. https://doi.org/10.1016/j.jbusres.2010.01.008
You, R., Zhang, Z., Wang, Z., Dai, S., Mamitsuka, H., & Zhu, S. (2019). Attentionxml: Label tree-based attention-aware deep model for high-performance extreme multi-label text classification. Advances in neural information processing systems, 32.
Yu, J., Marujo, L., Jiang, J., Karuturi, P., & Brendel, W. (2018, October). Improving Multi-label Emotion Classification via Sentiment Classification with Dual Attention Transfer Network. In E. Riloff, D. Chiang, J. Hockenmaier, & J. Tsujii, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium.
Zhang, Y., & Shan, Y. (2024). Does the Adoption of iXBRL Improve Data Usability? Evidence from Future Earnings Response Coefficients (FERCs).
Zhang, Y., & Yang, Q. (2022). A Survey on Multi-Task Learning. IEEE Transactions on Knowledge and Data Engineering, 34(12), 5586-5609. https://doi.org/10.1109/TKDE.2021.3070203
Zhang, Z., Yu, W., Yu, M., Guo, Z., & Jiang, M. (2023, May). A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods. In A. Vlachos & I. Augenstein, Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics Dubrovnik, Croatia.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97880
dc.description.abstract (zh_TW): 隨著 inline XBRL (iXBRL) 在財務報告中被廣泛與強制使用,能夠提升效率、一致性及法規遵循的自動化標記需求日益增加。人工標記不僅耗時費力,且容易出錯。然而,現有資料集在屬性涵蓋範圍及文本資訊上仍有不足,限制了模型效能及其在真實應用場景中的適用性。為解決上述問題,本研究從美國證券交易委員會(SEC)申報文件中建構出一套大規模 iXBRL 標記資料集,包含 660 萬筆句子。每筆資料包含多個屬性,如標籤名稱(tag name)、財務數值(fact value)、時間(time)等,並附有財務報表資訊及句子上下文資訊。本研究亦將標記任務重新定義為多屬性預測問題,以更真實地反映實際財務標記的複雜性。我們提出三階段任務設計,提升資料集在不同模型與應用場景中的靈活性與可用性。本研究比較兩種方法:基於 BERT 架構的多任務學習(multi-task learning, MTL)模型,以及大型語言模型(LLMs)進行的 few-shot 提示學習(prompting)。MTL 模型在多個屬性上展現優異的預測能力,於標籤、時間、數量級與正負屬性的加權 F1 分數分別達到 0.82、0.90 及 0.99。相較之下,few-shot LLMs 表現則相對較差。這一結果凸顯了大型語言模型與 few-shot 提示方法在財務領域的限制。此外,本研究所建資料集亦能協助發現過往申報文件中的標記不一致或潛在錯誤,展現其作為訓練資源與稽核工具的價值。從管理實務觀點而言,本研究提出的資料集與自動標記框架可大幅降低人工工作量,提升財務揭露的可靠性,並促進後續應用,例如對股東會資料、財報摘要,或在 XBRL 強制採用前發布之歷史財報的結構化分析。
dc.description.abstract (en): The growing adoption of inline XBRL (iXBRL) in financial reporting has increased the need for automation to improve efficiency, consistency, and compliance. Manual tagging is often labor-intensive and error-prone, especially in corporate settings where financial filings must meet strict regulatory standards. While automated iXBRL tagging has shown promise, existing datasets lack comprehensive attribute coverage and contextual information, limiting model performance and real-world applicability. To address these limitations, this study constructs a large-scale iXBRL tagging dataset containing 6.6 million sentences from SEC filings. Each instance is annotated with multiple attributes, such as tag name, fact value, and time, as well as document metadata and surrounding sentence context. We reformulate the tagging task as a multi-attribute prediction problem, which better reflects the complexity of real-world financial reporting. A three-stage task design is proposed to improve the flexibility and usability of the dataset for various modeling approaches and applications. To establish performance baselines, we evaluate two methods: a multi-task learning (MTL) model based on a BERT architecture and few-shot prompting using large language models (LLMs). The MTL model demonstrates strong predictive capabilities across attributes, achieving weighted F1 scores of 0.82 for tag, 0.90 for time, and 0.99 for both scale and sign. By contrast, few-shot LLMs perform relatively worse. These findings reveal the current limitations of prompt-based approaches and highlight opportunities for future improvement through domain-adaptive pretraining and advanced prompting strategies. Additionally, the dataset helped uncover annotation inconsistencies and potential errors in previous filings, highlighting its value as both a training resource and an auditing aid.
From a managerial perspective, the proposed dataset and automated tagging framework can significantly reduce human workload, enhance the reliability of financial disclosures, and enable downstream applications, such as financial analysis of shareholder meeting materials or financial reports published before the mandatory XBRL adoption.
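As an aside, the per-attribute weighted F1 reported above is the support-weighted average of per-class F1 scores. A minimal self-contained sketch, using hypothetical sign-attribute predictions (not data from the thesis):

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Weighted F1: per-class F1 averaged with weights equal to class support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += (n / total) * f1
    return score

# Hypothetical true vs. predicted sign labels for four facts
truth = ["positive", "positive", "negative", "positive"]
preds = ["positive", "negative", "negative", "positive"]
print(round(weighted_f1(truth, preds), 3))  # → 0.767
```

Weighting by support means the score is dominated by frequent classes, which matters for skewed attributes such as sign, where one class covers most facts.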
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-07-21T16:06:57Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2025-07-21T16:06:57Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
誌謝 (Acknowledgements) i
中文摘要 (Chinese Abstract) ii
ABSTRACT iii
TABLE OF CONTENTS v
LIST OF FIGURES ix
LIST OF TABLES x
Chapter 1 Introduction 1
Chapter 2 Literature Review 5
2.1 XBRL 5
2.1.1 Overview of XBRL 5
2.1.2 XBRL Structure and Component 6
2.1.2.1 Reported Facts 6
2.1.2.2 XBRL Instance Document 7
2.1.2.3 Concepts 8
2.1.2.4 XBRL Taxonomy 9
2.1.3 XBRL Adoption 10
2.1.3.1 Global Adoption of XBRL 10
2.1.3.2 Advantages of XBRL Adoption 11
2.1.3.3 Challenge of XBRL Adoption 12
2.2 Inline XBRL 12
2.2.1 Overview of iXBRL 13
2.2.2 iXBRL Structure and Tagging 13
2.2.2.1 iXBRL Report Structure 13
2.2.2.2 iXBRL Tagging 15
2.2.2.2.1 Format 16
2.2.2.2.2 Scale 16
2.2.2.2.3 Sign 16
2.2.2.2.4 Textual data 17
2.2.3 iXBRL Adoption 18
2.2.3.1 Global Adoption of iXBRL 18
2.2.3.2 Advantages of iXBRL Adoption 19
2.3 NLP, LLM, and XBRL 20
2.3.1 LLM-based Analysis of XBRL 20
2.3.2 Automated Tagging in XBRL 21
2.4 Multi-task Learning 22
2.5 Research Gap 23
Chapter 3 Methodology 27
3.1 Dataset 27
3.1.1 Dataset Overview 27
3.1.2 Dataset Construction Process 28
3.1.3 Data Structure 31
3.2 Task Definition 34
3.2.1 Stage 1 35
3.2.2 Stage 2 36
3.2.3 Stage 3 36
3.3 Multi-task Learning Model 36
3.3.1 Backbone model 37
3.3.2 Task-specific Layer 38
3.3.3 Loss Function 38
Chapter 4 Experiment 41
4.1 Dataset Summary Statistics 41
4.1.1 Overall Summary Statistics 41
4.1.2 Attribute Summary Statistics 43
4.1.2.1 Tag 43
4.1.2.2 Fact 44
4.1.2.3 Time 46
4.1.2.4 Measure 47
4.1.2.5 Scale 48
4.1.2.6 Decimals 49
4.1.2.7 Axis and Member 50
4.1.3 Textual Context Summary Statistics 53
4.2 Experiment Implementation 55
4.2.1 Multi-task Learning Model 55
4.2.1.1 Data Preparation 55
4.2.1.2 Model Architecture 56
4.2.1.3 Loss Functions 58
4.2.2 Few-shot Learning using LLMs 58
4.2.3 Evaluation Metrics 59
4.3 Experiment Result 60
4.3.1 Result of Multi-task Learning Model 60
4.3.2 Result of Few-shot Learning 61
4.4 Error Analysis 63
4.4.1 Tag 63
4.4.2 Time 65
4.4.3 Scale 67
4.4.4 Negative 70
4.5 Managerial Implications 71
Chapter 5 Conclusion 74
REFERENCE 76
APPENDIX 84
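The tagging attributes enumerated in §2.2.2.2 of the table of contents (format, scale, sign) are carried on iXBRL fact elements. A minimal sketch of reading them with Python's stdlib XML parser, on a hypothetical fragment (real filings apply `format` transformation rules rather than the bare comma-stripping shown here):

```python
import xml.etree.ElementTree as ET

IX = "http://www.xbrl.org/2013/inlineXBRL"

# Hypothetical iXBRL fragment: values reported in thousands; cost tagged as negated.
FRAGMENT = f"""
<body xmlns:ix="{IX}">
  <ix:nonFraction name="us-gaap:Revenues" contextRef="FY2024"
                  unitRef="usd" scale="3" decimals="-3">1,234</ix:nonFraction>
  <ix:nonFraction name="us-gaap:CostOfRevenue" contextRef="FY2024"
                  unitRef="usd" scale="3" sign="-">567</ix:nonFraction>
</body>
"""

def fact_value(el):
    """Numeric value of an ix:nonFraction: strip commas, apply scale and sign."""
    raw = float(el.text.replace(",", ""))      # simplification of format transforms
    raw *= 10 ** int(el.get("scale", "0"))     # scale: power-of-ten multiplier
    if el.get("sign") == "-":                  # sign: displayed positive, fact negative
        raw = -raw
    return el.get("name"), raw

root = ET.fromstring(FRAGMENT)
facts = dict(fact_value(el) for el in root.iter(f"{{{IX}}}nonFraction"))
print(facts)  # → {'us-gaap:Revenues': 1234000.0, 'us-gaap:CostOfRevenue': -567000.0}
```

Because the rendered text shows "567" while the tagged fact is -567,000, predicting scale and sign from surrounding text is a non-trivial part of the multi-attribute task the thesis describes.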
dc.language.iso: en
dc.subject: XBRL (zh_TW)
dc.subject: 多任務學習 (zh_TW)
dc.subject: 屬性預測 (zh_TW)
dc.subject: 財務資料集 (zh_TW)
dc.subject: iXBRL (zh_TW)
dc.subject: XBRL (en)
dc.subject: iXBRL (en)
dc.subject: Financial Dataset (en)
dc.subject: Attribute Prediction (en)
dc.subject: Multi-task Learning (en)
dc.title: 金融數值實體理解:任務、資料集與方法 (zh_TW)
dc.title: Financial Numerical Entity Understanding: Novel Tasks, Datasets, and Approaches (en)
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 顏如君; 謝昇峯 (zh_TW)
dc.contributor.oralexamcommittee: Ju-Chun Yen; Sheng-Feng Hsieh (en)
dc.subject.keyword: XBRL, iXBRL, 財務資料集, 屬性預測, 多任務學習 (zh_TW)
dc.subject.keyword: XBRL, iXBRL, Financial Dataset, Attribute Prediction, Multi-task Learning (en)
dc.relation.page: 86
dc.identifier.doi: 10.6342/NTU202501600
dc.rights.note: 同意授權(全球公開) (Authorized; open access worldwide)
dc.date.accepted: 2025-07-18
dc.contributor.author-college: 管理學院 (College of Management)
dc.contributor.author-dept: 資訊管理學系 (Department of Information Management)
dc.date.embargo-lift: 2025-07-22
Appears in Collections: 資訊管理學系 (Department of Information Management)

Files in This Item:
File: ntu-113-2.pdf | Size: 3.81 MB | Format: Adobe PDF

