Please use this Handle URI to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90120
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 莊裕澤 | zh_TW |
dc.contributor.advisor | Yuh-Jzer Joung | en |
dc.contributor.author | 吳晨瑋 | zh_TW |
dc.contributor.author | Chen-Wei Wu | en |
dc.date.accessioned | 2023-09-22T17:29:52Z | - |
dc.date.available | 2023-11-09 | - |
dc.date.copyright | 2023-09-22 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-08-08 | - |
dc.identifier.citation | Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901.
Carlson, M. (2015). The robotic reporter: Automated journalism and the redefinition of labor, compositional forms, and journalistic authority. Digital Journalism, 3(3):416–431.
Celikyilmaz, A., Clark, E., and Gao, J. (2020). Evaluation of text generation: A survey. arXiv preprint arXiv:2006.14799.
Chang, E., Shen, X., Zhu, D., Demberg, V., and Su, H. (2021). Neural data-to-text generation with LM-based text augmentation. arXiv preprint arXiv:2102.03556.
Chen, W., Su, Y., Yan, X., and Wang, W. Y. (2020). KGPT: Knowledge-grounded pre-training for data-to-text generation. arXiv preprint arXiv:2010.02307.
Delen, D., Kuzey, C., and Uyar, A. (2013). Measuring firm performance using financial ratios: A decision tree approach. Expert Systems with Applications, 40(10):3970–3983.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Dhingra, B., Faruqui, M., Parikh, A., Chang, M.-W., Das, D., and Cohen, W. W. (2019). Handling divergent reference texts when evaluating table-to-text generation. arXiv preprint arXiv:1906.01081.
Dušek, O., Novikova, J., and Rieser, V. (2020). Evaluating the state-of-the-art of end-to-end natural language generation: The E2E NLG challenge. Computer Speech & Language, 59:123–156.
Ferreira, T. C., van der Lee, C., van Miltenburg, E., and Krahmer, E. (2019). Neural data-to-text generation: A comparison between pipeline and end-to-end architectures. arXiv preprint arXiv:1908.09022.
Gardent, C., Shimorina, A., Narayan, S., and Perez-Beltrachini, L. (2017a). Creating training corpora for NLG micro-planning. In 55th Annual Meeting of the Association for Computational Linguistics (ACL).
Gardent, C., Shimorina, A., Narayan, S., and Perez-Beltrachini, L. (2017b). The WebNLG challenge: Generating text from RDF data. In Proceedings of the 10th International Conference on Natural Language Generation, pages 124–133.
Gatt, A. and Krahmer, E. (2018). Survey of the state of the art in natural language generation: Core tasks, applications and evaluation. Journal of Artificial Intelligence Research, 61:65–170.
Geva, M., Gupta, A., and Berant, J. (2020). Injecting numerical reasoning skills into language models. arXiv preprint arXiv:2004.04487.
Gkatzia, D. (2016). Content selection in data-to-text systems: A survey. arXiv preprint arXiv:1610.08375.
Kale, M. and Rastogi, A. (2020). Text-to-text pre-training for data-to-text tasks. arXiv preprint arXiv:2005.10433.
Lebret, R., Grangier, D., and Auli, M. (2016). Neural text generation from structured data with application to the biography domain. arXiv preprint arXiv:1603.07771.
Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., and Zettlemoyer, L. (2019). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461.
Li, L. and Wan, X. (2018). Point precisely: Towards ensuring the precision of data in generated texts using delayed copy mechanism. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1044–1055.
Liang, P., Jordan, M. I., and Klein, D. (2009). Learning semantic correspondences with less supervision. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 91–99.
Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81.
Liu, T., Wang, K., Sha, L., Chang, B., and Sui, Z. (2018). Table-to-text generation by structure-aware seq2seq learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32.
Liu, Y., Gu, J., Goyal, N., Li, X., Edunov, S., Ghazvininejad, M., Lewis, M., and Zettlemoyer, L. (2020). Multilingual denoising pre-training for neural machine translation. Transactions of the Association for Computational Linguistics, 8:726–742.
Moryossef, A., Goldberg, Y., and Dagan, I. (2019). Step-by-step: Separating planning from realization in neural data-to-text generation. arXiv preprint arXiv:1904.03396.
Nesterenko, L. (2016). Building a system for stock news generation in Russian. In Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016), pages 37–40.
Novikova, J., Dušek, O., and Rieser, V. (2017). The E2E dataset: New challenges for end-to-end generation. arXiv preprint arXiv:1706.09254.
OpenAI (2022). ChatGPT.
Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002). BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318.
Parikh, A. P., Wang, X., Gehrmann, S., Faruqui, M., Dhingra, B., Yang, D., and Das, D. (2020). ToTTo: A controlled table-to-text generation dataset. arXiv preprint arXiv:2004.14373.
Plachouras, V., Smiley, C., Bretz, H., Taylor, O., Leidner, J. L., Song, D., and Schilder, F. (2016). Interacting with financial data using natural language. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1121–1124.
Puduppully, R., Dong, L., and Lapata, M. (2019). Data-to-text generation with content selection and planning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 6908–6915.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9.
Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1):5485–5551.
Rebuffel, C., Soulier, L., Scoutheeten, G., and Gallinari, P. (2020). A hierarchical model for data-to-text generation. In Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020, Proceedings, Part I, pages 65–80. Springer.
Reiter, E. (2007). An architecture for data-to-text systems. In Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG 07), pages 97–104.
Reiter, E. and Dale, R. (2000). Building Natural Language Generation Systems. Studies in Natural Language Processing.
Sai, A. B., Mohankumar, A. K., and Khapra, M. M. (2022). A survey of evaluation metrics used for NLG systems. ACM Computing Surveys (CSUR), 55(2):1–39.
Suadaa, L. H., Kamigaito, H., Funakoshi, K., Okumura, M., and Takamura, H. (2021). Towards table-to-text generation with numerical reasoning. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1451–1465.
Thawani, A., Pujara, J., and Ilievski, F. (2021). Numeracy enhances the literacy of language models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 6960–6967.
Wiseman, S., Shieber, S. M., and Rush, A. M. (2017). Challenges in data-to-document generation. arXiv preprint arXiv:1707.08052.
Xia, Y., Tian, F., Wu, L., Lin, J., Qin, T., Yu, N., and Liu, T.-Y. (2017). Deliberation networks: Sequence generation beyond one-pass decoding. Advances in Neural Information Processing Systems, 30.
Xiang, J., Liu, Z., Zhou, Y., Xing, E. P., and Hu, Z. (2022). ASDOT: Any-shot data-to-text generation with pretrained language models. arXiv preprint arXiv:2210.04325.
Xue, L., Constant, N., Roberts, A., Kale, M., Al-Rfou, R., Siddhant, A., Barua, A., and Raffel, C. (2020). mT5: A massively multilingual pre-trained text-to-text transformer. arXiv preprint arXiv:2010.11934.
Yermakov, R., Drago, N., and Ziletti, A. (2021). Biomedical data-to-text generation via fine-tuning transformers. arXiv preprint arXiv:2109.01518. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90120 | - |
dc.description.abstract | Data-to-Text Generation 研究在近年獲得越來越多關注,因其注重文本流暢及資訊正確的特性,非常適合用於新聞自動化,因此在以英文為主的自然語言研究中發展得相當蓬勃。然而在中文領域卻仍處於起步階段,原因在於 Data-to-Text 任務的來源資料會因應用領域不同而有非常大的結構變化,因此需要耗費大量人力進行文本標注,才能得到適用於特定領域的訓練資料。此外,當 Data-to-Text 任務的來源資料涉及大量數值時,利用語言模型進行理解及生成目前也存在許多挑戰。有鑑於此,本論文提出了一個 Any-Shot 的 Data-to-Text Generation 訓練方式,並透過對語言模型進行數值增益預訓練,來提升中文財經領域的 Data-to-Text 表現。
本論文在 Sequence-to-Sequence 模型中導入 mT5 系列的預訓練模型,並以台灣經濟新報資料庫(TEJ+)的中文財務報表,搭配經 ChatGPT 增益的目標文本,進行模型的訓練及驗證,成為中文財經領域 Data-to-Text 生成的首個研究。此外,更進一步透過 Two-Step Pre-Training 及 Numerical Representation Augmentation 兩種方法,提升模型理解及生成數值的表現。最後,在自動評估與人工評估上,本論文提出的輕量型模型架構在文本評估上可達到與 ChatGPT 近似的表現,並在數值精確率上超越 ChatGPT 的生成結果。 | zh_TW |
dc.description.abstract | Data-to-text generation research has attracted increasing attention in recent years because its emphasis on fluent and factually accurate text makes it well suited to news automation, and it has developed rapidly in English-centered natural language research. In the Chinese domain, however, it is still at an early stage, primarily because the structure of the source data varies greatly across application domains, so substantial human annotation effort is required to build domain-specific training data. In addition, when the source data contain large amounts of numerical information, using language models for comprehension and generation poses further challenges. Given these challenges, this thesis proposes an any-shot training approach for data-to-text generation and applies numerically augmented pre-training to language models to improve data-to-text performance in the Chinese financial domain.
This thesis introduces mT5-based pre-trained models into a sequence-to-sequence framework and trains and validates the model on Chinese financial statements from the Taiwan Economic Journal (TEJ+) database together with target texts augmented by ChatGPT, making it the first study of data-to-text generation in the Chinese financial domain. Two-step pre-training and numerical representation augmentation are further applied to improve the model's comprehension and generation of numerical values. In both automatic and human evaluation, the proposed lightweight model achieves performance comparable to ChatGPT on text quality and surpasses ChatGPT in the numerical precision of the generated text. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-22T17:29:52Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-09-22T17:29:52Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | 口試委員審定書 i
致謝 ii
摘要 iii
Abstract iv
目錄 vi
圖目錄 ix
表目錄 x
第一章 緒論 1
1.1 研究背景與動機 1
1.2 研究目的 4
1.3 論文架構 5
第二章 文獻探討 6
2.1 財經文本 7
2.2 數據轉文本生成 8
2.2.1 Pipeline架構 8
2.2.2 End-to-End架構 10
2.2.3 Pipeline架構與E2E架構的比較 11
2.2.4 Pipelined Neural架構 12
2.2.4.1 Template Generation 13
2.2.4.2 Slot Filling 13
2.2.5 基於預訓練模型的End-to-End架構 14
2.3 Data-to-Text Generation預訓練模型 15
2.4 語言模型的數值理解能力 17
2.5 數據轉文本評估方法 18
2.5.1 數據轉文本評估方法 18
2.5.2 自動評估指標 19
2.5.2.1 BLEU 19
2.5.2.2 ROUGE 19
2.5.2.3 PARENT 20
2.6 總結 21
第三章 研究方法 23
3.1 研究架構 23
3.2 FinD2T_zh資料集 25
3.2.1 財務報表資料 25
3.2.1.1 Content Augmentation 26
3.2.1.2 Content Selection 27
3.2.2 參照文本 28
3.2.3 資料集特性比較 30
3.3 模型架構 32
3.3.1 來源資料前處理 32
3.3.2 數值表示法增益 33
3.3.3 加入預訓練模型的Sequence-to-Sequence Model 34
3.4 研究驗證 35
第四章 研究結果
4.1 實驗設定 38
4.2 實驗超參數設定 39
4.3 自動評估結果 39
4.3.1 預訓練模型的影響 39
4.3.2 輸入表示法的影響 41
4.3.3 數值理解增益的影響 42
4.3.4 Numerical Precision與Numerical Error Types 43
4.4 人工評估結果 45
4.5 生成文本分析 48
4.5.1 輸入表示法差異 48
4.5.2 數值理解增益法差異 51
第五章 結論 53
5.1 研究結果 53
5.2 研究貢獻 54
5.3 研究限制 54
5.4 未來研究方向 55
參考文獻 57 | - |
dc.language.iso | zh_TW | - |
dc.title | 基於序列到序列方法的中文財經數據轉文本生成 | zh_TW |
dc.title | A Seq-to-Seq Approach for Chinese Financial Data-to-Text Generation | en |
dc.type | Thesis | - |
dc.date.schoolyear | 111-2 | - |
dc.description.degree | 碩士 | - |
dc.contributor.oralexamcommittee | 魏志平;陳文華 | zh_TW |
dc.contributor.oralexamcommittee | Chih-Ping Wei;Wun-Hwa Chen | en |
dc.subject.keyword | 數據轉文本生成,自然語言生成,中文財經文本,大型語言模型, | zh_TW |
dc.subject.keyword | Data-to-Text Generation, Natural Language Generation, Financial Context, Large Language Pre-trained Model | en |
dc.relation.page | 62 | - |
dc.identifier.doi | 10.6342/NTU202303640 | - |
dc.rights.note | 同意授權(限校園內公開) | - |
dc.date.accepted | 2023-08-10 | - |
dc.contributor.author-college | 管理學院 | - |
dc.contributor.author-dept | 資訊管理學系 | - |
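The abstract above describes the core recipe: linearize a TEJ+ financial-statement record into a source sequence, augment how the numbers in it are represented, and train an mT5 checkpoint in a sequence-to-sequence setup. Below is a minimal illustrative sketch of that pipeline, assuming Hugging Face `transformers` with the `google/mt5-small` checkpoint; the record fields, the key-value linearization format, and the digit-splitting augmentation are hypothetical stand-ins, not the thesis's published implementation.

```python
# Illustrative sketch only (not the thesis's released code): the record fields,
# the key-value linearization, and the digit-splitting "numerical representation
# augmentation" below are assumptions made for demonstration.
from transformers import MT5ForConditionalGeneration, MT5Tokenizer


def linearize_record(record: dict) -> str:
    """Flatten one financial-statement record into a key-value source string."""
    return " ; ".join(f"{key} : {value}" for key, value in record.items())


def augment_number(value: str) -> str:
    """Append a digit-separated copy of a figure so the subword tokenizer
    also sees every digit as an individual token (hypothetical scheme)."""
    digits = " ".join(ch for ch in value if ch.isdigit())
    return f"{value} ( {digits} )"


# Hypothetical single record; the real FinD2T_zh fields come from TEJ+ statements.
record = {"公司": "某上市公司", "年度": "2022", "營業收入(千元)": "2,263,891"}
source = linearize_record(
    {k: augment_number(v) if any(ch.isdigit() for ch in v) else v
     for k, v in record.items()}
)

tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# With the untuned checkpoint this only demonstrates the input/output plumbing;
# meaningful generation requires fine-tuning on linearized statements paired
# with reference texts (here, texts augmented with ChatGPT).
inputs = tokenizer(source, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In the thesis, a model of this kind is fine-tuned on pairs of linearized statements and ChatGPT-augmented reference texts, with the two-step pre-training and evaluation described in Chapters 3 and 4.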
Appears in Collections: | 資訊管理學系
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf (currently not authorized for public access) | 3.82 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.