Application of Large Language Model and Prompt Design on Stock Market Unstructured Data to Analyze News Polarity

黃冠霖; Kuan-Lin Huang

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88885

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳永耀	zh_TW
dc.contributor.advisor	Yung-Yaw Chen	en
dc.contributor.author	黃冠霖	zh_TW
dc.contributor.author	Kuan-Lin Huang	en
dc.date.accessioned	2023-08-16T16:12:11Z	-
dc.date.available	2023-11-10	-
dc.date.copyright	2023-08-16	-
dc.date.issued	2023	-
dc.date.submitted	2023-08-08	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88885	-
dc.description.abstract	現代的金融市場中，股票交易已經是普遍的投資方式。股價會隨時間變動而有所漲跌，造成股價變化的原因有許多種，例如政治性因素、經濟性因素或公司性因素等。除了前述提到的因素之外，在網路發達的今日，新聞傳播十分快速，新聞受眾也十分廣泛，從網路取得新聞對一般投資人來說非常的方便。在閱讀新聞後，新聞中所提及的公司相關資訊，會對投資人產生影響，進而做出不同的投資決策，因此新聞對於金融市場波動影響與日俱增。新聞中不僅可以提供數值的資料，更可以提供隱含在文字裡的情緒。而近期在大型語言模型的發展上也取得了大幅的進展，大型語言模型對於文意的理解能力使得對於金融新聞做情感分析的方法可以有新的發展。本研究旨在對網路上的財經新聞進行分析，瞭解使用不同方式對新聞分析的成效。在本文中會使用基於演算法的方式抽取新聞中與特定公司有關的新聞段落對新聞做解析，另外也會使用句法分析及大型語言模型ChatGPT對新聞做分析，並比較三者之間的差異。	zh_TW
dc.description.abstract	Stock trading is a common form of investment in modern financial markets. Stock prices rise and fall over time for many reasons, including political, economic, and company-specific factors. Beyond these factors, news now spreads quickly over the Internet and reaches a wide audience, so ordinary investors can easily obtain it online. The company-related information in a news article influences the investors who read it and leads them to make different investment decisions, so the impact of news on financial market fluctuations continues to grow. News provides not only numerical data but also the sentiment implied in its text. Large language models have recently advanced substantially, and their ability to understand the meaning of text opens new approaches to sentiment analysis of financial news. This study analyzes online financial news and evaluates how effectively different methods analyze it. An algorithm-based method extracts the news paragraphs related to a specific company in order to parse each article; syntactic parsing and the large language model ChatGPT are also applied to the same task, and the three approaches are compared.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-16T16:12:11Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2023-08-16T16:12:11Z (GMT). No. of bitstreams: 0	en
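dc.description.tableofcontents	Table of Contents: Oral Examination Committee Certification i; Acknowledgements ii; Abstract (Chinese) iii; Abstract (English) iv; Table of Contents v; List of Figures vii; List of Tables viii; Chapter 1 Introduction 1; 1.1 Research Motivation 2; 1.2 Research Methods 3; 1.3 Research Structure 4; Chapter 2 Related Work 6; 2.1 Related Work on Financial Dictionaries 6; 2.2 Related Work on Parsing News with Stanford Parser 7; 2.3 Related Work on News Polarity Analysis with ChatGPT 8; Chapter 3 Methodology 9; 3.1 News Collection 9; 3.2 Manual News Annotation 11; 3.2.1 Introduction to the News Dataset 11; 3.2.2 Manual Annotation of News 12; 3.2.3 Examples of News Paragraph Annotation and Sentiment Annotation 14; 3.2.4 Five Categories by Relevance of News Content to 聯電 (UMC) 16; 3.3 Automatic News Annotation 20; 3.4 Parsing News with Stanford Parser 22; 3.5 Parsing and Polarity Analysis of News with ChatGPT 27; 3.5.1 Zero-Shot Prompting 27; 3.5.2 Few-Shot Prompting 28; 3.5.3 Chain-of-Thought Prompting 29; 3.5.4 Role Prompting 30; 3.5.5 ChatGPT API 30; 3.5.6 ChatGPT Prompt Design Process 30; 3.5.7 Extracting 聯電 (UMC)-Related Information from News 31; 3.5.8 News Filtering 32; 3.5.9 Scoring News with Prompts Designed for ChatGPT 36; 3.6 Financial Sentiment Dictionary 39; 3.7 News Sentiment Indicator 41; 3.7.1 Per-Article News Sentiment Indicator 41; Chapter 4 Experimental Results 43; 4.1 Cosine Similarity 43; 4.2 Rouge-N 44; 4.3 Analysis of News Parsing Results 45; 4.4 Analysis of News Filtering Results 49; 4.5 Analysis of News Sentiment Scoring Results 51; 4.5.1 Comparison of ChatGPT News Scores with Manual Sentiment Labels 51; 4.5.2 Comparison of Five News Polarity Analysis Methods with Manual Sentiment Labels 52; Chapter 5 Conclusion 56; References 58	-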
dc.language.iso	zh_TW	-
dc.subject	情感分析	zh_TW
dc.subject	金融情感辭典	zh_TW
dc.subject	成份句法分析	zh_TW
dc.subject	大型語言模型	zh_TW
dc.subject	文本分析	zh_TW
dc.subject	Constituency parsing	en
dc.subject	Finance sentiment dictionary	en
dc.subject	Sentiment analysis	en
dc.subject	Textual analysis	en
dc.subject	Large Language Models	en
dc.title	應用大型語言模型與提示設計於股市非結構性資料極性分析	zh_TW
dc.title	Application of Large Language Model and Prompt Design on Stock Market Unstructured Data to Analyze News Polarity	en
dc.type	Thesis	-
dc.date.schoolyear	111-2	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	張智星;葉倚任	zh_TW
dc.contributor.oralexamcommittee	Jyh-Shing Jang;Yi-Ren Yeh	en
dc.subject.keyword	金融情感辭典, 成份句法分析, 大型語言模型, 文本分析, 情感分析	zh_TW
dc.subject.keyword	Finance sentiment dictionary, Constituency parsing, Large Language Models, Textual analysis, Sentiment analysis	en
dc.relation.page	66	-
dc.identifier.doi	10.6342/NTU202303624	-
dc.rights.note	Not authorized	-
dc.date.accepted	2023-08-10	-
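dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Department of Electrical Engineering	-
Appears in Collections:	Department of Electrical Engineering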

Files in this item:

File	Size	Format
ntu-111-2.pdf (not authorized for public access)	2.15 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.