Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98364
Title: A Stock Movement Prediction Approach Integrating Large Language Model Sentiment Analysis and Machine Learning (整合大型語言模型情感分析與機器學習之股票漲跌預測方法)
Author: Chia-Nung Hung (洪嘉濃)
Advisor: Ruey-Shan Guo (郭瑞祥)
Keywords: Large Language Model (LLM), Sentiment Analysis, Machine Learning, Stock Movement Prediction, Data Mining
Publication year: 2025
Degree: Master's
Abstract: The stock market, as a core platform for capital flow and value exchange, is strongly affected by domestic and global economic conditions and by unexpected events, and its price fluctuations are becoming increasingly volatile. In recent years, as information has spread ever faster, traditional forecasting methods based on technical indicators and fundamental data have often struggled to reflect market sentiment and potential risk signals in a timely manner, especially during event-driven volatility. To capture the market expectations and investor sentiment embedded in unstructured news data, natural language processing (NLP) techniques have gradually been introduced into stock price prediction.

Since the release of ChatGPT by OpenAI in late 2022, large language models (LLMs) have demonstrated remarkable capabilities in semantic understanding and text generation. These models are capable not only of performing sentiment analysis, entity recognition, and summarization, but also of inferring deep semantics and market sentiment without the need for extensive labeled data. The emergence of LLMs is gradually replacing traditional NLP techniques and enabling more precise analysis of financial textual data, opening up new possibilities for event-driven trading and stock price forecasting.

This study aims to integrate LLM-based sentiment extraction with machine learning-based stock movement prediction in order to improve the model’s overall balance between precision and recall. Because Taiwan’s stock market is dominated by high-tech industries and strongly affected by global political and supply chain events, the research focuses on TSMC and other representative listed companies. We collect time-series data comprising technical indicators, stock news texts, and sentiment scores output by LLMs. In addition to obtaining sentiment scores directly from a language model, as previous studies do, this study designs a series of modularized news processing steps: GPT models perform news summarization and keyword extraction, and word embedding followed by principal component analysis (PCA) compresses the resulting semantic vectors into representative market sentiment features. The combined features are used to train several machine learning models, and an ensemble method merges the models’ predictions to improve accuracy and stability.
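
The feature-construction and ensemble steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not name the embedding model, the number of principal components, the classifiers, or the ensemble rule. The function names (`pca_compress`, `build_features`, `majority_vote`), the synthetic data, the choice of two components, and the strict-majority vote are all hypothetical, and PCA is computed here with a plain NumPy SVD.

```python
import numpy as np


def pca_compress(embeddings: np.ndarray, n_components: int) -> np.ndarray:
    """Project each row onto the top principal components of the data."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def build_features(technical, sentiment, embeddings, n_components=2):
    """Concatenate indicators, one sentiment score per day, and PCA features."""
    compressed = pca_compress(embeddings, n_components)
    return np.hstack([technical, sentiment.reshape(-1, 1), compressed])


def majority_vote(predictions: np.ndarray) -> np.ndarray:
    """Combine 0/1 predictions of shape (n_models, n_days) by strict majority."""
    return (predictions.mean(axis=0) > 0.5).astype(int)


# Synthetic stand-ins for six trading days (assumed data, not from the thesis).
rng = np.random.default_rng(0)
technical = rng.normal(size=(6, 3))       # e.g. three technical indicators
sentiment = rng.uniform(-1, 1, size=6)    # one LLM sentiment score per day
embeddings = rng.normal(size=(6, 8))      # one 8-dimensional news vector per day

features = build_features(technical, sentiment, embeddings)
print(features.shape)  # (6, 6): 3 indicators + 1 score + 2 components

# Hypothetical up (1) / down (0) calls from three already-trained models.
votes = np.array([[1, 0, 1, 1],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1]])
ensemble = majority_vote(votes)
print(ensemble)  # [1 0 1 1]
```

In a real time-series setting, the PCA directions would normally be estimated on the training period only and then applied to later days, so that no future information leaks into the features.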

Experimental results show that incorporating semantic features extracted by LLMs improves the prediction of stock price movements, particularly in closely followed industries such as semiconductors, where the model responds more sensitively to market changes. Moreover, the proposed modularized news processing pipeline provides feature interpretability at the final stage, helping to identify the factors in news texts that may drive stock price movements. The findings confirm the potential of large language models in financial semantic analysis and offer a useful reference for future intelligent investment and risk alert systems.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98364
DOI: 10.6342/NTU202501996
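Full-text authorization: Authorized (open on campus only)
Full-text release date: 2025-08-06
Appears in collections: Institute of Industrial Engineering (工業工程學研究所)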

Files in this item:
File: ntu-113-2.pdf
Access: restricted to NTU campus IP addresses (off campus, use the VPN remote-access service)
Size: 7.03 MB
Format: Adobe PDF


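Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.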
