  1. NTU Theses and Dissertations Repository
  2. College of Management
  3. Department of International Business
Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69466
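Title: 運用文字探勘分析市場消息中關鍵詞彙與臺灣股市之關聯性--以科技股、食品股為例 (literally: Using Text Mining to Analyze the Relationship between Key Terms in Market News and the Taiwanese Stock Market, with Technology and Food Stocks as Examples)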
An analysis in the relations between the key phrases and
the stock performance in Taiwanese stock market
(e.g., Technology stocks and food industry stocks)
Author: Hung-Chieh Lee (李宏杰)
Advisor: Mao-Wei Hung, Ph.D. (洪茂蔚)
Keywords: market news, key terms, semantics, data mining, price-movement labeling, stock price fluctuation, text mining, forecast, stock price, nouns, verbs
Year of publication: 2018
Degree: Master's
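Abstract (translated from the Chinese original): In today's Taiwanese stock market, fundamental analysis and technical analysis are the methods investors most commonly use to forecast market performance, and many scholars have likewise adopted these two methods to forecast stock prices. Forecasting nevertheless faces many difficulties and limitations: analysis intervals that are too short and short-term fluctuations are hard to handle; outsiders can hardly learn of a company's internal actions; corporate behavior admits more than one interpretation (for example, a capital reduction does not necessarily make up for losses and may instead mean that the company temporarily cannot find suitable investment opportunities); and many indicators raise problems of application and quantification. In practice, these two methods offer useful reference points for investment strategies or specific conditions, but their explanatory power remains limited.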
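The Taiwanese stock market has commonly been regarded as close to an inefficient market, so many investors hope to obtain first-hand information circulating in the market as quickly as possible to support their decisions at that moment; this study adopts the same assumption. However, past studies have often run into questions about the objectivity of the reference lexicon they chose, news media differ in their interpretation and choice of words, and the volume of data is very large. Forecasting stock market trends from news is therefore difficult and requires many complex factors to be taken into account.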
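To address these difficulties, many past studies have collected news items from the Internet for sampling and word segmentation analysis. This study mainly uses the Pandas toolbox to select individual stocks in the same category (for example, technology stocks: TSMC and HTC) and collects each stock's historical material-information announcements from the Market Observation Post System (公開資訊觀測站). Announcements are analyzed on a daily basis, considering that many material announcements for Taiwanese stocks are reflected at Monday's open or released on the morning of the same day. Based on when each announcement occurred (within the past three years, counting back from April 10 of this year), it is compared with the stock's historical daily closing prices and labeled according to whether the price rose, stayed flat, or fell after the announcement. The compiled text is then segmented with the CKIP Chinese word segmentation system developed by the Institute of Information Science, Academia Sinica, and the Jieba Chinese word segmentation system (the Jieba package for Python) is used for keyword detection, frequency counting, and part-of-speech filtering and matching. The goal is to find and score key terms that have a causal relationship with the price performance of stocks in that category (for technology stocks, for example: rising cost prices, plant expansion, sharp drop), in the hope of establishing another reference that helps investors observe and forecast stock price movements. In addition, this study proposes using singular value decomposition (SVD) to make causal predictions from the words in announcements and thereby build a classification lexicon in which words indicate rises or falls, in an attempt to resolve the objectivity problem of the reference lexicons used in past studies.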
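The labeling step described in this paragraph can be sketched as follows. This is a minimal illustration, not the thesis's actual code: the abstract does not state the exact comparison rule, so the rule used here (first close on or after the announcement date versus the previous trading day's close, with a hypothetical `flat_band` threshold) is an assumption, as are the function name and the sample prices. The thesis used Pandas; the sketch uses only the Python standard library to stay self-contained.

```python
from bisect import bisect_left
from datetime import date


def label_announcements(closes, announcement_dates, flat_band=0.005):
    """Label each announcement date as 'up', 'flat', or 'down'.

    closes: list of (date, closing_price) tuples sorted by date.
    Assumed rule (not from the thesis): compare the first close on or
    after the announcement date with the previous trading day's close.
    """
    days = [d for d, _ in closes]
    labels = {}
    for ann in announcement_dates:
        i = bisect_left(days, ann)  # first trading day on or after ann
        if i == 0 or i >= len(days):
            continue  # no earlier or no later close to compare against
        change = closes[i][1] / closes[i - 1][1] - 1.0
        if change > flat_band:
            labels[ann] = "up"
        elif change < -flat_band:
            labels[ann] = "down"
        else:
            labels[ann] = "flat"
    return labels


# Hypothetical closing prices; 2018-04-07 is a Saturday, so that
# announcement is matched to the next trading day (Monday, 2018-04-09).
closes = [
    (date(2018, 4, 2), 100.0),
    (date(2018, 4, 3), 102.0),
    (date(2018, 4, 9), 101.8),
    (date(2018, 4, 10), 99.0),
]
announcements = [date(2018, 4, 2), date(2018, 4, 3),
                 date(2018, 4, 7), date(2018, 4, 10)]
labels = label_announcements(closes, announcements)
```

Mapping a weekend announcement to the next trading day is one way to reflect the remark that many announcements are priced in at Monday's open. The first announcement is skipped because it has no earlier close to compare against.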
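The empirical results show that, except for HTC, for which the keywords selected with the TF-IDF method could not effectively predict the future closing price trend, keywords (no more than 10) selected from the material-information announcements published on the Market Observation Post System, including titles and detailed content, could broadly predict the closing price trend for the other three companies: TSMC, Uni-President, and Wei Chuan.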
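To make the TF-IDF keyword selection mentioned above concrete, here is a minimal sketch under stated assumptions. It assumes each announcement has already been segmented into tokens (the thesis used CKIP and Jieba for that step), and it uses the plain `tf * log(N / df)` weighting; the abstract does not say which TF-IDF variant was applied. The function names and the sample tokens, written as English glosses, are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(docs, k=10):
    """Rank terms by their tf-idf summed over all documents."""
    totals = Counter()
    for doc_scores in tfidf_scores(docs):
        totals.update(doc_scores)  # adds the float scores per term
    return [term for term, _ in totals.most_common(k)]


# Hypothetical, already-segmented announcements.
docs = [
    ["company", "expansion", "capacity", "expansion"],
    ["company", "cost", "rise"],
    ["company", "plunge", "cost"],
]
keywords = top_keywords(docs, k=5)
```

A term that appears in every announcement, such as "company" here, gets a score of zero and falls to the bottom of the ranking. Down-weighting such uninformative terms is the reason TF-IDF is often preferred to raw frequency for keyword selection.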
Fundamental and technical analysis are the techniques investors most often use to forecast performance in the Taiwanese stock market today. Many scholars have also relied on these two approaches to forecast stock prices, but both face considerable difficulties and restrictions. For example, analysis intervals that are too short and short-term fluctuations are hard to handle, and a company's internal decisions are difficult to observe from outside. There are also problems in applying and quantifying many indicators. In practice, both kinds of analysis offer useful reference points for forming an investment strategy or handling specific conditions, but their power to explain market trends remains limited.
The Taiwanese stock market is often considered a nearly inefficient market, so many investors try to obtain first-hand information as quickly as possible to support their decisions at that moment. This research adopts the same assumption. However, forecasting the market from news is difficult because of limited time and several other issues, such as the very large volume of data and differences in how sources choose their wording. In short, many complicated factors need to be considered.
To address these problems, many scholars have applied text mining to news items sampled from websites in order to analyze stock market performance. This research mainly uses data mining, with the Pandas toolbox (Python 3.6), to select stocks in the same category; for example, TSMC and HTC both belong to technology stocks. We use the Taiwanese Market Observation Post System to collect daily historical announcements over a three-year period and compare them with the corresponding daily stock prices. We then mark selected nouns and verbs as positive or negative to support the later analysis. The CKIP and Jieba Chinese word segmentation systems are used in this research. By finding and scoring key words, we aim to improve this way of forecasting stock prices and help investors understand more about text mining in stock price forecasting.
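One simple way to mark terms as positive or negative, once announcements have been segmented and labeled, is sketched below. It is an illustrative assumption rather than the scoring used in the thesis: each term's score is the difference between the number of "up" and "down" announcements containing it, divided by their sum, and the `min_count` cutoff, the function name, and the sample data are hypothetical. The score measures association only and does not by itself establish causality.

```python
from collections import Counter


def score_terms(labeled_docs, min_count=2):
    """Score each term in [-1, 1] from labeled, tokenized announcements.

    labeled_docs: iterable of (tokens, label) pairs, where label is
    'up', 'flat', or 'down'. 'flat' announcements are ignored.
    """
    up, down = Counter(), Counter()
    for tokens, label in labeled_docs:
        unique = set(tokens)  # count each term once per announcement
        if label == "up":
            up.update(unique)
        elif label == "down":
            down.update(unique)
    scores = {}
    for term in set(up) | set(down):
        total = up[term] + down[term]
        if total >= min_count:  # skip terms seen too rarely to score
            scores[term] = (up[term] - down[term]) / total
    return scores


# Hypothetical labeled announcements (tokens as English glosses).
labeled = [
    (["plant", "expansion", "capacity"], "up"),
    (["expansion", "order"], "up"),
    (["cost", "increase"], "down"),
    (["cost", "plunge", "order"], "down"),
    (["meeting"], "flat"),
]
scores = score_terms(labeled)
```

In this toy data "expansion" scores 1.0, "cost" scores -1.0, and "order" scores 0.0 because it appears once with each label. Restricting the tokens to nouns and verbs, as the abstract describes, would be done during segmentation with a part-of-speech tagger.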
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69466
DOI: 10.6342/NTU201801080
Full-text authorization: Paid license
Appears in collections: Department of International Business

Files in this item:
File: ntu-107-1.pdf (not authorized for public access)
Size: 7.68 MB
Format: Adobe PDF


Unless their copyright terms are specifically indicated, all items in this system are protected by copyright, with all rights reserved.
