NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96370
Full metadata record
dc.contributor.advisor (zh_TW): 王勝德
dc.contributor.advisor (en): Sheng-De Wang
dc.contributor.author (zh_TW): 陳泰佑
dc.contributor.author (en): Tai-You Chen
dc.date.accessioned: 2025-02-13T16:09:35Z
dc.date.available: 2025-02-14
dc.date.copyright: 2025-02-13
dc.date.issued: 2025
dc.date.submitted: 2025-02-08
dc.identifier.citation:
Bryan Lim and Stefan Zohren. Time-series forecasting with deep learning: a survey. Philosophical Transactions of the Royal Society A, 379(2194):20200209, 2021.
José F Torres, Dalil Hadjout, Abderrazak Sebaa, Francisco Martínez-Álvarez, and Alicia Troncoso. Deep learning for time series forecasting: a survey. Big Data, 9(1):3–21, 2021.
Ricardo P Masini, Marcelo C Medeiros, and Eduardo F Mendes. Machine learning advances for time series forecasting. Journal of Economic Surveys, 37(1):76–111, 2023.
Omer Berat Sezer, Mehmet Ugur Gudelek, and Ahmet Murat Ozbayoglu. Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Applied Soft Computing, 90:106181, 2020.
Shun Liu, Kexin Wu, Chufeng Jiang, Bin Huang, and Danqing Ma. Financial time-series forecasting: Towards synergizing performance and interpretability within a hybrid machine learning approach. arXiv preprint arXiv:2401.00534, 2023.
Shruti Kaushik, Abhinav Choudhury, Pankaj Kumar Sheron, Nataraj Dasgupta, Sayee Natarajan, Larry A Pickett, and Varun Dutt. AI in healthcare: time-series forecasting using statistical, neural, and ensemble architectures. Frontiers in Big Data, 3:4, 2020.
Pradeep Hewage, Marcello Trovati, Ella Pereira, and Ardhendu Behera. Deep learning-based effective fine-grained weather forecasting model. Pattern Analysis and Applications, 24(1):343–366, 2021.
Konstantinos Nikolopoulos, Sushil Punia, Andreas Schäfers, Christos Tsinopoulos, and Chrysovalantis Vasilakis. Forecasting and planning during a pandemic: COVID-19 growth rates, supply chain disruptions, and governmental decisions. European Journal of Operational Research, 290(1):99–115, 2021.
Qingsong Wen, Tian Zhou, Chaoli Zhang, Weiqi Chen, Ziqing Ma, Junchi Yan, and Liang Sun. Transformers in time series: A survey. arXiv preprint arXiv:2202.07125, 2022.
Neo Wu, Bradley Green, Xue Ben, and Shawn O'Banion. Deep transformer models for time series forecasting: The influenza prevalence case. arXiv preprint arXiv:2001.08317, 2020.
Ling Cai, Krzysztof Janowicz, Gengchen Mai, Bo Yan, and Rui Zhu. Traffic transformer: Capturing the continuity and periodicity of time series for traffic forecasting. Transactions in GIS, 24(3):736–755, 2020.
Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, and Wancai Zhang. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 11106–11115, 2021.
Ailing Zeng, Muxi Chen, Lei Zhang, and Qiang Xu. Are transformers effective for time series forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 11121–11128, 2023.
Yuqi Nie, Nam H Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. A time series is worth 64 words: Long-term forecasting with transformers. arXiv preprint arXiv:2211.14730, 2022.
Jacob Devlin. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
Alexey Dosovitskiy. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
Haixu Wu, Jiehui Xu, Jianmin Wang, and Mingsheng Long. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Advances in Neural Information Processing Systems, 34:22419–22430, 2021.
Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, and Rong Jin. FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting. In International Conference on Machine Learning, pages 27268–27286. PMLR, 2022.
Sehoon Kim, Sheng Shen, David Thorsley, Amir Gholami, Woosuk Kwon, Joseph Hassoun, and Kurt Keutzer. Learned token pruning for transformers. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 784–794, 2022.
Shuai Peng, Di Fu, Baole Wei, Yong Cao, Liangcai Gao, and Zhi Tang. Vote&Mix: Plug-and-play token reduction for efficient vision transformer. arXiv preprint arXiv:2408.17062, 2024.
Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Christoph Feichtenhofer, and Judy Hoffman. Token merging: Your ViT but faster. arXiv preprint arXiv:2210.09461, 2022.
Leon Götz, Marcel Kollovieh, Stephan Günnemann, and Leo Schwinn. Efficient time series processing for transformers and state-space models through token merging. arXiv preprint arXiv:2405.17951, 2024.
Ashish Vaswani. Attention is all you need. Advances in Neural Information Processing Systems, 2017.
Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, and Yunhe Wang. Transformer in transformer. Advances in Neural Information Processing Systems, 34:15908–15919, 2021.
Taesung Kim, Jinhee Kim, Yunwon Tae, Cheonbok Park, Jang-Ho Choi, and Jaegul Choo. Reversible instance normalization for accurate time-series forecasting against distribution shift. In International Conference on Learning Representations, 2021.
Shaojie Bai, J Zico Kolter, and Vladlen Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271, 2018.
Jingjing Xu, Caesar Wu, Yuan-Fang Li, and Pascal Bouvry. Transformer multivariate forecasting: Less is more? arXiv preprint arXiv:2401.00230, 2023.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96370
dc.description.abstract (zh_TW): 時間序列預測的核心在於捕捉時間軸上的依賴性與趨勢。採用較長的輸入序列,不僅能幫助模型學習數據中的長期趨勢和週期性模式,還可以使其更好地擬合數據的漂移。因此對於提升預測性能,使用長輸入序列為不可或缺的要素之一。然而,現有基於 Transformer 的模型在處理長輸入序列時,計算成本將顯著增加。為了解決此類問題,我們提出了一種高效的 Transformer 架構模型——MscTNT。該模型能以較低的計算成本處理更長的歷史數據窗口,透過將輸入序列分為大粒度切片與小粒度子切片,並結合雙層級 Transformer 編碼器堆疊的設計,增強模型的表徵能力。這種設計能夠聚合多尺度特徵,有效學習時間序列中的時序依賴性。MscTNT 的結構設計具有高度靈活性,能在預測精度與訓練及部署的時間空間成本之間實現良好的權衡。透過合理的參數設置,MscTNT 可以實現高效的 token reduction,大幅降低計算成本;而在較高計算成本的設定下,該模型亦能以低於其他模型的成本,達到接近 SOTA 的預測精度。
dc.description.abstract (en): Time series forecasting centers on capturing dependencies and trends across temporal sequences. Using longer input sequences not only enables a model to learn long-term trends and periodic patterns in the data, but also helps it adapt to distribution drift. The use of long input sequences is therefore an essential factor in improving forecasting performance. However, existing Transformer-based models incur a significant increase in computational cost when processing long input sequences. To address this issue, we propose MscTNT, an efficient Transformer-based model capable of handling extended historical windows at reduced cost. MscTNT employs a dual-level Transformer encoder stack that partitions the input sequence into coarse-grained patches and fine-grained subpatches to enhance representational capacity. This design aggregates multi-scale features and effectively captures temporal dependencies within the data. The structure of MscTNT is highly flexible, enabling a balanced trade-off between predictive accuracy and the computational cost of training and deployment. With appropriate parameter settings, MscTNT achieves efficient token reduction and substantially reduces computational expense; with higher-cost settings, the model attains near state-of-the-art (SOTA) predictive accuracy at lower computational overhead than other models. (An illustrative sketch of the multi-scale patching scheme appears after this metadata record.)
dc.description.provenance (en): Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-13T16:09:35Z. No. of bitstreams: 0
dc.description.provenance (en): Made available in DSpace on 2025-02-13T16:09:35Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
口試委員審定書 (Oral Examination Committee Certification) i
致謝 (Acknowledgements) ii
摘要 (Chinese Abstract) iii
Abstract iv
Contents v
List of Figures vii
List of Tables viii
Chapter 1 Introduction 1
Chapter 2 Related Work 5
2.1 Transformer-based Forecasting Models 5
2.2 Token Reduction 7
Chapter 3 Approach 10
3.1 Preliminaries 10
3.2 Multi-scale Transformer in Transformer 16
3.2.1 Dilemma of Single Scale Patching Strategy 17
3.2.2 Problem Definition 17
3.2.3 Channel-wise Processing 18
3.2.4 Multi-scale Patching Strategy 19
3.2.5 Projection and Position Embedding 20
3.2.6 Multi-scale Transformers Block (MscT) 21
3.2.7 Loss Function 23
Chapter 4 Experiment 24
4.1 Experiment Details 24
4.1.1 Datasets Selection and Description 24
4.1.2 Experimental Environment 26
4.1.3 Baselines 26
4.1.4 Hyperparameters Settings 27
4.2 Multivariate and Univariate Forecasting 28
4.2.1 Performance Analysis in Multivariate Forecasting 28
4.2.2 Performance Analysis in Univariate Forecasting 29
4.2.3 Additional Observations from Experimental Results 29
4.3 Model Size Growth Problem 33
4.3.1 Different Predict Sequence Length 33
4.3.2 Different Look-back Sequence Length 37
4.4 Easy Trade-off 42
4.4.1 Trade-off Using Outer Token Dimensions 43
4.4.2 Trade-off Using Patch and Subpatch Size 44
4.5 Ablation Test 48
4.5.1 Strategy Effectiveness Verification 48
4.5.2 Channel Mixing vs Channel Independent 52
Chapter 5 Conclusion 55
References 56
dc.language.iso: en
dc.title (zh_TW): 多尺度變壓器模型於長時間序列預測之應用
dc.title (en): MscTNT: Multi-Scale Transformer Model for Long Sequence Time Series Forecasting
dc.type: Thesis
dc.date.schoolyear: 113-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee (zh_TW): 李宏毅;于天立;雷欽隆;余承叡
dc.contributor.oralexamcommittee (en): Hung-Yi Lee; Tian-Li Yu; Chin-Laung Lei; Cheng-Rui Yu
dc.subject.keyword (zh_TW): 時間序列預測, 基於Transformer, Token Reduction, 訓練及部署成本, SOTA
dc.subject.keyword (en): Time Series Forecasting, Transformer-based, Token Reduction, Computational Cost, SOTA
dc.relation.page: 59
dc.identifier.doi: 10.6342/NTU202500515
dc.rights.note: 同意授權(全球公開) (Authorized for open access worldwide)
dc.date.accepted: 2025-02-09
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電機工程學系 (Department of Electrical Engineering)
dc.date.embargo-lift: 2025-02-14
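The abstract above describes channel-wise processing and a coarse-grained patch / fine-grained subpatch tokenization feeding a dual-level Transformer encoder stack (cf. Sections 3.2.3–3.2.6 in the table of contents). The following PyTorch sketch is an illustration of that general idea only, not the thesis implementation: it splits one channel of a look-back window into patches and subpatches and projects each granularity into its own token sequence. All class and variable names, the patch sizes, and the embedding dimensions are assumptions made for this example.

import torch
import torch.nn as nn


class MultiScalePatchEmbed(nn.Module):
    """Hypothetical sketch: embed one time-series channel at two granularities."""

    def __init__(self, patch_len: int, subpatch_len: int, d_outer: int, d_inner: int):
        super().__init__()
        assert patch_len % subpatch_len == 0, "subpatches must tile each patch exactly"
        self.patch_len = patch_len
        self.subpatch_len = subpatch_len
        self.outer_proj = nn.Linear(patch_len, d_outer)     # coarse (patch) tokens
        self.inner_proj = nn.Linear(subpatch_len, d_inner)  # fine (subpatch) tokens

    def forward(self, x: torch.Tensor):
        # x: (batch, seq_len) for a single channel, per the channel-wise processing idea
        patches = x.unfold(1, self.patch_len, self.patch_len)                 # (B, N, P)
        subpatches = patches.unfold(2, self.subpatch_len, self.subpatch_len)  # (B, N, M, S)
        outer_tokens = self.outer_proj(patches)     # (B, N, d_outer)
        inner_tokens = self.inner_proj(subpatches)  # (B, N, M, d_inner)
        return outer_tokens, inner_tokens


if __name__ == "__main__":
    batch, lookback = 4, 336                        # assumed sizes for illustration only
    embed = MultiScalePatchEmbed(patch_len=48, subpatch_len=8, d_outer=128, d_inner=32)
    outer, inner = embed(torch.randn(batch, lookback))
    print(outer.shape)  # torch.Size([4, 7, 128]) -> 7 coarse tokens instead of 336 steps
    print(inner.shape)  # torch.Size([4, 7, 6, 32]) -> 6 fine tokens inside each patch

With these assumed sizes, a 336-step window yields only 7 coarse tokens (each carrying 6 fine subpatch tokens), which illustrates how a patch-based, two-level tokenization can keep the attention cost well below one token per time step while still exposing fine-grained structure to the inner encoder.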
Appears in Collections: 電機工程學系 (Department of Electrical Engineering)

Files in This Item:
File: ntu-113-1.pdf; Size: 2.95 MB; Format: Adobe PDF


Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
