Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78717
Title: | An Energy-Efficient Hardware Accelerator for Pricing Two-Asset European and American Options Based on Operator Splitting Methods (基於運算子分裂法定價雙資產歐式及美式選擇權的高能效硬體加速器) |
Author: | Shih-Wei Hsieh (謝世暐) |
Advisor: | Yi-Chang Lu (盧奕璋) |
Keywords: | Black-Scholes equation, two-asset American option pricing, linear complementarity problem, finite difference, operator splitting method, hardware accelerator, FPGA, Fokker-Planck equation |
Publication year: | 2020 |
Degree: | Master's |
Abstract: | A wide range of real-world problems can be modeled by multi-dimensional partial differential equations, including derivative pricing in financial engineering. The multi-dimensional Black-Scholes equation, which describes the time evolution of a multi-asset European option price, is one well-known example. Pricing American options instead requires solving a linear complementarity problem (LCP) that involves the Black-Scholes partial differential operator. When such problems are solved with finite difference methods, the operator splitting (OS) time discretization is often applied for efficiency. As financial products grow increasingly complex, high-speed, low-power pricing approaches become more important. This thesis presents a hardware accelerator for pricing two-asset European and American options under the Black-Scholes model using the OS method. The architecture consists of a parallel solver for tridiagonal matrix systems and LCP constraints, a parallel module for matrix multiplication and data transposition, and two matrix coefficient generators. Generating coefficients on-chip instead of storing the entire matrix reduces overall memory usage. We also propose pipelining the backward substitution step of the tridiagonal solve and the matrix multiplication step to achieve higher hardware utilization and throughput. The design targets an Altera Stratix V FPGA board, with data transferred over the PCIe interface. Wordlength analysis is used to choose custom floating-point formats that balance hardware resource usage against accuracy. Compared with an optimized, multi-threaded CPU implementation running on an Intel i7-9700K, the FPGA design achieves a 4.62x to 6.10x speedup and is 32.9x to 43.4x more energy efficient. Compared with a CUDA-based GPU implementation running on an Nvidia RTX 2080 Ti, it achieves a 1.67x to 2.27x speedup and is 19.7x to 26.5x more energy efficient. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78717 |
DOI: | 10.6342/NTU202004240 |
Full-text license: | Paid license |
Electronic full-text release date: | 2023-10-21 |
Appears in collections: | Graduate Institute of Electronics Engineering |
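The abstract names two numerical building blocks: a tridiagonal system solve and an LCP constraint step. As a rough illustration only, the sketch below shows a textbook Thomas algorithm (forward elimination followed by backward substitution) and a simple projection onto the payoff. The function names `thomas_solve` and `project_onto_payoff` are hypothetical, and the plain `max(value, payoff)` projection is an assumption standing in for the constraint step; the update rule actually used in the thesis may differ.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] is the sub-diagonal entry of row i (lower[0] is unused),
    upper[i] is the super-diagonal entry of row i (upper[-1] is unused).
    Assumes no pivoting is needed (e.g. a diagonally dominant matrix).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Backward substitution
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def project_onto_payoff(values, payoff):
    """Enforce the early-exercise constraint value >= payoff pointwise.

    Hypothetical stand-in for the LCP constraint step.
    """
    return [max(v, g) for v, g in zip(values, payoff)]
```

In a two-asset finite difference scheme, a solve like this would typically be applied once per grid line along each asset direction, which is what makes the independent solves natural candidates for parallel hardware.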
Files in this item:
File | Size | Format | |
---|---|---|---|
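U0001-0610202002395100.pdf (not currently authorized for public access) | 15.29 MB | Adobe PDF |
Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.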