Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38073
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 楊佳玲 | |
| dc.contributor.author | Chih-Chin Chang | en |
| dc.contributor.author | 張智錦 | zh_TW |
| dc.date.accessioned | 2021-06-13T16:26:08Z | - |
| dc.date.available | 2014-08-22 | |
| dc.date.copyright | 2011-08-22 | |
| dc.date.issued | 2011 | |
| dc.date.submitted | 2011-08-20 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38073 | - |
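| dc.description.abstract | Cycle-accurate architectural simulation plays an important role in computer architecture research. It lets researchers evaluate program behavior and performance on different architectures. However, cycle-accurate software simulators are very slow: fully simulating a single-core program can take weeks to months. As multi-core processors become widespread, simulating a multi-core architecture requires simulating many processors and their surrounding components at once, which is slower still, so improving the performance of multi-core system simulators has become extremely important. Prior research has shown that programs exhibit recurring behavior, and that this behavior can be exploited to reduce simulation time. Simulation sampling is one such technique: it groups program intervals with similar behavior into phases (a phase is a set of intervals with similar behavior) and reduces simulation time by cycle-accurately simulating only a representative interval from each phase. These representative intervals are called the program's characteristic points. Traditionally, for single-threaded programs, code signatures are used to detect phases. For parallel programs, however, behavior is not determined by the executed instructions alone; interactions between threads also affect performance. In this thesis, we study simulation sampling in depth and, building on it, propose a mechanism to accelerate multi-core system simulation. We propose an architecture-independent phase detection method that records shared-resource contention between threads. Comparing program behavior measured only at the characteristic points against the results of full execution, our method effectively complements the traditional code signature approach: on three experimental machines with different architectures, the average error rate is 22% lower than that of code signatures alone, with an average error rate of 2.61%. We also developed a practical tool, AACS (Application Annotation and Characterization System). AACS supports rapid observation of program behavior, helping users understand how a program behaves on a multi-core architecture, and it automatically applies our method to generate characteristic points for a user-selected target program. These points can then be used to accelerate the simulation of multi-threaded programs on multi-core processor systems. | zh_TW |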
| dc.description.provenance | Made available in DSpace on 2021-06-13T16:26:08Z (GMT). No. of bitstreams: 1 ntu-100-R98922108-1.pdf: 1906597 bytes, checksum: 3c0618284feff9ffa243399021bcc50f (MD5) Previous issue date: 2011 | en |
| dc.description.tableofcontents | Abstract; 1 Introduction; 2 Related Works; 3 Profile-Based Simulation Sampling Techniques; 3.1 Profile-based Phase Detection Technique for Serial Applications; 3.2 Profile-based Phase Detection Technique for Multiprogramming Workloads; 3.3 Profile-based Phase Detection Technique for Multi-threaded Applications; 4 Frequency Vectors Target For Multi-Threaded Applications; 4.1 The Interactions Between Threads In Multi-Threaded Applications; 4.2 Memory Footprint Vectors (MFV); 5 An Automatic Phase Detection Toolset For Multi-Threaded Applications; 5.1 Monitoring Subsystem: Program Behavior Characterization; 5.1.1 Preparation and Sampling; 5.1.2 Visualization of the Performance Metrics; 5.2 Classification Subsystem: Automatic Phase Detection; 6 Experiments; 6.1 Experimental Setup; 6.1.1 Benchmarks; 6.1.2 Experimental Machines; 6.1.3 Performance Metrics for Evaluating Phase Detection Technique; 6.2 Experimental Results of Proposed Techniques; 6.2.1 IPC Error Rate; 7 Conclusion; Bibliography | |
| dc.language.iso | en | |
| dc.subject | 特徵驅動取樣機制 | zh_TW |
| dc.subject | 多核心模擬 | zh_TW |
| dc.subject | 取樣模擬 | zh_TW |
| dc.subject | 相位偵測 | zh_TW |
| dc.subject | simulation sampling | en |
| dc.subject | profile-based sampling | en |
| dc.subject | phase detection | en |
| dc.subject | chip multiprocessor simulation | en |
| dc.title | 多執行緒程式在多核處理器之自動化相位偵測與分類工具 | zh_TW |
| dc.title | A Toolset for Automatic Phase Detection and Characterization for Multi-threaded Applications on Chip Multiprocessors | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 99-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 洪士灝,施吉昇 | |
| dc.subject.keyword | 多核心模擬,特徵驅動取樣機制,取樣模擬,相位偵測, | zh_TW |
| dc.subject.keyword | chip multiprocessor simulation, profile-based sampling, simulation sampling, phase detection | en |
| dc.relation.page | 46 | |
| dc.rights.note | Licensed for a fee | |
| dc.date.accepted | 2011-08-20 | |
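| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW |
| Appears in Collections: | Department of Computer Science and Information Engineering | |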
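Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-100-1.pdf (not authorized for public access) | 1.86 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.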
