Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 資訊工程學系
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8582
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor劉邦鋒
dc.contributor.authorChing-Chi Linen
dc.contributor.author林敬棋zh_TW
dc.date.accessioned2021-05-20T19:58:37Z-
dc.date.available2012-07-20
dc.date.available2021-05-20T19:58:37Z-
dc.date.copyright2010-07-20
dc.date.issued2010
dc.date.submitted2010-07-12
dc.identifier.citation[1] V. Bala, E. Duesterwald, and S. Banerjia. Dynamo: a transparent dynamic optimization system. SIGPLAN Not., 35(5):1–12, 2000.
[2] D. Bruening, T. Garnett, and S. Amarasinghe. An infrastructure for adaptive dynamic optimization. In CGO ’03: Proceedings of the international symposium on Code generation and optimization, pages 265–275, Washington, DC, USA, 2003. IEEE Computer Society.
[3] J. Lu, H. Chen, P.C. Yew, and W.C. Hsu. Design and implementation of a lightweight dynamic optimization system. Journal of Instruction-Level Parallelism, 6, April 2004.
[4] W.K Chen, S. Lerner, R. Chaiken, and D.M. Gillies. Mojo: A dynamic optimization system. In Proceedings of the 3rd ACM Workshop on Feedback-Directed and Dynamic Optimization, 2000.
[5] C. Pereira, J. Lau, B. Calder, and R. Gupta. Dynamic phase analysis for cycle-close trace generation. In CODES+ISSS ’05: Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, pages 321–326, New York, NY, USA, 2005. ACM.
[6] A.S. Dhodapkar and J.E. Smith. Managing multi-configuration hardware via dynamic working set analysis. In ISCA ’02: Proceedings of the 29th annual international symposium on Computer architecture, pages 233–244, Washington, DC, USA, 2002. IEEE Computer Society.
[7] T. Kistler and M. Franz. Continuous program optimization: A case study. ACM Trans. Program. Lang. Syst., 25(4):500–548, 2003.
[8] G. Hamerly, E. Perelman, and B. Calderd. How to use simpoint to pick simulation points. SIGMETRICS Perform. Eval. Rev., 31(4):25–30, 2004.
[9] A. Das, J. Lu, and W.C. Hsu. Region monitoring for local phase detection in dynamic optimization systems. In CGO ’06: Proceedings of the International Symposium on Code Generation and Optimization, pages 124–134, Washington, DC, USA, 2006. IEEE Computer Society.
[10] P. Nagpurkar, C. Krintz, M. Hind, P.F. Sweeney, and V.T. Rajan. Online phase detection algorithms. In CGO ’06: Proceedings of the International Symposium on Code Generation and Optimization, pages 111–123, Washington, DC, USA, 2006. IEEE Computer Society.
[11] G. Fursin, A. Cohen, M. O’Boyle, and O. Temam. A practical method for quickly evaluating program optimizations. In Proceedings of the 1st International Conference on High Performance Embedded Architectures and Compilers, pages 29–46, November 2005.
[12] J. Lau,M. Arnold, M. Hind, and B. Calder. Online performance auditing: using hot optimizations without getting burned. In PLDI ’06: Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation, pages 239–251, New York, NY, USA, 2006. ACM.
[13] C. Dubach, J. Cavazos, B. Franke, G. Fursin, M.F.P. O’Boyle, and O. Temam. Fast compiler optimisation evaluation using code-feature based performance prediction. In CF ’07: Proceedings of the 4th international conference on Computing frontiers, pages 131–142, New York, NY, USA, 2007. ACM.
[14] M. Annavaram, R. Rakvic, M. Polito, J.Y. Bouguet, R.A. Hankins, and B. Davies. The fuzzy correlation between code and performance predictability. In Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, pages 93–104. IEEE Computer Society, 2004.
[15] J. Lau, J. Sampson, E. Perelman, G. Hamerly, and B. Calder. The strong correlation between code signatures and performance. In ISPASS ’05: Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005, pages 236–247, Washington, DC, USA, 2005. IEEE Computer Society.
[16] Y. Jiang, E.Z. Zhang, K. Tian, F.Mao, M. Gethers, X. Shen, and T. Gao. Exploiting statistical correlations for proactive prediction of program behaviors. In CGO ’10: Proceedings of the 8th annual IEEE/ACMinternational symposium on Code generation and optimization, pages 248–256, New York, NY, USA, 2010. ACM.
[17] T. Sherwood, E. Perelman, and B. Calder. Basic block distribution analysis to find periodic behavior and simulation points in applications. In PACT ’01: Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques, pages 3–14, Washington, DC, USA, 2001. IEEE Computer Society.
[18] T. Sherwood, E. Perelman, G. Hamerly, and B. Calder. Automatically characterizing large scale program behavior. In Proceedings of the 10th international conference on Architectural support for programming languages and operating systems, pages 45–57, New York, NY, USA, 2002. ACM.
[19] T. Sherwood, E. Perelman, G. Hamerly, S. Sair, and B. Calder. Discovering and exploiting program phases. IEEE Micro, 23(6):84–93, 2003.
[20] T. Sherwood, S. Sair, and B. Calder. Phase tracking and prediction. In ISCA ’03: Proceedings of the 30th annual international symposium on Computer architecture, pages 336–349, New York, NY, USA, 2003. ACM.
[21] M.C. Merten, A.R. Trick, C.N. George, J.C. Gyllenhaal, and W.W. Hwu. A hardware-driven profiling scheme for identifying program hot spots to support runtime optimization. In ISCA ’99: Proceedings of the 26th annual international symposium on Computer architecture, pages 136–147, Washington, DC, USA, 1999. IEEE Computer Society.
[22] B. Davies, J. Bouguet, M. Polito, and M. Annavaram. iPART: An Automated Phase Analysis and Recognition Tool. Technical Report IR-TR-2004-1-iPART, Intel Corporation, February 2004. ftp://download.intel.com/research/library/IR-TR-2004-1-iPART.pdf.
[23] E. Perelman, M. Polito, J.Y. Bouguet, J. Sampson, B. Calder, and C. Dulong. Detecting phases in parallel applications on shared memory architectures. In IPDPS ’06: Proceedings of the 20th International Parallel and Distributed Processing Symposium, April 2006.
[24] G. Hamerly and C. Elkan. Alternatives to the k-means algorithm that find better clusterings. In CIKM ’02: Proceedings of the eleventh international conference on Information and knowledge management, pages 600–607, New York, NY, USA, 2002. ACM.
[25] D. Pelleg and A.W. Moore. X-means: Extending k-means with efficient estimation of the number of clusters. In ICML ’00: Proceedings of the Seventeenth International Conference on Machine Learning, pages 727–734, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.
[26] C.K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. Pin: building customized program analysis tools with dynamic instrumentation. In PLDI ’05: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, pages 190–200, New York, NY, USA, 2005. ACM.
[27] Perfmon. http://perfmon2.sourceforge.net/.
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8582-
dc.description.abstract程式最佳化主要可以分爲兩種方式-靜態最佳化以及動態最佳化,這兩種最佳化方式在處理單核心系統架構中的程式都有不錯的表現。但是在多核心系統新架構下,兩種最佳化方式在找尋程式中的最佳化機會時,都沒有將多執行緒程式執行緒間的互動列入考慮。我們的目的是要發展出一個能夠辨認出執行緒間互動的技術,並且利用這些資訊來幫助程式最佳化。經由觀察發現,執行緒間的互動像是競爭公用快取記憶體,可能導致“不穩定”的程式行爲。也就是說,執行程式中相同的程式碼片段,理論上會有相同或相似的效能,但實際上卻有很大的差異, 就會被稱作“不穩定”。我們將這些不穩定的程式片段視爲“最佳化的機會”, 希望能夠藉由最佳化這些片段,使它們恢穩定,進而提升執行效能。
我們提出了一個簡單而且有效率的方法,藉由取樣以及分析基本區塊的效能變化,來分辨出哪些基本區塊是“穩定”,而哪些是“不穩定”。分析得到的結果能夠讓動態最佳化器用來分辨在程式執行的過程中,哪些基本區塊是不穩定的,需要特別注意或特殊處理。我們可以藉由將取樣到的指令指標對應回它所屬的基本區塊,進而找到出現次數最多的基本區塊,用來當作程式執行區間的代表。藉此可以比較不同執行緒中有相同代表的程式區段的效能,計算出每個區段的效能變化。這個方法也能夠應用在單一執行緒的程式。
zh_TW
dc.description.abstractThere have two groups of works in optimizing program execution in the literature – static and dynamic program optimization. To our best knowledge, neither of these optimizations, while looking for optimization opportunities, considers interactions among threads in multi-core architecture. Therefore we would like to develop techniques that can identify the presence of thread interactions and use it to guide possible optimization. We observe that interaction among threads, like competition for shared cache, can lead to “unstable” execution performance. That is, the same part of program will have very different performance characteristics, therefore we identify those parts of program as dynamic optimization opportunity, so that they can be optimized for better performance.
We propose a simple and efficient sampling method that analyzes performance variance among basic blocks, so as to differentiate “unstable” and “stable” basic blocks. The results from the analysis can be used as a reference to determine which parts of the program on which dynamic optimizer should make extra efforts during execution. By mapping EIP of each sample back to its basic block, we are able to choose representative basic block for each interval during execution, and compare the performance of each thread, so as to calculate the performance variance of each basic block. This sampling technique can also be applied to single-threaded programs.
en
dc.description.provenanceMade available in DSpace on 2021-05-20T19:58:37Z (GMT). No. of bitstreams: 1
ntu-99-R97922128-1.pdf: 637470 bytes, checksum: 6839348f488b7ab5e552b1b09ff1716e (MD5)
Previous issue date: 2010
en
dc.description.tableofcontentsCertification i
Acknowledgement ii
Chinese Abstract iii
Abstract iv
1 Introduction 1
2 Related Works 3
3 Choosing Representative 5
3.1 Average EIP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.2 Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
4 Methodology 9
4.1 Sampling and Grouping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.2 Representative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.3 Optimization Opportunity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
5 Experimental Results 11
5.1 Experimental Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
5.2 Classification Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
5.3 Performance Comparison Results . . . . . . . . . . . . . . . . . . . . . . . . . . 12
5.3.1 Optimization Level: O0 vs O2 . . . . . . . . . . . . . . . . . . . . . . . 15
5.3.2 Unstable parts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
5.4 Case Study: Swim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
5.5 Summary of Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 18
6 Conclusion 19
Bibliography 20
dc.language.isoen
dc.title尋找多核心系統架構程式最佳化機會之新方法zh_TW
dc.titleA New Prospect in Finding Optimization Opportunities for Multicore Architecturesen
dc.typeThesis
dc.date.schoolyear98-2
dc.description.degree碩士
dc.contributor.coadvisor游本中
dc.contributor.oralexamcommittee吳真貞,徐慰中,陳添福
dc.subject.keyword多核心架構,動態最佳化,程式行&#29234,zh_TW
dc.subject.keywordMulti-core architecture,Dynamic optimization,Program behavior,en
dc.relation.page22
dc.rights.note同意授權(全球公開)
dc.date.accepted2010-07-13
dc.contributor.author-college電機資訊學院zh_TW
dc.contributor.author-dept資訊工程學研究所zh_TW
顯示於系所單位:資訊工程學系

文件中的檔案:
檔案 大小格式 
ntu-99-1.pdf622.53 kBAdobe PDF檢視/開啟
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved