基於多核心及異質記憶體模組之即時程序排程

Che-Wei Chang; 張哲維

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66003

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	郭大維
dc.contributor.author	Che-Wei Chang	en
dc.contributor.author	張哲維	zh_TW
dc.date.accessioned	2021-06-17T00:18:38Z	-
dc.date.available	2017-06-26
dc.date.copyright	2012-08-28
dc.date.issued	2012
dc.date.submitted	2012-06-28
dc.identifier.citation	[1] Chrom OS. http://googleblog.blogspot.com/2009/07/introducing-google-chrome-os.html/, 2012. [2] Chromium OS. http://sites.google.com/a/chromium.org/dev/chromium-os/, 2012. [3] Malardalen WCET Research Group. WCET benchmarks. http://www.mrtc.mdh.se/projects/wcet/, 2012. [4] Moblin. http://moblin.org/, 2012. [5] Ubuntu. http://www.canonical.com/projects/ubuntu/, 2012. [6] UTDSP Benchmark Suite. http://www.eecg.toronto.edu/∼corinna/DSP/infrastructure/UTDSP.tar.gz, 2012. [7] Windows Mobile. http://www.microsoft.com/windowsmobile/, 2012. [8] AbsInt Angewandte Informatik GmbH. aiT: worst-case execution time analyzers. http://www.absint.com/ait, 2012. [9] Bjorn Andersson and Eduardo Tovar. Multiprocessor scheduling with few preemptions. In RTAS, 2006. [10] Bjorn Anderssont, Sanjoy Baruaht, and Jan Jonssont. Static-priority scheduling on multiprocessors. In RTSS, 2001. [11] Oren Avissar, Rajeev Barua, and Dave Stewart. Heterogeneous memory management for embedded systems. In CASES, 2001. [12] Rajeshwari Banakar, Stefan Steinke, Bo-Sik Lee, M. Balakrishnan, and Peter Marwedel. Scratchpad memory: design alternative for cache on-chip memory in embedded systems. In CODES, 2002. [13] Nikhil Bansal, Alberto Caprara, and Maxim Sviridenko. Improved approximation algorithms for multidimensional bin packing problems. In FOCS, 2006. [14] Sanjoy Baruah. Partitioning sporadic task systems upon memory-constrained multiprocessors. to appear in ACM Transactions in Embedded Computing Systems. [15] Jian-Jia Chen and Samarjit Chakraborty. Resource augmentation bounds for approximate demand bound functions. In RTSS, pages 272–281, 2011. [16] Kyung Ho Chung, Myung Sil Choi, and Kwang Seon Ahn. A study on the packaging for fast boot-up time in the embedded linux. In IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pages 89–94, 2007. [17] Jonathan Corbet. AXFS: a compressed, execute-in-place filesystem. Technical report, 2008. [18] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. The MIT Press, 3 edition, 2009. [19] Robert I. Davis and Alan Burns. A survey of hard real-time scheduling for multiprocessor systems. ACM Computing Surveys, 43(4), October 2011. [20] Wenceslas Fernandez de la Vega and George S. Lueker. Bin packing can be solved within 1+epsilon in linear time. Combinatorica, 1(4):349–355, 1981. [21] Swamava Dey and Ranjan Dasgupta. Fast boot user experience using adaptive storage partitioning. In Computation World: Future Computing, Service Computation, Cognitive, Adaptive, Content, Patterns, pages 113–118, November 2009. [22] Robert P. Dick, David L. Rhodes, and Wayne Wolf. Tgff: Task graphs for free. In Proceedings of International Wrokshop on Hardware/Software Codesign, pages 97–101, March 1998. [23] Gyorgy Dosa. The tight bound of first fit decreasing bin-packing algorithm is FFD(I) ≤ 11/9 OPT(I) + 6/9 . Lecture Notes in Computer Science, 4614:1–11, 2007. [24] DRAMeXchange. The contract prices of DRAM and Flash, March 2010. [25] Saber M. Elsayed, Ruhul A. Sarker, and Daryl L. Essam. An improved self-adaptive differential evolution algorithm for optimization problems. to appear in the IEEE Transactions on Industrial Informatics. [26] Tullio Facchinetti and Marco L. Della Vedova. Real-time modeling for direct load control in cyber-physical power systems. IEEE Transactions on Industrial Informatics, 7(4):689–698, 2011. [27] Heiko Falk and Jan C. Kleinsorge. Optimal static WCET-aware scratchpad allocation of program code. In DAC, 2009. [28] Nathan Fisher, Sanjoy Baruah, and Theodore P. Baker. The partitioned scheduling of sporadic tasks according to static-priorities. In EuroMicro, 2006. [29] Michael R. Garey and David S. Johnson. Computers and intractability: A guide to the theory of NP-completeness. W. H. Freeman and Co., 1979. [30] Ronald L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics, 17(2):416–429, 1969. [31] Jianjun Guo, Mingche Lai, Zhengyuan Pang, Libo Huang, Fangyuan Chen, Kui Dai, and Zhiying Wang. Hierarchical memory system design for a heterogeneous multicore processor. In SAC, 2008. [32] Yibo Guo, Qingfeng Zhuge, Jingtong Hu, Meikang Qiu, and Edwin H.-M. Sha. Optimal data allocation for scratch-pad memory on embedded multi-core systems. In ICPP, pages 464–471, 2011. [33] Trevor Harmon, Martin Schoeberl, Raimund Kirner, Raymond Klefstad, Kwang H. Kim, and Michael R. Lowry. Fast, interactive worst-case execution time analysis with back-annotation. IEEE Transactions on Industrial Informatics, 8(2):366–377, 2012. [34] Yongsoo Joo, Yongseok Choi, Chanik Park, Sung Woo Chung, Eui-Young Chung, and Naehyuck Chang. Demand paging for onenand flash execute-in-place. In IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, October 2006. [35] James A. Kahle, Michael N. Day, H. Peter Hofstee, Charles R. Johns, Theodore R. Maeurer, and David J. Shippy. Introduction to the Cell multiprocessor. IBM Journal of Research and Development, 49(4.5):589–604, 2005. [36] Bala Kalyanasundaram and Kirk Pruhs. Speed is as powerful as clairvoyance. Journal of the ACM, 47(4):617–643, 2000. [37] Hiroki Kaminaga. Improving linux startup time using software resume. In Linux Symposium, pages 17–26, July 2006. [38] Satoshi Kaneko, Hiroyuki Kondo, Norio Masui, Koichi Ishimi, Teruyuki Itou, Masayuki Satou, Naoto Okumura, Yukari Takata, Hirokazu Takata, Mamoru Sakugawa, Takashi Higuchi, Sugako Ohtani, Kei Sakamoto, Naoshi Ishikawa, Masami Nakajima, Shunichi Iwata, Kiyoshi Hayase, Satoshi Nakano, Sachiko Nakazawa, Kunihiro Yamada, and Toru Shimizu. A 600-MHz single-chip multiprocessor with 4.8-GB/s internal shared pipelined bus and 512-kB internal memory. IEEE Journal of Solid-State Circuits, 39(1):589–604, 2004. [39] Narendra Karmarkar. A new polynomial-time algorithm for linear programming. Combinatorica, 4(4):373–395, 1984. [40] Ashok Khemka and Rudrapatna K. Shyamasundar. An optimal multiprocessor real-time scheduling algorithm. Journal of Parallel and Distributed Computing, 43(1):37–45, 1997. [41] AJ KleinOsowski, John Flynn, Nancy Meares, and David J. Lilja. Adapting the SPEC 2000 benchmark suite for simulation-based computer architecture research. The Kluwer International Series in Engineering and Computer Science, 610:83–100, 2002. [42] Jesse D. Kornblum. Exploiting the rootkit paradox with windows memory analysis. In International Journal of Digital Evidence, volume 5, 2006. [43] Jean Labrosse. uC/OS-III, The Real-Time Kernel. 2009. [44] Cheol-Hoon Lee and Kang G. Shin. On-line dynamic voltage scaling for hard realtime systems using the edf algorithm. In RTSS, 2004. [45] Chunho Lee, Miodrag Potkonjak, and William H. Mangione-Smith. Mediabench: A tool for evaluating and synthesizing multimedia and communications systems. In MICRO, 1997. [46] Hennadiy Leontyev and James H. Anderson. A hierarchical multiprocessor bandwidth reservation scheme with timing guarantees. Real-Time Systems, 43, 2009. [47] Chung-Laung Liu and JamesW. Layland. Scheduling algorithms for multiprogramming in a hard-real-time environment. Journal of the ACM, 20:46–61, 1973. [48] Shi-Wu Lo, Wei-Shiuan Tsai, Jeng-Gang Lin, and Guan-Shiung Cheng. Swap before-hibernate: a time efficient method to suspend an os to a flash drive. In ACM Symposium on Applied Computing, pages 201–205, March 2010. [49] Silvano Martello and Paolo Toth. Knapsack problems: Algorithms and Computer Implementations. John Wiley & Sons, Inc., 1990. [50] Avantika Mathur, Mingming Cao, and Suparna Bhattacharya. The new ext4 filesystem: current status and future plans. In Linux Symposium, pages 21–34, June 2007. [51] Gokhan Memik, William H. Mangione-Smith, and Wendong Hu. Netbench: A benchmarking suite for network processors. In ICCAD, 2001. [52] Microsoft. Windows readydrive and hybrid hard disk drives. Technical report, May 2006. [53] Patrick Mochel. Linux kernel power management. In Linux Symposium, pages 325–339, July 2003. [54] Aloysius K. Mok. Fundamental design problems of distributed systems for the hard real-time environment. Technical report, Cambridge, MA, USA, 1983. [55] Martin Niemeier, AndreasWiese, and Sanjoy Baruah. Partitioned real-time scheduling on heterogeneous shared-memory multiprocessors. In ECRTS, 2011. [56] Numonyx. M29DW127G NOR Flash Memory, May 2009. [57] Ozcan Ozturk, Mahmut Kandemir, and Ibrahim Kolcu. Shared Scratch-Pad memory space management. In ISQED, 2006. [58] Chanik Park, Jaeyu Seo, Sunghwan Bae, Hyojun Kim, Shinhan Kim, and Bumsoo Kim. A low-cost memory architecture with nand xip for mobile embedded systems. In IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, pages 138–143, October 2003. [59] Chanju Park, Kyuhyung Kim, Youngjun Jang, and Kyungju Hyun. Linux bootup time reduction for digital still camera. In Linux Symposium, pages 231–240, July 2006. [60] Thomas Rothvoss. On the computational complexity of periodic scheduling. Ph.D. Thesis, 2009. [61] Samsung Electronics. 1Gb E-die DDR2 SDRAM Specification, December 2008. [62] Seung Woo Son, Sai Prashanth Muralidhara, Ozcan Ozturk, Mahmut Kandemir, Ibrahim Kolcu, and Mustafa Karakoy. Profiler and compiler assisted adaptive I/O prefetching for shared storage caches. In Parallel Architectures and Compilation Techniques, pages 112–121, July 2008. [63] SPECTEC. FxxL63B NAND Flash Memory, September 2008. [64] Vivy Suhendra, Tulika Mitra, Abhik Roychoudhury, and Ting Chen. WCET centric data allocation to scratchpad memory. In RTSS, pages 223–232, 2005. [65] Hideki Takase, Hiroyuki Tomiyama, and Hiroaki Takada. Partitioning and allocation of scratch-pad memory for priority-based preemptive multi-task systems. In DATE, pages 1124–1129, 2010. [66] Texas Instruments. Davinci-dm644x evaluation module technical reference, March 2007. [67] Henrik Theiling, Christian Ferdinand, and Reinhard Wilhelm. Fast and precise WCET prediction by separated cache and path analyses. Real-Time Syst., 18:157–179, May 2000. [68] Masao Uebayashi. eXecute-In-Place (XIP) support for NetBSD. Technical report, Tombi Incorporation, Apr 2010. [69] Rob F. van der Wijngaar, Timothy G. Mattson, and Werner Haas. Light-weight communications on intel’s single-chip cloud computer processor. SIGOPS Operating Systems Review, 45:73–83, February 2011. [70] Vijay V. Vazirani. Approximation Algorithms. Springer, 2001. [71] SorenWellhofer. Application eXecute-In-Place (XIP) with Linux and AXFS. Technical report, September 2009. [72] Gerhard J.Woeginger. There is no asymptotic ptas for two-dimensional vector packing. Information Processing Letters, 64(6):293–294, 1997. [73] Lei Zhang, Meikang Qiu,Wei-Che Tseng, and Edwin H.-M. Sha. Variable partitioning and scheduling for MPSoC with virtually shared scratch pad memory. Journal of Signal Processing Systems, 58:247–265, 2010. [74] Qi Zhu, Yang Yang, Marco Di Natale, Eelco Scholte, and Alberto Sangiovanni-Vincentelli. Optimizing the software architecture for extensibility in hard real-time distributed systems. IEEE Transactions on Industrial Informatics, 6(4):621–636, 2010. [75] Vojin Zivojnovic, Juan Martinez Velarde, Christian Schlager, and Heinrich Meyr. DSPstone: A DSP-oriented benchmarking methodology. In ICSPAT, 1994.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66003	-
dc.description.abstract	多核心及異質記憶體之設計已廣泛的使用於伺服器、個人電腦甚至嵌入式系統，隨著硬體技術之成長，軟體的開發益趨於多樣化且複雜，這也加重了系統軟體之需求與挑戰，甚者，由於多核心排程技術須更進一步考量異質記憶體之使用，傳統即時多核心排程技術將使得系統使用效率低落。基於以上之觀察，本論文利用非揮發性記憶體開發出一快速開機機制，減少開機時載入及初始化之冗長時間，同時並權衡考量系統執行時期之效能，以有限的執行時期之效能損失為前提，最小化系統開機之時間。進一步在系統之執行時期本論文提出處理器與記憶體共管理之演算法，此演算法包含異質記憶體管理機制以及對應之多核心工作排程技術，以期達成系統即時效能之需求。然而當系統核心數量不斷上升時，記憶體架構勢必益趨複雜，本於此一考量，本論文延伸探討多核心及異質記憶體之共同資源管理機制，於可重新結構配置之多核心硬體環境之下，所提出之機制將可最小化系統資源之消耗並達成軟體之即時效能需求。最後，透過一連串的實驗以及模擬來評估演算法之性能，並展示所提出設計之可行性。	zh_TW
dc.description.abstract	Multiple cores and heterogeneous memory components are widely adopted in the designs of servers, personal computers, and even embedded systems. Such trends have imposed great pressure on better and more support from the operating systems and even imply more demands in the system efficiency enhancement, including booting time, for many systems. The success of the system efficiency enhancement relies on not only task scheduling over cores but also allocation strategies of other resources. Such an observation motivates the joint consideration of task scheduling and the memory allocation of task images. This dissertation first shows that the booting time minimization problem for real-time systems with nonvolatile memory is NP-hard. An optimal but pseudo-polynomial-time algorithm, a 0.25-approximation algorithm, and a polynomial-time approximation scheme are proposed for the tradeoff between the solution quality and the time complexity. In the second part of this dissertation, the run-time performance optimization problem for multi-core systems equipped with heterogeneous shared memory modules is considered. A two-phase algorithm is proposed to minimize the maximum required utilization among cores, where memory modules have different access latencies: The first phase determines the memory allocation, and the second phase assigns tasks onto cores with a tight (2− 2M+1)-approximation bound, where M is the number of cores. In the third part of this dissertation, the results are extended for island-based multi-core platforms with an observation that the number of cores in a system keeps increasing. This dissertation explores real-time task scheduling over homogeneous cores with local memory shared by the cores in an island and global memory shared by the entire system. Approximation algorithms are provided to efficiently derive the task partition and the memory allocation. Case studies and simulations were conducted to evaluate the proposed algorithms, and the results further revealed the insights into the joint-consideration designs of memory allocation and task scheduling.	en
dc.description.provenance	Made available in DSpace on 2021-06-17T00:18:38Z (GMT). No. of bitstreams: 1 ntu-101-F95922093-1.pdf: 1060457 bytes, checksum: 44b50b760c3dae65c376243d0a82d21b (MD5) Previous issue date: 2012	en
dc.description.tableofcontents	Abstract in Chinese v Abstract vii Acknowledgment ix Contents xi List of Figures xiv List of Tables xv 1 Introduction 1 1.1 Introduction 1 1.2 Related Work 4 1.2.1 Fast Booting 4 1.2.2 Multi-Core Task Scheduling 6 1.3 Organization 8 2 A Fast Booting Strategy with Nonvolatile Memory 10 2.1 Introduction 10 2.2 System Model and Problem Definition 11 2.2.1 XIP Architecture and Real-Time Task Model 12 2.2.2 Problem Definition 14 2.3 Algorithms for the Booting-Time Minimization 18 2.3.1 An Optimal Algorithm - A Dynamic Programming Approach 18 2.3.2 A 0.25-Approximation Algorithm 21 2.3.3 A Polynomial-Time Approximation Scheme 24 2.4 Performance Evaluation 32 2.4.1 A Case Study for Our Algorithms 32 2.4.2 Extensive Simulations 36 2.5 Summary 38 3 Partitioned Scheduling with Heterogeneous Memory Modules 40 3.1 Introduction 40 3.2 System Architecture and Problem Definition 42 3.2.1 Platform and Task Models 42 3.2.2 Problem Definition 43 3.3 Overview of Our Approach 45 3.4 Memory Allocation Strategies 46 3.4.1 WCET Mapping Tables 47 3.4.2 Memory Allocation: Special Case - WID 49 3.4.3 Memory Allocation: General Cases 52 3.5 Task Partition 55 3.6 Resource Augmentation 59 3.7 Performance Evaluation 60 3.7.1 A Case Study for General Case Memory Allocation 60 3.7.2 A Case Study for Special Case Memory Allocation with the WID Property 65 3.8 Summary 68 4 An Island-Based Multi-Core System Design 70 4.1 Introduction 70 4.2 System Model and Problem Definition 72 4.2.1 System Model 72 4.2.2 Problem Definition 74 4.3 Our Algorithms for the IBRT Problem 76 4.3.1 Lower Bound for An Input Instance 77 4.3.2 An Algorithm for Single-Core Islands 78 4.3.3 An Algorithm for Multi-Core Islands 80 4.4 Our Scheme for the IBRTC Problem 84 4.5 Performance Evaluation 88 4.5.1 Environment Setup 89 4.5.2 Evaluation Results 90 4.6 Summary 93 5 Concluding Remarks 94 5.1 Conclusion 94 5.2 Future Work 95 Bibliography 97 Curriculum Vitae 105 Publication List 106
dc.language.iso	zh-TW
dc.subject	多核心	zh_TW
dc.subject	異質記憶體	zh_TW
dc.subject	即時系統	zh_TW
dc.subject	快速開機	zh_TW
dc.subject	即時排程	zh_TW
dc.subject	fast booting	en
dc.subject	real-time system	en
dc.subject	multi-core	en
dc.subject	heterogeneous memory	en
dc.subject	real-time scheduling	en
dc.title	基於多核心及異質記憶體模組之即時程序排程	zh_TW
dc.title	Real-Time Task Scheduling over Multiple Cores with Heterogeneous Memory Modules	en
dc.type	Thesis
dc.date.schoolyear	100-2
dc.description.degree	博士
dc.contributor.oralexamcommittee	郭耀煌,薛智文,劉邦鋒,陳建佳,謝錫?
dc.subject.keyword	多核心,異質記憶體,即時系統,快速開機,即時排程,	zh_TW
dc.subject.keyword	multi-core,heterogeneous memory,real-time system,fast booting,real-time scheduling,	en
dc.relation.page	107
dc.rights.note	有償授權
dc.date.accepted	2012-06-28
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
顯示於系所單位：	資訊工程學系

文件中的檔案：

檔案	大小	格式
ntu-101-1.pdf 未授權公開取用	1.04 MB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。