NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43866
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 薛智文
dc.contributor.author: Tang-Hsun Tu [en]
dc.contributor.author: 涂堂訓 [zh_TW]
dc.date.accessioned: 2021-06-15T02:31:01Z
dc.date.available: 2010-08-19
dc.date.copyright: 2009-08-19
dc.date.issued: 2009
dc.date.submitted: 2009-08-14
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43866
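dc.description.abstract: On multicore systems, using multiple threads is a common way to improve performance. Nevertheless, in many simple applications, increasing the number of threads actually degrades performance, contrary to expectation. Users usually attribute this to the overhead of creating and terminating threads. In our observation, however, the most significant cause is thread dispatching. In this thesis we therefore discuss thread-related problems and propose a novel User Dispatching Mechanism (UDispatch) to address them. Because threads run in user space, they cannot directly control system resources in kernel space. We therefore use a virtual device as a bridge between the two spaces, which avoids modifying the kernel to add system calls and allows efficient, portable communication with the operating system. In addition, we provide a UDispatch application programming interface (API) that users can call directly from application source code. Although the API helps users, there are cases where users do not want to, or cannot, modify the source code, so the benefits of UDispatch go unrealized. We therefore also propose a command-line interface, the UDispatch Loader (UDLoader), which lets users operate UDispatch without modifying program code. We evaluated UDispatch and UDLoader on two multimedia applications: a skip-line binary image compression application and an H.264/AVC decoder. The results show that the skip-line binary compression application improves by 171.8% and 111.6% on 4-core and 8-core machines, respectively, and the H.264/AVC decoder improves by 20.1% on a 4-core machine. [zh_TW]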
dc.description.abstract: In multicore environments, using multiple threads is a common and useful approach to improving application performance. Nevertheless, even in many simple applications, performance may degrade as the number of threads increases. Users usually attribute this phenomenon to the overhead of creating and terminating threads. In our observation, however, the more significant factor is how threads are dispatched. We discuss the problems of using threads and present a novel User Dispatching Mechanism (UDispatch) that provides controllability in user space to improve application performance. Since user threads cannot directly control system resources, a virtual device is adopted between user space and the operating system for portability and efficiency, instead of adding new system calls through kernel modification. We provide an application programming interface (API) that lets users manipulate UDispatch by modifying application source code. To avoid source code modification, a command-line UDispatch Loader (UDLoader) is also provided to help users bind threads to specific cores directly. We apply UDispatch to two multi-threaded multimedia applications. The results show that a skip-line application improves by 171.8% and 111.6% on a 4-core machine and an 8-core machine, respectively, and an optimized H.264/AVC decoder improves by 20.1% on a 4-core machine. [en]
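The idea of binding threads to specific cores, which the abstract describes, can be illustrated with a minimal sketch. This is not the UDispatch API or its virtual-device interface, neither of which is reproduced in this record. It assumes a Linux host and uses the standard affinity system call through Python's os.sched_setaffinity. The helper names bind_current_thread and run_pinned are hypothetical and chosen only for illustration.

```python
import os
import threading


def bind_current_thread(cpu):
    """Pin the calling thread to one core and return its resulting affinity set."""
    tid = threading.get_native_id()    # kernel thread ID of the caller (Linux)
    os.sched_setaffinity(tid, {cpu})   # thin wrapper around sched_setaffinity(2)
    return os.sched_getaffinity(tid)


def run_pinned(n_threads):
    """Start n_threads workers and bind them round-robin to the allowed cores.

    Returns {worker index: (requested core, affinity set after binding)}.
    """
    allowed = sorted(os.sched_getaffinity(0))  # cores this process may run on
    results = {}

    def worker(index):
        cpu = allowed[index % len(allowed)]
        results[index] = (cpu, bind_current_thread(cpu))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for index, (cpu, mask) in sorted(run_pinned(4).items()):
        print(f"worker {index}: requested core {cpu}, affinity {sorted(mask)}")
```

Round-robin placement is only one possible policy. The thesis outline also lists choosing the least-loaded core, which this sketch does not attempt.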
dc.description.provenance: Made available in DSpace on 2021-06-15T02:31:01Z (GMT). No. of bitstreams: 1
ntu-98-R96944013-1.pdf: 762654 bytes, checksum: f927085073a84e9f9bb3c3947ff5291c (MD5)
Previous issue date: 2009 [en]
dc.description.tableofcontents:
1 Introduction
2 Related Work and Background
2.1 Threading Anomaly
2.2 Scheduling Anomaly
2.3 Main-Thread Anomaly
2.4 Why We Need a User Dispatching Mechanism?
3 User Dispatching Mechanism
3.1 UDispatch Architecture
3.2 Examples of Using UDispatch
3.2.1 Binding Threads to Cores
3.2.2 Finding the Core with the Least Load
3.3 Comparison with Other Mechanisms
3.3.1 sched_set/getaffinity() System Calls
3.3.2 proc Filesystem
4 Shell Commands in UDispatch
4.1 UDispatch Module Manager
4.2 UDispatch Loader
4.3 Customized UDispatch Loader
5 Experiments
5.1 Skip-line Algorithms
5.1.1 Anomaly Occurrence Frequency in Skip-line Applications
5.1.2 Skip-line Application with UDL
5.2 H.264 Decoder
5.2.1 Anomaly Occurrence Frequency in H.264 Applications
5.2.2 H.264 Decoder with UDL
5.3 Comparison between UDL and UDA
6 Conclusion
7 Future Work
Appendix
A UDispatch API
A.1 open_UDispatch()
A.2 close_UDispatch()
A.3 bind_to_cpu()
A.4 auto_bind()
A.5 get_bind_info_len()
A.6 get_bind_info()
A.7 get_nr_threads()
A.8 get_pids_on_task
A.9 get_nr_cpus()
A.10 get_cpuload()
A.11 get_taskload()
A.12 get_nr_tasks_on_cpu()
A.13 get_pids_on_cpu()
A.14 UDispatch_ctl()
A.15 set_UDL_pids()
A.16 enable_UDLswitch()
A.17 disable_UDLswitch()
B Low-Level Kernel API
B.1 set_cpus_allowed_ptr()
B.2 num_online_cpus()
C Shell Commands in UDispatch
C.1 UDispatch Module Manager
C.2 UDispatch Loader
Bibliography
dc.language.iso: en
dc.subject: Scheduling [en]
dc.subject: Loader [en]
dc.subject: Virtual Device [en]
dc.subject: System Call [en]
dc.subject: Multicore [en]
dc.subject: Anomaly [en]
dc.subject: Threading [en]
dc.subject: Dispatching [en]
dc.title: 一套於多核心系統上具可攜性且有效率的使用者分配資源機制 [zh_TW]
dc.title: A Portable and Efficient User Dispatching Mechanism for Multicore Systems [en]
dc.type: Thesis
dc.date.schoolyear: 97-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 張榮貴, 羅習五
dc.subject.keyword: Threading, Scheduling, Dispatching, Anomaly, Multicore, System Call, Virtual Device, Loader [en]
dc.relation.page: 89
dc.rights.note: Paid licensing
dc.date.accepted: 2009-08-17
dc.contributor.author-college: College of Electrical Engineering and Computer Science
dc.contributor.author-dept: Graduate Institute of Networking and Multimedia
Appears in collections: Graduate Institute of Networking and Multimedia

Files in this item:
File: ntu-98-1.pdf (not authorized for public access)
Size: 744.78 kB
Format: Adobe PDF


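Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.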
