Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23873
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 吳家麟 (Ja-Ling Wu)
dc.contributor.author: Chia-Hao Chang [en]
dc.contributor.author: 張嘉豪 [zh_TW]
dc.date.accessioned: 2021-06-08T05:11:57Z
dc.date.copyright: 2006-07-28
dc.date.issued: 2006
dc.date.submitted: 2006-07-22
dc.identifier.citation:
[1] Call for Proposals on Spatial Audio Coding. ISO/IEC JTC1/SC29/WG11 (MPEG),
Document N6455, Munich, March 2004.
[2] Workplan for MPEG-4 Spatial Audio Coding. ISO/IEC JTC1/SC29/WG11
(MPEG), Document N6814, Palma de Mallorca, October 2004.
[3] B.S. Atal and S.L. Hanauer. Speech analysis and synthesis by linear prediction of
the speech wave. Journal of the Acoustical Society of America, 50(2):637–655, 1971.
[4] F. Baumgarte and C. Faller. Binaural cue coding-Part I: psychoacoustic fundamentals
and design principles. IEEE Transactions on Speech and Audio Processing,
11(6):509–519, 2003.
[5] J. Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization, Revised
Edition. Cambridge, MA: MIT Press, 1997.
[6] J. Breebaart, J. Herre, C. Faller, J. Roden, F. Myburg, S. Disch, H. Purnhagen,
G. Hotho, M. Neusinger, K. Kjörling, and W. Oomen. MPEG Spatial Audio Coding
- MPEG Surround overview and current status. Proc. 119th AES convention, New
York, USA, Oct, 2005.
[7] J. Breebaart, S. van de Par, A. Kohlrausch, and E. Schuijers. High-quality parametric
spatial audio coding at low bitrates. Proc. 116th AES Convention, Berlin, Germany,
May.
[8] J. Breebaart, S. van de Par, A. Kohlrausch, and E. Schuijers. Parametric Coding of
Stereo Audio. EURASIP Journal on Applied Signal Processing, 9:1305–1322, 2005.
[9] A. Bruekers, AWJ Oomen, and RJ van der Vleuten. Lossless coding for DVD audio.
101st AES Convention, 1996.
[10] C. Cellier, P. Chênes, and M. Rossi. Lossless Audio Data Compression for Real Time
Applications. 95th AES Convention, 1993.
[11] C.H. Chang, J.H. Kuo, C.H. Wu, and J.L. Wu. Investigation And Complexity
Analysis of A Spatial Audio Codec Based on A Programmable Media-Processor.
International Conference on Consumer Electronics, 2006. ICCE'06. 2006 Digest of
Technical Papers, pages 281–282, 2006.
[12] RI Chernyak and NA Dubrovsky. Pattern of the noise images and the binaural
summation of loudness for the different interaural correlation of noise. Proc. 6th Int.
Congr. Acoustics, 1.
[13] M. Dietz, L. Liljeryd, K. Kjörling, and O. Kunz. Spectral band replication, a novel
approach in audio coding. Proc. AES Convention.
[14] C. Faller and F. Baumgarte. Efficient representation of spatial audio using perceptual
parametrization. 2001 IEEE Workshop on the Applications of Signal Processing to
Audio and Acoustics, pages 199–202, 2001.
[15] C. Faller and F. Baumgarte. Binaural cue coding: a novel and efficient representation
of spatial audio. Proceedings ICASSP'02 Acoustics Speech and Signal Processing,
2:1841–1844, 2002.
[16] C. Faller and F. Baumgarte. Binaural Cue Coding applied to stereo and multi-channel
audio compression. Preprint 112th Conv. Aud. Eng. Soc, 2002.
[17] C. Faller and F. Baumgarte. Binaural cue coding-Part II: Schemes and applications.
IEEE Transactions on Speech and Audio Processing, 11(6):520–531, 2003.
[18] S.H. Godsill and PJ Rayner. Digital Audio Restoration: A Statistical Model Based
Approach. Springer-Verlag New York, Inc. Secaucus, NJ, USA, 1998.
[19] J.W. Hall and M.A. Fernandes. The role of monaural frequency selectivity in binaural
analysis. The Journal of the Acoustical Society of America, 76:435, 1984.
[20] J. Herre, H. Purnhagen, J. Breebaart, C. Faller, S. Disch, K. Kjörling, E. Schuijers,
J. Hilpert, and F. Myburg. The Reference Model Architecture for MPEG Spatial
Audio Coding. Proc. 118th AES convention, Barcelona, Spain, May, 2005.
[21] C. Spenger-J. Hilpert K. Linzmeier Jurgen Herre, Christof Faller. CT/Philips Contribution
to CfP on Spatial Audio Coding. ISO/IEC JTC1/SC29/WG11 (MPEG),
Document M11001, Redmond, July 2004.
[22] C. Spenger-J. Hilpert K. Linzmeier Jurgen Herre, Christof Faller. Fraunhofer/Agere
Submission to Spatial Audio CfP. ISO/IEC JTC1/SC29/WG11 (MPEG), Document
M11075, Redmond, July 2004.
[23] WB Kleijn and KK Paliwal. An introduction to Speech coding. Speech Coding and
Synthesis, pages 1–47, 1995.
[24] RG Klumpp and HR Eady. Some Measurements of Interaural Time Difference
Thresholds. The Journal of the Acoustical Society of America, 28:859, 1956.
[25] B. Kollmeier and I. Holube. Auditory filter bandwidths in binaural and monaural
listening conditions. The Journal of the Acoustical Society of America, 92:1889,
1992.
[26] Matt Fellers-Grant Davidson Mark Vinton, Mark Davis. Dolby Laboratories Submission
to CfP on MPEG-4 Spatial Audio Coding. ISO/IEC JTC1/SC29/WG11
(MPEG), Document M11090, Redmond, July 2004.
[27] Ching-Hua Ng Yoshiaki Takagi Kojiro Ono Mineo Tsushima Naoya Tanaka, Kok-Seng
Chong. Technical Description and Performance Test Results of Panasonic
Spatial Audio Coding. ISO/IEC JTC1/SC29/WG11 (MPEG), Document M11015,
Redmond, July 2004.
[28] A. Papoulis. Probability, random variables and stochastic processes. New York:
McGraw-Hill, 2nd ed., 1984.
[29] E. Schuijers, J. Breebaart, H. Purnhagen, and J. Engdegård. Low complexity parametric
stereo coding. Preprint 117th Conv. Aud. Eng. Soc., May, 2004.
[30] J.O Smith III and J.S Abel. Bark and ERB bilinear transforms. IEEE Transactions
on Speech and Audio Processing, 7(6):697–708, 1999.
[31] E. Zwicker and H. Fastl. Psychoacoustics-Facts and Models. Springer Publications,
Berlin and New York, 1999.
[32] M. van der Heijden and C. Trahiotis. Binaural detection as a function of interaural
correlation and bandwidth of masking noise: Implications for estimates of spectral
resolution. The Journal of the Acoustical Society of America, 103:1609, 1998.
[33] RG van der Waal and RNJ Veldhuis. Subband coding of stereophonic digital audio
signals. International Conference on Acoustics, Speech, and Signal Processing,
ICASSP-91, pages 3601–3604, 1991.
[34] WA Yost and ER Hafter. Lateralization. Berlin: Springer, 1987.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23873
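dc.description.abstract: The digital home is the most important trend for the future
information industry. Starting with multi-channel DVD theater audio, multi-channel audio at
low bit rates will surely become the mainstream requirement of next-generation consumer
electronics. In addition, more and more new multi-channel audio applications are emerging,
such as handheld devices, in-car systems, and even digital audio broadcasting services. All
of these new applications need a good audio compression method that delivers multiple
channels and high quality at a low bit rate.
In fact, traditional audio compression methods usually scale in proportion to the number of
channels, and the bit rates of earlier multi-channel audio compression algorithms are still
much higher than that of mono audio. A new multi-channel audio compression algorithm has
therefore emerged: parametric spatial audio coding. It compresses multi-channel audio so
effectively that the bit rate is almost the same as that of mono audio.
This thesis studies parametric spatial audio coding algorithms in depth. The first half
covers the academic background of spatial hearing and the differences among the various
parametric spatial audio coding algorithms. The second half discusses in depth how to
implement a low-complexity, high-audio-quality parametric spatial audio coding algorithm.
[zh_TW]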
dc.description.abstract: Information technology is converging toward the digital home.
Driven by the multi-channel audio of DVD movies, next-generation digital multimedia
content with high-quality but low-bit-rate multi-channel audio is sure to become
mainstream in the near future. Furthermore, many new multi-channel audio applications
have emerged in recent years. Mobile applications, portable devices, automotive
applications such as car audio playback setups, and even digital audio broadcasting
services all drive the demand for multi-channel audio playback. A carefully designed
audio coding mechanism that minimizes transmission costs or provides cost-efficient
storage while delivering the best surround performance is therefore of interest.
In fact, the complexity of traditional audio coding schemes scales with the number of
audio channels, and the bit rates of legacy multi-channel coding techniques are still
considerably higher than that of a mono signal. Under these circumstances, a new and
much-needed evolution of digital audio coding, Parametric Spatial Audio Coding, has
been proposed. Parametric Spatial Audio Coding is a promising approach that enables the
transmission of multi-channel audio signals at data rates close to those used to
represent mono or stereo (or more) audio signals.
The first part of this thesis presents a thorough investigation of Parametric Spatial
Audio Coding, including the background of spatial hearing and the differences between
Parametric Spatial Audio Coding algorithms. The second part presents a detailed
discussion of a low-complexity implementation of a Parametric Spatial Audio Coding
scheme with satisfactory listening quality.
[en]
dc.description.provenance: Made available in DSpace on 2021-06-08T05:11:57Z (GMT). No. of bitstreams: 1
ntu-95-R93944006-1.pdf: 1500137 bytes, checksum: 1537d5c8b17d3322c9606b56e206fc0a (MD5)
Previous issue date: 2006
[en]
dc.description.tableofcontents:
1 Introduction
  1.1 Why Parametric Spatial Audio Coding
  1.2 Audio Coding Methods
  1.3 Thesis Motivation and Overview
2 Spatial Hearing
  2.1 Introduction
  2.2 Spatial Cues for Interaural Perception
  2.3 Spatial Hearing Phenomena Related to Inter-channel Cues
  2.4 Conclusion
3 Parametric Spatial Audio Coding Algorithms
  3.1 Related Stereo and Multi-channel Audio Coding Techniques
    3.1.1 Joint Stereo Coding
    3.1.2 Parametric Stereo Coding
    3.1.3 Matrixed Surround Coding
  3.2 Binaural Cue Coding
    3.2.1 BCC Analysis
    3.2.2 BCC Synthesis
  3.3 MPEG Surround
    3.3.1 MPEG Surround Standardization Process
    3.3.2 MPEG Surround RM0 Scheme
4 Implementation of a Parametric Spatial Audio Codec
  4.1 Sub-band Domain Transformation
    4.1.1 Time to Frequency Transformation
    4.1.2 Sub-band Partitioning
  4.2 Channel Downmixing
  4.3 Spatial Cue Processing
    4.3.1 Spatial Cue Analysis
    4.3.2 Spatial Cue Synthesis
    4.3.3 Spatial Cue Quantization
  4.4 Hiss Reduction
  4.5 Conclusion and Remark
5 Performance and Subjective Evaluation
  5.1 Evaluation of the Computational Complexity
  5.2 Evaluation of the Listening Quality
  5.3 Evaluation of the Spatial Cues Bitrate
6 Conclusion and Future Works
  6.1 Conclusion
  6.2 Future Works
References
dc.language.iso: en
dc.title: 參數化空間音訊編解碼器的研究及其即時軟體實作 [zh_TW]
dc.title: An Investigation and Software Implementation of Parametric Spatial Audio Codecs [en]
dc.type: Thesis
dc.date.schoolyear: 94-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 陳宏銘 (Homer H. Chen), 許超雲, 童怡新 (Yi-Shin Tung)
dc.subject.keyword: 空間音訊, 資料壓縮, 多聲道 [zh_TW]
dc.subject.keyword: spatial audio, data compression, multi-channel [en]
dc.relation.page: 70
dc.rights.note: Not authorized
dc.date.accepted: 2006-07-22
dc.contributor.author-college: College of Electrical Engineering and Computer Science [zh_TW]
dc.contributor.author-dept: Graduate Institute of Networking and Multimedia [zh_TW]
Appears in Collections: Graduate Institute of Networking and Multimedia

Files in This Item:
File | Size | Format
ntu-95-1.pdf (not currently authorized for public access) | 1.46 MB | Adobe PDF