Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/42230
Title: On Using Digital Speech Processing Techniques for Synchronization among Heterogeneous Teleconferencing Devices (以數位語音處理技術解決異質視訊會議之同步問題)
Author: Hsiao-Pu Lin (林孝蒲)
Advisor: Hung-Yun Hsieh (謝宏昀)
Keywords: Digital Speech Processing, Audio/Video Synchronization, Heterogeneous Network, Teleconferencing
Publication Year: 2008
Degree: Master's
Abstract: As multi-functional telephony devices grow in popularity, a traditional audio conference may now involve heterogeneous teleconferencing devices, including POTS phones, dual-mode smartphones, pocket PCs, and so on. Some of these devices can access IP networks and support video conferencing with peer devices in the audio conference, providing a better conferencing experience. In this scenario, it becomes necessary to synchronize audio streams that traverse the PSTN with video streams that traverse the IP network.
While related work has investigated audio/video synchronization, it is limited to synchronization within a homogeneous network and hence cannot be applied to the target scenario.
Therefore, in this thesis we propose an end-to-end framework for audio/video synchronization. We then reduce the problem to one that requires only synchronization between PSTN and IP audio streams. We first employ a time-domain algorithm based on cross-correlation and find it ineffective at synchronizing audio streams distorted by noise or packet loss. Hence, we use Digital Speech Processing (DSP) techniques to extract distortion-tolerant audio features for synchronization. We apply MFCC (mel-frequency cepstral coefficients) in the synchronization algorithm and obtain respectable performance for audio streams distorted by codecs and packet loss. However, MFCC is inherently vulnerable to overlapping speakers. Therefore, we leverage the sparsity of speech in spectrograms to design a spectrogram-based synchronization algorithm, which achieves favorable performance for speech mixtures and noisy speech. Evaluation results show that DSP techniques help solve the synchronization problem between PSTN audio streams and IP video streams in terms of accuracy and robustness.
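
As a rough illustration of the time-domain cross-correlation approach the abstract mentions as its starting point, the sketch below estimates the delay between two copies of the same audio signal. It is not the thesis's implementation: the function names `estimate_lag` and `make_delayed_copy`, the synthetic test signal, and the noise level are all assumptions made for this example.

```python
import numpy as np


def estimate_lag(reference: np.ndarray, delayed: np.ndarray) -> int:
    """Estimate how many samples `delayed` lags behind `reference`.

    Hypothetical helper: picks the lag that maximizes the
    cross-correlation of the two mean-removed signals.
    """
    ref = reference - reference.mean()
    dly = delayed - delayed.mean()
    # In "full" mode, output index i corresponds to lag i - (len(ref) - 1).
    corr = np.correlate(dly, ref, mode="full")
    return int(np.argmax(corr)) - (len(ref) - 1)


def make_delayed_copy(signal: np.ndarray, delay: int, noise_std: float,
                      rng: np.random.Generator) -> np.ndarray:
    """Shift `signal` right by `delay` samples and add Gaussian noise.

    Hypothetical stand-in for a second stream that arrives later
    and is mildly distorted.
    """
    shifted = np.concatenate([np.zeros(delay), signal])[: len(signal)]
    return shifted + noise_std * rng.standard_normal(len(signal))


# Synthetic demo: white noise stands in for a speech waveform (assumption).
rng = np.random.default_rng(0)
reference = rng.standard_normal(2000)
delayed = make_delayed_copy(reference, delay=37, noise_std=0.1, rng=rng)
estimated = estimate_lag(reference, delayed)
print("estimated lag in samples:", estimated)
```

With mild additive noise, the correlation peak stays sharp. The abstract reports that this kind of time-domain matching becomes ineffective once streams are distorted by noise or packet loss, which is what motivates the feature-based (MFCC and spectrogram) algorithms.
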
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/42230
Full-text license: Paid license
Appears in collection: Graduate Institute of Communication Engineering (電信工程學研究所)

Files in this item:
File: ntu-97-1.pdf (not currently authorized for public access)
Size: 1.58 MB
Format: Adobe PDF