Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97312

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 葉丙成 | zh_TW |
| dc.contributor.advisor | Ping-Cheng Yeh | en |
| dc.contributor.author | 謝文崴 | zh_TW |
| dc.contributor.author | Wen-Wei Hsieh | en |
| dc.date.accessioned | 2025-04-24T16:05:35Z | - |
| dc.date.available | 2025-04-25 | - |
| dc.date.copyright | 2025-04-24 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2025-03-20 | - |
| dc.identifier.citation | [1] Dive into Deep Learning. 10.1 long short-term memory (lstm). https://d2l.ai/chapter_recurrent-modern/lstm.html, 2024. Accessed: 2024-10-18.
[2] A Vaswani. Attention is all you need. Advances in Neural Information Processing Systems, 2017. [3] Yassine El Kheir, Ahmed Ali, and Shammur Absar Chowdhury. Automatic pronunciation assessment-a review. In The 2023 Conference on Empirical Methods in Natural Language Processing. [4] Kun Li, Xiaojun Qian, and Helen Meng. Mispronunciation detection and diagnosis in l2 english speech using multidistribution deep neural networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(1):193–207, 2016. [5] Nancy F Chen, Vivaek Shivakumar, Mahesh Harikumar, Bin Ma, and Haizhou Li. Large-scale characterization of mandarin pronunciation errors made by native speakers of european languages. In Interspeech, pages 2370–2374, 2013. [6] Yi-hsiu Lai. Asymmetry in mandarin affricate perception by learners of mandarin chinese. Language and cognitive processes, 24(7-8):1265–1285, 2009. [7] Xinchun Wang and Jidong Chen. The acquisition of mandarin consonants by english learners: The relationship between perception and production. Languages, 5(2):20, 2020. [8] Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, and Michael Auli. wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in neural information processing systems, 33:12449–12460, 2020. [9] Julian K Wheatley. Learning Chinese: A foundation course in Mandarin. Yale University Press New Haven and London, 2011. [10] San Duanmu. The phonology of standard Chinese. Oxford University Press, 2007. [11] Tsu-lin Mei. Tones and tone sandhi in 16th century mandarin. Journal of Chinese Linguistics, pages 237–260, 1977. [12] Yuan Ren Chao. Tone and intonation in chinese. Bulletin of the Institute of History and Philology, 4:121–134, 1933. [13] Nancy F Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma, and Haizhou Li. icall corpus: Mandarin chinese spoken by non-native speakers of european descent. In INTERSPEECH, pages 324–328, 2015. [14] Guowen Shang and Shouhui Zhao. 
Singapore mandarin: Its positioning, internal structure and corpus planning. In Paper presented at the 22nd Annual Conference of the Southeast Asian Linguistics Society, Agay, France, 2012. [15] Xiao Zhang. Latic: A non-native pre-labelled mandarin chinese validation corpus for automatic speech scoring and evaluation task. 2021. [16] Wolfgang Menzel, Eric Atwell, Patrizia Bonaventura, Daniel Herron, Peter Howarth, Rachel Morton, and Clive Souter. The isle corpus of non-native spoken english. In Proceedings of LREC 2000: Language Resources and Evaluation Conference, vol. 2, pages 957–964. European Language Resources Association, 2000. [17] Nobuaki Minematsu, Koji Okabe, Keisuke Ogaki, and Keikichi Hirose. Measurement of objective intelligibility of japanese accented english using erj (english read by japanese) database. In INTERSPEECH, pages 1481–1484, 2011. [18] Guanlong Zhao, Evgeny Chukharev-Hudilainen, Sinem Sonsaat, Alif Silpachai, Ivana Lucic, Ricardo Gutierrez-Osuna, and John Levis. L2-arctic: A non-native english speech corpus. Proc. Interspeech 2018, pages 2783–2787, 2018. [19] Chiranjeevi Yarra, Aparna Srinivasan, Chandana Srinivasa, Ritu Aggarwal, and Prasanta Kumar Ghosh. voistutor corpus: A speech corpus of indian l2 english learners for pronunciation assessment. In 2019 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), pages 1–6. IEEE, 2019. [20] Yu Chen, Jun Hu, and Xinyu Zhang. Sell-corpus: an open source multiple accented chinese-english speech corpus for l2 english learning assessment. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 7425–7429. IEEE, 2019. [21] Jazmín Vidal, Luciana Ferrer, and Leonardo Brambilla. Epadb: A database for development of pronunciation assessment systems. In INTERSPEECH, pages 589–593, 2019. 
[22] Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, Yukai Huang, Ke Li, Daniel Povey, and Yujun Wang. speechocean762: An open-source non-native english speech corpus for pronunciation assessment. arXiv preprint arXiv:2104.01378, 2021. [23] Mohammed Algabri, Hassan Mathkour, Mansour Alsulaiman, and Mohamed A Bencherif. Mispronunciation detection and diagnosis with articulatory-level feedback generation for non-native arabic speech. Mathematics, 10(15):2727, 2022. [24] Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali, Hamdy Mubarak, and Shazia Afzal. Speechblender: Speech augmentation framework for mispronunciation data generation. arXiv preprint arXiv:2211.00923, 2022. [25] Helen Meng, Yuen Yee Lo, Lan Wang, and Wing Yiu Lau. Deriving salient learners' mispronunciations from cross-language phonological comparisons. In 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), pages 437–442. IEEE, 2007. [26] Natalia Cylwik, Agnieszka Wagner, and Grazyna Demenko. The euronounce corpus of non-native polish for asr-based pronunciation tutoring system. In SLaTE, pages 85–88, 2009. [27] Mingxing Li, Shuang Zhang, Kun Li, Alissa M Harrison, Wai-Kit Lo, and Helen Meng. Design and collection of an l2 english corpus with a suprasegmental focus for chinese learners of english. In ICPhS, pages 1210–1213, 2011. [28] Huizhong Yang and Naixing Wei. Construction and data analysis of a chinese learner spoken english corpus, 2005. [29] Minglin Wu, Kun Li, Wai-Kim Leung, and Helen Meng. Transformer based end-to-end mispronunciation detection and diagnosis. In Interspeech, pages 3954–3958, 2021. [30] Agnieszka Wagner. Rhythmic structure of utterances in native and non-native polish. In Proceedings of speech prosody, pages 337–341, 2014. [31] Dean Luo, Xuesong Yang, and Lan Wang. Improvement of segmental mispronunciation detection with prior knowledge extracted from large l2 speech corpus. 
In Twelfth Annual Conference of the International Speech Communication Association, 2011. [32] Kun Li, Shaoguang Mao, Xu Li, Zhiyong Wu, and Helen Meng. Automatic lexical stress and pitch accent detection for l2 english speech using multi-distribution deep neural networks. Speech Communication, 96:28–36, 2018. [33] Guimin Huang, Qiupu Chen, and Hongtao Zhu. A mispronunciation detection method of confusing vowel pair for chinese students. In Journal of Physics: Conference Series, volume 1693, page 012102. IOP Publishing, 2020. [34] Mostafa Ali Shahin, Julien Epps, and Beena Ahmed. Automatic classification of lexical stress in english and arabic languages using deep learning. In Interspeech, pages 175–179, 2016. [35] Luciana Ferrer, Harry Bratt, Colleen Richey, Horacio Franco, Victor Abrash, and Kristin Precoda. Classification of lexical stress using spectral and prosodic features for computer-assisted language learning systems. Speech Communication, 69:31–45, 2015. [36] Yoon Kim, Horacio Franco, and Leonardo Neumeyer. Automatic pronunciation scoring of specific phone segments for language instruction. In Eurospeech, pages 645–648, 1997. [37] Feng Zhang, Chao Huang, Frank K Soong, Min Chu, and Renhua Wang. Automatic mispronunciation detection for mandarin. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 5077–5080. IEEE, 2008. [38] Jiatong Shi, Nan Huo, and Qin Jin. Context-aware goodness of pronunciation for computer-assisted pronunciation training. arXiv preprint arXiv:2008.08647, 2020. [39] Kavita Sheoran, Arpit Bajgoti, Rishik Gupta, Nishtha Jatana, Geetika Dhand, Charu Gupta, Pankaj Dadheech, Umar Yahya, and Nagender Aneja. Pronunciation scoring with goodness of pronunciation and dynamic time warping. IEEE Access, 11:15485–15495, 2023. [40] Wei Li, Nancy F Chen, Sabato Marco Siniscalchi, and Chin-Hui Lee. Improving mispronunciation detection for non-native learners with multisource information and lstm-based deep models. 
In Interspeech, pages 2759–2763, 2017. [41] Konstantinos Kyriakopoulos, Kate M Knill, and Mark JF Gales. A deep learning approach to assessing non-native pronunciation of english using phone distances. ISCA, 2018. [42] Wai-Kim Leung, Xunying Liu, and Helen Meng. Cnn-rnn-ctc based end-to-end mispronunciation detection and diagnosis. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 8132–8136. IEEE, 2019. [43] Chuanbo Zhu, Takuya Kunihara, Daisuke Saito, Nobuaki Minematsu, and Noriko Nakanishi. Automatic prediction of intelligibility of words and phonemes produced orally by japanese learners of english. In 2022 IEEE Spoken Language Technology Workshop (SLT), pages 1029–1036. IEEE, 2023. [44] Daniel Korzekwa, Roberto Barra-Chicote, Szymon Zaporowski, Grzegorz Beringer, Jaime Lorenzo-Trueba, Alicja Serafinowicz, Jasha Droppo, Thomas Drugman, and Bozena Kostek. Detection of lexical stress errors in non-native (l2) english with data augmentation and attention. arXiv preprint arXiv:2012.14788, 2020. [45] Qiang Gao, Shutao Sun, and Yaping Yang. Tonenet: A cnn model of tone classification of mandarin chinese. In Interspeech, pages 3367–3371, 2019. [46] Wei Li, Nancy F Chen, Sabato Marco Siniscalchi, and Chin-Hui Lee. Improving mispronunciation detection of mandarin tones for non-native learners with soft-target tone labels and blstm-based deep tone models. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(12):2012–2024, 2019. [47] Eesung Kim, Jae-Jin Jeon, Hyeji Seo, and Hoon Kim. Automatic pronunciation assessment using self-supervised speech representation learning. arXiv preprint arXiv:2204.03863, 2022. [48] Binghuai Lin and Liyuan Wang. Exploiting information from native data for non-native automatic pronunciation assessment. In 2022 IEEE Spoken Language Technology Workshop (SLT), pages 708–714. IEEE, 2023. [49] Mu Yang, Kevin Hirschi, Stephen D Looney, Okim Kang, and John HL Hansen. 
Improving mispronunciation detection with wav2vec2-based momentum pseudo-labeling for accentedness and intelligibility assessment. arXiv preprint arXiv:2203.15937, 2022. [50] Institut National des Langues et Civilisations Orientales (INALCO). Home page. https://www.inalco.fr/, 2024. Accessed: 2024-10-25. [51] Python Software Foundation. Tkinter – Python interface to Tcl/Tk. https://docs.python.org/3/library/tkinter.html, 2024. Accessed: 2024-10-25. [52] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. nature, 521(7553):436–444, 2015. [53] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE transactions on neural networks, 5(2):157–166, 1994. [54] Hasim Sak, Andrew W Senior, and Françoise Beaufays. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. 2014. [55] Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, et al. A comparative study on transformer vs rnn in speech applications. In 2019 IEEE automatic speech recognition and understanding workshop (ASRU), pages 449–456. IEEE, 2019. [56] Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, et al. Self-supervised speech representation learning: A review. IEEE Journal of Selected Topics in Signal Processing, 16(6):1179–1210, 2022. [57] Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence, 35(8):1798–1828, 2013. [58] Herve Jegou, Matthijs Douze, and Cordelia Schmid. Product quantization for nearest neighbor search. IEEE transactions on pattern analysis and machine intelligence, 33(1):117–128, 2010. 
[59] Vassil Panayotov, Guoguo Chen, Daniel Povey, and Sanjeev Khudanpur. Librispeech: an asr corpus based on public domain audio books. In 2015 IEEE international conference on acoustics, speech and signal processing (ICASSP), pages 5206–5210. IEEE, 2015. [60] Ankita Pasad, Ju-Chieh Chou, and Karen Livescu. Layer-wise analysis of a self-supervised speech representation model. In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pages 914–921. IEEE, 2021. [61] Leonardo Pepino, Pablo Riera, and Luciana Ferrer. Emotion recognition from speech using wav2vec 2.0 embeddings. Proc. Interspeech 2021, page 3400–3404, 2021. [62] Zhou Yu, Vikram Ramanarayanan, David Suendermann-Oeft, Xinhao Wang, Klaus Zechner, Lei Chen, Jidong Tao, Aliaksei Ivanou, and Yao Qian. Using bidirectional lstm recurrent neural networks to learn high-level abstractions of sequential features for automated scoring of non-native spontaneous speech. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 338–345. IEEE, 2015. [63] Lei Chen, Jidong Tao, Shabnam Ghaffarzadegan, and Yao Qian. End-to-end neural network based automated speech scoring. In 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP), pages 6234–6238. IEEE, 2018. [64] Israel Cohen, Yiteng Huang, Jingdong Chen, Jacob Benesty, Jacob Benesty, Jing-dong Chen, Yiteng Huang, and Israel Cohen. Pearson correlation coefficient. Noise reduction in speech processing, pages 1–4, 2009. [65] SciPy developers. pearsonr – SciPy v1.14.1 manual. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pearsonr.html, 2024. Accessed: 2024-11-01. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97312 | - |
| dc.description.abstract | 電腦輔助發音訓練在近年來有長足的發展,其有兩個主要用途,發音評分與教學,而影響發音的因素有準確度、語調、韻律等,這些都是較難量化的標準,除此之外,華語發音的文獻更是稀少且缺少資料集,大部分都是針對英語的研究。為了解決這問題,本論文有主要的兩個貢獻,一是建立公開、開源、專業的華語語音資料集,此資料集收集了法籍學習者的語料,並經由有華語教學背景、相關實習經驗的專家進行評分,此外,也一同開源評分系統,可供未來想建立更龐大的語料庫的研究者做參考。透過此資料集,學者們能做模型訓練,並針對華語發音評分做進一步的改進與優化。其次,本論文另一個貢獻在於應用了自督導式學習的機器學習模型在此資料集,在資料有限的情況下能有不錯的準確率,也比較了不同的自督導式學習模型和傳統方法做特徵萃取,並結合我們為語音設計的深度學習模型做訓練,提供了基準讓後續的研究者能夠做參考。總體而言,本研究不僅建立了開源、專業的華語語音資料集,讓研究者能應用此資料集做更多相關的研究,同時也開源評分系統,希望能讓更多人投入在華語語料的建立,最後透過應用了自督導式學習的機器學習模型,在此資料集上有不錯的表現,為後續的研究建立的基準。 | zh_TW |
| dc.description.abstract | This thesis addresses recent advancements in computer-aided pronunciation training, highlighting its two primary applications: pronunciation scoring and instructional support. Factors such as accuracy, intonation, and prosody significantly affect pronunciation, but these elements are difficult to quantify. In addition, the literature on Mandarin pronunciation is notably scarce and lacks substantial datasets, as most research focuses on English.
To address these issues, this thesis makes two main contributions. First, it establishes an open-source, professionally curated Mandarin speech dataset, which contains recordings from French learners and has been scored by experts with backgrounds in Mandarin teaching and relevant practical experience. In addition, an open-source scoring system has been developed to serve as a reference for researchers aiming to build larger corpora. This dataset allows scholars to train models and to further improve and optimize Mandarin pronunciation assessment. Second, the thesis applies self-supervised learning models to this dataset, achieving commendable accuracy despite the limited data available. It compares different self-supervised learning models and traditional feature-extraction methods, integrating them with a deep learning model designed specifically for speech, and thereby provides a benchmark for future researchers to reference. Overall, this study establishes an open-source, professional Mandarin speech dataset, facilitating further research in this area; it also releases the scoring system, encouraging more contributions to the development of Mandarin corpora; and the successful application of self-supervised learning models on this dataset establishes a strong baseline for subsequent research. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-04-24T16:05:35Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-04-24T16:05:35Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification i
Acknowledgements iii
Chinese Abstract (摘要) v
Abstract vii
Contents ix
List of Figures xiii
List of Tables xv
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 3
1.3 Research Purposes and Main Contributions 4
1.4 Thesis Overview 5
Chapter 2 Related Literature 7
2.1 Mandarin Phonological Structure 7
2.1.1 Consonants 7
2.1.2 Vowels 10
2.1.3 Tone and Intonation 11
2.2 Research Related to the Pronunciation Assessment Corpora 13
2.3 Research Related to the Pronunciation Assessment Algorithms 17
2.3.1 SVM and GMM in Pronunciation Assessment 17
2.3.2 GOP-Related Algorithms 18
2.3.3 End-to-End Pronunciation Assessment Algorithms 19
2.4 Chapter Summary 21
Chapter 3 Pronunciation Corpus Design for Mandarin Learners 23
3.1 Scripts 24
3.2 Speakers 25
3.3 Expert Annotations 25
3.4 Scoring Assistance System 28
3.5 Scoring Results 30
3.6 Chapter Summary 34
Chapter 4 Deep Learning-Based Pronunciation Assessment Algorithm 37
4.1 Introduction of Deep Learning 37
4.2 Multilayer Perceptrons (MLPs) 38
4.3 Long Short-Term Memory (LSTM) 40
4.4 Transformer 42
4.5 Self-supervised Learning 45
4.6 Chapter Summary 47
Chapter 5 Experiments and Methods 49
5.1 Experimental Framework 49
5.2 Training and Test Set 50
5.3 Model Architecture 51
5.3.1 Speech Encoder 52
5.3.2 Text Encoder 54
5.3.3 Training Module 54
5.4 Experimental Setup 56
5.4.1 Hardware Configuration and Software Dependencies 56
5.4.2 Optimization 57
5.5 Chapter Summary 58
Chapter 6 Experimental Results and Analysis 61
6.1 Experimental Evaluation Metrics 61
6.1.1 Mean Squared Error (MSE) 62
6.1.2 Pearson Correlation Coefficient (PCC) 62
6.2 Comparison Among Different Feature Extractions 63
6.2.1 Overall Performance with Different Feature Extraction 64
6.2.2 Feature Extraction with MFCC 64
6.2.3 Feature Extraction with Wav2Vec 2.0 Large-960h 68
6.2.4 Feature Extraction with Wav2Vec 2.0 Large-XLSR-53 71
6.2.5 Feature Extraction with Wav2Vec 2.0 Large-960h Tiered Feature Aggregation 75
6.3 Discussion and Analysis 79
6.4 Chapter Summary 82
Chapter 7 Conclusions and Future Work 83
7.1 Conclusions 83
7.2 Future Work 84
References 87
Appendix A — Scripts for the OMPAL Corpus 97
A.1 Text 1 of Level 1 97
A.2 Text 1 of Level 2 98
A.3 Text 2 of Level 2 100
Appendix B — Grading Instructions 105
B.1 English Version 105
B.2 Chinese Version 106 | - |
| dc.language.iso | en | - |
| dc.subject | 第二語言 | zh_TW |
| dc.subject | 語料庫 | zh_TW |
| dc.subject | 電腦輔助發音訓練 | zh_TW |
| dc.subject | 華語 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | Mandarin | en |
| dc.subject | second language (L2) | en |
| dc.subject | deep learning | en |
| dc.subject | computer-aided pronunciation training (CAPT) | en |
| dc.subject | corpus | en |
| dc.title | 應用自督導式學習模型於自動化華語發音評估系統 | zh_TW |
| dc.title | Automatic Mandarin Pronunciation Assessment Applying Self-supervised Learning Models | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.coadvisor | 劉德馨 | zh_TW |
| dc.contributor.coadvisor | Te-Hsin Liu | en |
| dc.contributor.oralexamcommittee | 江振宇;李宏毅 | zh_TW |
| dc.contributor.oralexamcommittee | Chen-Yu Chiang;Hung-yi Lee | en |
| dc.subject.keyword | 語料庫,華語,第二語言,深度學習,電腦輔助發音訓練 | zh_TW |
| dc.subject.keyword | corpus, Mandarin, second language (L2), deep learning, computer-aided pronunciation training (CAPT) | en |
| dc.relation.page | 107 | - |
| dc.identifier.doi | 10.6342/NTU202500780 | - |
| dc.rights.note | Authorization granted (open access worldwide) | - |
| dc.date.accepted | 2025-03-20 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 電信工程學研究所 | - |
| dc.date.embargo-lift | 2025-04-25 | - |
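The record's table of contents (Chapter 6.1) and cited references [64] and [65] identify mean squared error (MSE) and the Pearson correlation coefficient (PCC) as the thesis's evaluation metrics for comparing model scores against expert annotations. As a minimal illustrative sketch (not code from the thesis; the sample score values below are hypothetical), both metrics can be computed in pure Python:

```python
import math

def mse(y_true, y_pred):
    # Mean squared error between expert scores and model predictions.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def pcc(x, y):
    # Pearson correlation coefficient; same quantity as scipy.stats.pearsonr
    # (cited as reference [65] above) returns as its statistic.
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

# Hypothetical expert scores vs. model predictions on a 0-5 scale.
expert = [4.0, 3.5, 2.0, 5.0, 3.0]
model = [3.8, 3.6, 2.4, 4.7, 3.1]
print(f"MSE = {mse(expert, model):.3f}")  # low is better
print(f"PCC = {pcc(expert, model):.3f}")  # close to 1 is better
```

A lower MSE indicates predictions closer to the expert scores in absolute terms, while a PCC near 1 indicates that the model ranks learners consistently with the experts even if its scale is shifted.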
| Appears in Collections: | 電信工程學研究所 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf | 9.21 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
