Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101162
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 張智星 | zh_TW
dc.contributor.advisor | Jyh-Shing Roger Jang | en
dc.contributor.author | 陳威宇 | zh_TW
dc.contributor.author | Wei-Yu Chen | en
dc.date.accessioned | 2025-12-31T16:10:02Z | -
dc.date.available | 2026-01-01 | -
dc.date.copyright | 2025-12-31 | -
dc.date.issued | 2025 | -
dc.date.submitted | 2025-12-15 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101162 | -
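dc.description.abstract | This study proposes a sound-based pig health detection system that uses acoustic events to assess respiratory conditions and enable early disease detection in livestock farming. We capture breathing sounds with a stethoscope placed on the pig's back and process the audio using Mel spectrogram analysis combined with machine learning techniques. The study addresses several challenges in acoustic monitoring for livestock farming: noisy audio environments, scarce and class-imbalanced breathing data, and limited labeled data. To overcome these limitations, we propose a prototypical network approach that combines k-means clustering with a query attention mechanism, and we evaluate traditional machine learning models, including Gaussian mixture models (GMM), hidden Markov models (HMM), and gradient boosting algorithms (XGBoost, LightGBM), as comparison models. Our experimental framework evaluates multiple encoder architectures, from traditional feature-based methods to complex deep learning models, including convolutional neural networks (CNN), Transformers, and four pretrained acoustic models. Although the proposed methods fall short of the baseline, the k-means prototypical network with query attention still performs well and achieves results similar to the baseline model, showing that the model distinguishes different respiratory patterns and enables non-invasive real-time health monitoring. | zh_TW (translated)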
dc.description.abstract | This study presents a sound-based health monitoring system for pigs that uses acoustic event detection to assess respiratory conditions and enable early disease detection in livestock farming. The system captures breathing sounds through stethoscopes placed on pigs' backs and processes the audio using Mel spectrogram analysis combined with machine learning techniques. We propose prototypical networks combined with k-means clustering and attention mechanisms, and compare them with traditional machine learning models, including Gaussian mixture models, hidden Markov models, and gradient boosting algorithms. Our experimental framework evaluates multiple encoder architectures, ranging from traditional feature-based approaches to deep learning models, including convolutional neural networks, Transformers, and pretrained audio models. The proposed prototypical network approaches perform worse than the deep learning baseline, and the linear encoder baseline achieves the best overall classification results. However, when K is greater than 2, the k-means prototypical network with query attention gives more stable results than the same network without query attention, which suggests that query attention helps stabilize the K prototypes. The system distinguishes between different respiratory patterns, enabling non-invasive real-time health monitoring that reduces animal stress while improving disease detection efficiency. | en
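The abstract describes prototypical networks whose class prototypes come from k-means clustering, optionally combined with query attention. The record does not give the thesis's exact formulation, so the following is only a minimal NumPy sketch under stated assumptions: the embeddings are synthetic 2-D points standing in for encoder outputs, `kmeans`, `class_prototypes`, `attended_prototype`, and `classify` are hypothetical helper names, and the attention step (a softmax over negative squared distances between the query and each prototype) is one plausible reading rather than the author's method.

```python
import numpy as np


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, dim) array of centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def class_prototypes(support, labels, k):
    """Build k prototypes per class by clustering that class's embeddings."""
    return {int(c): kmeans(support[labels == c], k) for c in np.unique(labels)}


def attended_prototype(query, protos):
    """ASSUMPTION: attention = softmax over negative squared distances."""
    d = ((protos - query) ** 2).sum(axis=-1)
    w = np.exp(-(d - d.min()))  # subtract the minimum for numerical stability
    w /= w.sum()
    return w @ protos  # query-specific weighted average of the k prototypes


def classify(query, prototypes, use_attention=False):
    """Assign the query to the class whose prototype is closest."""
    best_class, best_dist = None, np.inf
    for c, protos in prototypes.items():
        if use_attention:
            dist = ((attended_prototype(query, protos) - query) ** 2).sum()
        else:
            dist = ((protos - query) ** 2).sum(axis=-1).min()
        if dist < best_dist:
            best_class, best_dist = c, dist
    return best_class


# Synthetic stand-in data: class 0 has two modes, class 1 has one.
rng = np.random.default_rng(0)
support = np.vstack([
    rng.normal([0.0, 0.0], 0.3, size=(10, 2)),
    rng.normal([10.0, 0.0], 0.3, size=(10, 2)),
    rng.normal([5.0, 5.0], 0.3, size=(20, 2)),
])
labels = np.array([0] * 20 + [1] * 20)
prototypes = class_prototypes(support, labels, k=2)
query = np.array([9.5, 0.2])
print(classify(query, prototypes), classify(query, prototypes, use_attention=True))
```

With a single mean prototype, a class with several distinct sound patterns is summarised by a point that may lie between its modes; keeping K centroids per class is one way to avoid that, which is the motivation this sketch is meant to illustrate.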
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-12-31T16:10:02Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2025-12-31T16:10:02Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Oral Examination Committee Certification
Acknowledgements
Abstract
Abstract (Chinese)
Contents
List of Tables
List of Figures
1 Introduction
1.1 Background
1.2 Research Objectives and Contributions
1.3 Outline
2 Related Work
2.1 Sound Classification
2.1.1 Sound Classification with CNNs
2.1.2 Large Scale Dataset for Audio Events
2.1.3 Sound Event Detection
2.2 Animal Sounds Classification
2.2.1 Pig Sound Classification
2.2.2 Transfer Learning with Pretrained Models
2.3 Few-shot Prototypical Network
2.3.1 Prototypical Networks with Few-Shot Learning
2.3.2 A Study of Few-Shot Audio Classification
2.4 Pretrained Models
2.4.1 VGGish
2.4.2 Whisper
2.4.3 Audio Spectrogram Transformer
3 Methodology
3.1 Modification of Prototypical Network
3.1.1 K-means Prototypical Network
3.1.2 Query Attention
3.1.3 K-means Attention Prototypical Network (KAPN)
3.2 Data Preprocessing and Sampling
3.2.1 Audio Loading, Resampling, and Segmentation
3.2.2 Data Augmentation for Class Imbalance
3.3 Models and Evaluation Framework
3.4 Summary
4 Experiments
4.1 Dataset
4.1.1 Pig Breathing Sound Dataset
4.1.2 ESC50
4.2 Feature Extraction
4.2.1 Frame Features for Traditional Models
4.2.2 Spectrogram-based Features for Deep Learning Models
4.3 Evaluation Metrics
4.3.1 Macro F1 Score
4.3.2 Accuracy
4.4 Experimental Environment
4.5 Experimental Parameter Setting
4.5.1 Traditional Machine Learning Models
4.5.2 Deep Learning Model
4.6 Loss Functions
4.6.1 Loss Functions for Baseline Models
4.6.2 Loss Function for Prototypical Networks
4.7 Experiment Roadmap
4.7.1 Exp.1: Traditional Baseline
4.7.2 Exp.2: Deep Learning Baseline
4.7.3 General Settings for Prototypical Network Experiments
4.7.4 Exp.3: Encoder with Prototypical Network
4.7.5 Exp.4: K-means Prototypical Network
4.7.6 Exp.5: Query Attention K-means Prototypical Network
5 Results and Discussions
5.1 Exp.1: Traditional Baseline
5.1.1 Results
5.1.2 Observations
5.2 Exp.2: Deep Learning Baseline
5.2.1 Results
5.2.2 Observations
5.3 Exp.3: Encoder Prototypical Network
5.3.1 Results
5.3.2 Observations
5.4 Exp.4: K-means Prototypical Network
5.4.1 Results
5.4.2 Observations
5.5 Exp.5: Query Attention K-means Prototypical Network
5.6 Summary and Discussion
6 Conclusions and Future Work
6.1 Research Summary
6.2 Future Work
References
| -
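The contents above list Macro F1 Score and Accuracy as the evaluation metrics, and the abstract mentions class imbalance in the breathing data. As a rough illustration of why the two metrics can disagree on imbalanced data, here is a small NumPy sketch using the standard definitions; the function names and the toy labels are assumptions for illustration and are not taken from the thesis. In this sketch the set of classes is taken from `y_true` only.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over the classes in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))


# Toy imbalanced case: 8 samples of class 0, 2 of class 1,
# and a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

In this toy case accuracy is 0.8 while macro F1 is 4/9 (about 0.44), because the minority class contributes an F1 of zero and every class is weighted equally.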
dc.language.iso | en | -
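dc.subject | pig respiratory disease detection (豬隻呼吸疾病檢測) | -
dc.subject | acoustic event detection (聲學事件檢測) | -
dc.subject | prototypical networks (原型網路) | -
dc.subject | k-means algorithm (k-平均演算法) | -
dc.subject | attention mechanism (注意力機制) | -
dc.subject | machine learning (機器學習) | -
dc.subject | deep learning (深度學習) | -
dc.subject | pig respiratory health | -
dc.subject | acoustic event detection | -
dc.subject | prototypical networks | -
dc.subject | k-means clustering | -
dc.subject | attention mechanisms | -
dc.subject | machine learning | -
dc.subject | deep learning | -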
dc.title | 透過聲紋事件偵測豬隻健康狀況 | zh_TW
dc.title | Detecting Pig Health Conditions through Acoustic Event Detection | en
dc.type | Thesis | -
dc.date.schoolyear | 114-1 | -
dc.description.degree | Master's (碩士) | -
dc.contributor.oralexamcommittee | 王新民; 陳冠宇 | zh_TW
dc.contributor.oralexamcommittee | Hsin-Min Wang; Kuan-Yu Chen | en
dc.subject.keyword | 豬隻呼吸疾病檢測, 聲學事件檢測, 原型網路, k-平均演算法, 注意力機制, 機器學習, 深度學習 | zh_TW
dc.subject.keyword | pig respiratory health, acoustic event detection, prototypical networks, k-means clustering, attention mechanisms, machine learning, deep learning | en
dc.relation.page | 58 | -
dc.identifier.doi | 10.6342/NTU202502218 | -
dc.rights.note | Authorization granted, open access worldwide (同意授權(全球公開)) | -
dc.date.accepted | 2025-12-15 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | -
dc.contributor.author-dept | Department of Computer Science and Information Engineering (資訊工程學系) | -
dc.date.embargo-lift | 2026-01-01 | -
Appears in Collections: Department of Computer Science and Information Engineering (資訊工程學系)

Files in This Item:
File | Size | Format
ntu-114-1.pdf | 2.6 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
