NTU Theses and Dissertations Repository › 電機資訊學院 › 電信工程學研究所
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92675
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 王鈺強 | zh_TW
dc.contributor.advisor | Yu-Chiang Frank Wang | en
dc.contributor.author | 劉亦傑 | zh_TW
dc.contributor.author | I-Jieh Liu | en
dc.date.accessioned | 2024-06-04T16:06:22Z | -
dc.date.available | 2024-06-05 | -
dc.date.copyright | 2024-06-04 | -
dc.date.issued | 2024 | -
dc.date.submitted | 2024-05-30 | -
dc.identifier.citation:

[1] J. Chen, W. Xu, S. Guo, J. Wang, J. Zhang, and H. Wang. FedTune: A deep dive into efficient federated fine-tuning with pre-trained transformers. arXiv preprint arXiv:2211.08025, 2022.

[2] T. Chen, L. Lin, R. Chen, X. Hui, and H. Wu. Knowledge-guided multi-label few-shot learning for general image recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(3):1371–1384, 2020.

[3] T. Chen, M. Xu, X. Hui, H. Wu, and L. Lin. Learning semantic-specific graph representation for multi-label image recognition. In Proceedings of the IEEE/CVF international conference on computer vision, pages 522–531, 2019.

[4] Z.-M. Chen, X.-S. Wei, P. Wang, and Y. Guo. Multi-label image recognition with graph convolutional networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5177–5186, 2019.

[5] Y. J. Cho, J. Wang, and G. Joshi. Client selection in federated learning: Convergence analysis and power-of-choice selection strategies. arXiv preprint arXiv:2010.01243, 2020.

[6] G. Cohen, S. Afshar, J. Tapson, and A. Van Schaik. EMNIST: Extending MNIST to handwritten letters. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 2921–2926. IEEE, 2017.

[7] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

[8] M. Everingham, S. A. Eslami, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes challenge: A retrospective. International Journal of Computer Vision, 111:98–136, 2015.

[9] L. Gao, H. Fu, L. Li, Y. Chen, M. Xu, and C.-Z. Xu. FedDC: Federated learning with non-IID data via local drift decoupling and correction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10112–10121, 2022.

[10] T. Guo, S. Guo, J. Wang, X. Tang, and W. Xu. PromptFL: Let federated participants cooperatively learn prompts instead of models - federated learning in age of foundation model. IEEE Transactions on Mobile Computing, 2023.

[11] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

[12] W. Huang, M. Ye, and B. Du. Learn from others and be yourself in heterogeneous federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10143–10153, 2022.

[13] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456. PMLR, 2015.

[14] S. P. Karimireddy, S. Kale, M. Mohri, S. Reddi, S. Stich, and A. T. Suresh. SCAFFOLD: Stochastic controlled averaging for federated learning. In International Conference on Machine Learning, pages 5132–5143. PMLR, 2020.

[15] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[16] A. Krizhevsky, G. Hinton, et al. Learning multiple layers of features from tiny images. 2009.

[17] J. Lanchantin, T. Wang, V. Ordonez, and Y. Qi. General multi-label image classification with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16478–16488, 2021.

[18] Q. Li, B. He, and D. Song. Model-contrastive federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10713–10722, 2021.

[19] T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V. Smith. Federated optimization in heterogeneous networks. Proceedings of Machine learning and systems, 2:429–450, 2020.

[20] X. Li, K. Huang, W. Yang, S. Wang, and Z. Zhang. On the convergence of fedavg on non-iid data. In International Conference on Learning Representations, 2019.

[21] X. Li, M. Jiang, X. Zhang, M. Kamp, and Q. Dou. FedBN: Federated learning on non-IID features via local batch normalization. In International Conference on Learning Representations, 2021.

[22] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, pages 740–755. Springer, 2014.

[23] S. Liu, L. Zhang, X. Yang, H. Su, and J. Zhu. Query2Label: A simple transformer way to multi-label classification. arXiv preprint arXiv:2107.10834, 2021.

[24] M. Luo, F. Chen, D. Hu, Y. Zhang, J. Liang, and J. Feng. No fear of heterogeneity: Classifier calibration for federated learning with non-iid data. Advances in Neural Information Processing Systems, 34:5972–5984, 2021.

[25] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas. Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics, pages 1273–1282. PMLR, 2017.

[26] M. Mendieta, T. Yang, P. Wang, M. Lee, Z. Ding, and C. Chen. Local learning matters: Rethinking data heterogeneity in federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8397–8406, 2022.

[27] J. Pennington, R. Socher, and C. D. Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, 2014.

[28] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.

[29] T. Ridnik, E. Ben-Baruch, N. Zamir, A. Noy, I. Friedman, M. Protter, and L. Zelnik-Manor. Asymmetric loss for multi-label classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 82–91, 2021.

[30] C. Song, F. Granqvist, and K. Talwar. FLAIR: Federated learning annotated image repository. Advances in Neural Information Processing Systems, 35:37792–37805, 2022.

[31] S. Su, M. Yang, B. Li, and X. Xue. Cross-domain federated adaptive prompt tuning for CLIP. arXiv preprint arXiv:2211.07864, 2022.

[32] G. Sun, M. Mendieta, T. Yang, and C. Chen. Exploring parameter-efficient finetuning for improving communication efficiency in federated learning. arXiv preprint arXiv:2210.01708, 2022.

[33] Y. Tan, G. Long, L. Liu, T. Zhou, Q. Lu, J. Jiang, and C. Zhang. FedProto: Federated prototype learning over heterogeneous devices. arXiv preprint arXiv:2105.00243, 2021.

[34] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.

[35] J. Wang, Q. Liu, H. Liang, G. Joshi, and H. V. Poor. Tackling the objective inconsistency problem in heterogeneous federated optimization. Advances in neural information processing systems, 33:7611–7623, 2020.

[36] W. Zhuang, X. Gan, Y. Wen, S. Zhang, and S. Yi. Collaborative unsupervised visual representation learning from decentralized data. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4912–4921, 2021.
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92675 | -
dc.description.abstract | 聯邦學習(FL)是一種新興的模型學習框架,該方法使多個用戶能夠在不共享私人數據的情況下協作訓練一個強大的模型,以保護隱私。大多數現有的FL方法僅考慮傳統的單標籤圖像分類問題,忽略了將任務轉移到多標籤圖像分類時的影響。然而,在現實世界的FL場景中,FL仍然難以應對用戶在本地數據分佈上的異質性,而在多標籤圖像分類中,這個問題變得更加嚴重。有鑑於變換器模型於中央化學習設定下之成功經驗,我們於是提出了一個新穎的FL框架用於多標籤分類。由於在訓練過程中本地客戶可能觀察到部分標籤之間的相關性,直接聚合本地更新的模型將不能產生滿意的全局模型效能。因此,我們提出了一個新穎的FL框架「語言引導之變換器」(FedLGT)來應對這一具有挑戰性的任務,旨在利用、傳輸不同客戶間的知識以學習一個強大的全局模型。根據我們於各種多標籤數據集(例如 FLAIR,MS-COCO等)進行大量實驗,我們展示了FedLGT能夠達到足夠好的性能,在多標籤FL場景下顯著優於標準FL技術。 | zh_TW
dc.description.abstract | Federated Learning (FL) is an emerging paradigm that enables multiple users to collaboratively train a robust model in a privacy-preserving manner without sharing their private data. Most existing FL approaches consider only traditional single-label image classification, ignoring the impact of transferring the task to multi-label image classification. Moreover, it is still challenging for FL to deal with user heterogeneity in local data distributions in real-world FL scenarios, and this issue becomes even more severe in multi-label image classification. Inspired by the recent success of Transformers in centralized settings, we propose a novel FL framework for multi-label classification. Since only partial label correlations may be observed by each local client during training, direct aggregation of locally updated models would not produce satisfactory performance. Thus, we propose Language-Guided Transformer (FedLGT), a novel FL framework that tackles this challenging task by exploiting and transferring knowledge across different clients to learn a robust global model. Through extensive experiments on various multi-label datasets (e.g., FLAIR, MS-COCO, etc.), we show that FedLGT achieves satisfactory performance and outperforms standard FL techniques under multi-label FL scenarios. Code is available at https://github.com/Jack24658735/FedLGT. | en
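For context, the abstract contrasts FedLGT with "direct aggregation of locally updated models", i.e. the standard FedAvg scheme of McMahan et al. (reference [25]). Below is a minimal sketch of that baseline aggregation step, not of FedLGT itself; all variable and parameter names are illustrative.

```python
# FedAvg-style aggregation (McMahan et al., ref [25]): average each parameter
# across clients, weighted by local dataset size. Illustrative sketch only;
# real implementations operate on model state dicts of tensors.

def fedavg(client_weights, client_sizes):
    """Combine per-client parameter dicts into a global model.

    client_weights: list of {param_name: value} dicts, one per client.
    client_sizes: list of local dataset sizes (aggregation weights).
    """
    total = sum(client_sizes)
    global_weights = {}
    for key in client_weights[0]:
        # Size-weighted average of this parameter over all clients.
        global_weights[key] = sum(
            w[key] * (n / total) for w, n in zip(client_weights, client_sizes)
        )
    return global_weights

# Two clients with a single scalar "parameter" for illustration:
clients = [{"layer.w": 1.0}, {"layer.w": 3.0}]
sizes = [10, 30]
print(fedavg(clients, sizes))  # {'layer.w': 2.5}, weighted toward the larger client
```

The abstract's point is that this size-weighted averaging alone is insufficient under multi-label FL, because each client sees only partial label correlations; FedLGT addresses this with client-aware masked label embeddings and universal label embeddings (Sections 3.3.1 and 3.3.2).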
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-06-04T16:06:22Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2024-06-04T16:06:22Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents:

Acknowledgements
摘要
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Works
  2.1 Federated Learning
  2.2 Multi-Label Image Classification
    2.2.1 Centralized Learning
    2.2.2 Federated Learning
Chapter 3 Proposed Method
  3.1 Problem Formulation and Setup
  3.2 A Brief Review of C-Tran
  3.3 Federated Language-Guided Transformer
    3.3.1 Client-Aware Masked Label Embedding
    3.3.2 Universal Label Embedding
  3.4 Training of FedLGT
Chapter 4 Experiments
  4.1 Experimental Setup
  4.2 Performance Results and Analysis
  4.3 Ablation Studies
Chapter 5 Conclusion
References
Appendix A — Introduction
  A.1 Evaluation Metrics
  A.2 More Ablation Studies and Experiments
    A.2.1 Parameters Analysis for CA-MLE
    A.2.2 Choices of Label Embedding
    A.2.3 Discussion of Input Label Embedding
    A.2.4 Per-class Results on FLAIR
    A.2.5 Different Client Sampling Strategies
    A.2.6 Ablation Study for Fine-grained FLAIR
    A.2.7 Larger Rounds Results on FLAIR
dc.language.iso | en | -
dc.subject | 視覺與語言 | zh_TW
dc.subject | 多標籤分類 | zh_TW
dc.subject | 聯邦學習 | zh_TW
dc.subject | Vision and Language | en
dc.subject | Federated Learning | en
dc.subject | Multi-label Classification | en
dc.title | 語言引導之變換器於聯邦式多標籤分類 | zh_TW
dc.title | Language-Guided Transformer for Federated Multi-Label Classification | en
dc.type | Thesis | -
dc.date.schoolyear | 112-2 | -
dc.description.degree | 碩士 (Master) | -
dc.contributor.oralexamcommittee | 陳祝嵩;楊福恩 | zh_TW
dc.contributor.oralexamcommittee | Chu-Song Chen;Fu-En Yang | en
dc.subject.keyword | 聯邦學習,多標籤分類,視覺與語言 | zh_TW
dc.subject.keyword | Federated Learning,Multi-label Classification,Vision and Language | en
dc.relation.page | 36 | -
dc.identifier.doi | 10.6342/NTU202401012 | -
dc.rights.note | 同意授權(全球公開) (Authorized; open access worldwide) | -
dc.date.accepted | 2024-05-30 | -
dc.contributor.author-college | 電機資訊學院 | -
dc.contributor.author-dept | 電信工程學研究所 | -
Appears in Collections: 電信工程學研究所

Files in This Item:
File | Size | Format
ntu-112-2.pdf | 3.61 MB | Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.

Community Links
Contact Information
No. 1, Sec. 4, Roosevelt Rd., Taipei 10617, Taiwan (R.O.C.)
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
Feedback
Related Links
Library Catalog
MetaCat (integrated domestic library search)
NTU Scholars
NTU Library Digital Archives
Site Disclaimer
© NTU Library All Rights Reserved