Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99169

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 簡韶逸 | zh_TW |
| dc.contributor.advisor | Shao-Yi Chien | en |
| dc.contributor.author | 張祐綸 | zh_TW |
| dc.contributor.author | Yu-Lun Chang | en |
| dc.date.accessioned | 2025-08-21T16:39:35Z | - |
| dc.date.available | 2025-08-22 | - |
| dc.date.copyright | 2025-08-21 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-08-01 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99169 | - |
| dc.description.abstract | 為了滿足邊緣裝置上即時且低延遲執行電腦視覺任務的需求,如何高效部署多任務學習(MTL)模型於邊雲協作架構中,成為當前的重要挑戰。由於邊緣裝置的運算資源有限,若僅在邊緣端執行複雜的多任務模型,或完全依賴雲端推論,往往難以在準確率與延遲之間取得理想的平衡。為此,本文提出 SplitNAS,一個統一式架構搜尋框架,可聯合探索任務專屬的分支結構與模型切割點,以支援協同式的邊雲部署。
SplitNAS 採用可微分的神經架構搜尋(Neural Architecture Search)流程,聯合優化任務分支結構與模型切割位置。為了平衡預測效能與系統延遲,SplitNAS 設計了延遲感知的損失函數,綜合考量邊緣運算、雲端推論與資料傳輸的成本。在此基礎上,模型切割點進一步結合 autoencoder 壓縮模組以降低中間特徵的傳輸負擔,並應用知識蒸餾(KD)技術以提升模型準確率。在 PASCAL-Context 與 NYUD-v2 兩個資料集上的實驗結果顯示,SplitNAS 能在不同頻寬與硬體條件下取得優異的準確率與延遲權衡。與純邊緣端或純雲端推論相比,SplitNAS 在 MobileNetV2 與 ResNet34 架構下皆可減少超過 50% 的總體延遲,展現其於實際應用場景中的效能與實用價值。 | zh_TW |
| dc.description.abstract | To meet the demand for real-time and low-latency execution of computer vision tasks on edge devices, efficient deployment of multi-task learning (MTL) models in edge-cloud collaborative settings has become a critical challenge. Due to the limited computational capacity of edge devices, deploying complex multi-task models solely on the edge or fully on the cloud often results in suboptimal trade-offs between accuracy and latency. To address this, we propose SplitNAS, a unified framework that jointly searches for task-specific branching architectures and optimal partition points for collaborative edge-cloud deployment.
SplitNAS adopts a differentiable neural architecture search (NAS) process to jointly optimize task-specific branching structures and model partition points. To balance prediction performance and system latency, it integrates latency-aware loss functions that consider both edge and cloud execution costs, as well as transmission overhead. On top of this, an autoencoder-based compression module is introduced at the partition point to reduce feature transmission cost, and knowledge distillation (KD) is applied to improve the accuracy of the model. Experimental results on the PASCAL-Context and NYUD-v2 datasets demonstrate that SplitNAS achieves superior accuracy-latency trade-offs under various bandwidth and hardware conditions. Compared to pure edge or cloud inference, SplitNAS reduces total latency by over 50% on both MobileNetV2 and ResNet34 backbones, highlighting its effectiveness and practical value for real-world deployment. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-21T16:39:34Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-21T16:39:35Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Master’s Thesis Acceptance Certificate i
Acknowledgement iii
Chinese Abstract v
Abstract vii
Contents ix
List of Figures xiii
List of Tables xv
1 Introduction 1
1.1 Multi-task Learning 1
1.2 Edge-cloud Collaboration 2
1.3 Challenges 4
1.4 Contribution 5
1.5 Thesis Organization 7
2 Related Work 9
2.1 Branched Multi-task Learning Network 9
2.2 Neural Architecture Search 12
2.2.1 Foundations of NAS 12
2.2.2 Differentiable Neural Architecture Search 12
2.2.3 Multi-task Learning in NAS 13
2.3 Edge-Cloud Collaboration 14
2.4 Knowledge Distillation 17
3 Proposed Method 19
3.1 Overview of the Framework 20
3.1.1 Problem Formulation 20
3.1.2 System Pipeline Overview 20
3.2 Stage I: Latency Profiling 21
3.3 Stage II: Latency-aware Architecture Search 23
3.3.1 Search Space 23
3.3.2 Search Algorithm 25
3.3.3 Objective Function 28
3.4 Stage III: Retraining with Compression and Knowledge Distillation 30
3.4.1 Attention-based Unbalanced Autoencoder 30
3.4.2 Quantization and Entropy Coding 31
3.4.3 Knowledge Distillation 32
4 Experiments 37
4.1 Datasets 37
4.2 Implementation Details 38
4.3 Performance Evaluation 39
4.3.1 Evaluation Metrics 39
4.3.2 Baselines 40
4.3.3 Effectiveness of Proposed Method 41
4.3.4 Performance Under Different Latency Loss Weights 45
4.4 Performance Under Different Transmission Speeds 52
4.5 Ablation Study 52
4.5.1 Effectiveness of Reconstruction and KD Losses 54
4.5.2 Compression Effectiveness of Each Component 54
5 Conclusion 57
Reference 59 | - |
| dc.language.iso | en | - |
| dc.subject | 神經架構搜尋 | zh_TW |
| dc.subject | 邊雲協作 | zh_TW |
| dc.subject | 多任務學習 | zh_TW |
| dc.subject | 延遲感知 | zh_TW |
| dc.subject | 知識蒸餾 | zh_TW |
| dc.subject | edge-cloud collaboration | en |
| dc.subject | multi-task learning | en |
| dc.subject | Neural Architecture Search | en |
| dc.subject | latency-aware loss | en |
| dc.title | 用於多任務邊緣-雲端部署的架構與切割點聯合搜尋方法 | zh_TW |
| dc.title | SplitNAS: Joint Architecture and Partition Search for Multi-Task Edge-Cloud Deployment | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | Master | - |
| dc.contributor.oralexamcommittee | 陳駿丞;陳永耀;盧奕璋 | zh_TW |
| dc.contributor.oralexamcommittee | Jun-Cheng Chen;Yung-Yao Chen;Yi-Chang Lu | en |
| dc.subject.keyword | 神經架構搜尋,邊雲協作,多任務學習,延遲感知,知識蒸餾 | zh_TW |
| dc.subject.keyword | Neural Architecture Search, edge-cloud collaboration, multi-task learning, latency-aware loss, knowledge distillation | en |
| dc.relation.page | 65 | - |
| dc.identifier.doi | 10.6342/NTU202502858 | - |
| dc.rights.note | Not authorized | - |
| dc.date.accepted | 2025-08-04 | - |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
| dc.contributor.author-dept | Graduate Institute of Electronics Engineering | - |
| dc.date.embargo-lift | N/A | - |
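The abstract describes a latency-aware loss that weighs edge compute, cloud compute, and feature-transmission costs against task accuracy when choosing a partition point. The following is a minimal, hypothetical sketch of that idea only; the profiled numbers (`EDGE_MS`, `CLOUD_MS`, `FEAT_KB`), the function names, and the weighting constant `lam` are all assumptions for illustration, not the thesis's actual implementation.

```python
# Hypothetical Stage-I profiling results for a 4-layer backbone.
EDGE_MS  = [5.0, 6.0, 8.0, 12.0]     # per-layer latency on a slow edge device (ms)
CLOUD_MS = [0.5, 0.6, 0.8, 1.2]      # the same layers on a fast cloud server (ms)
FEAT_KB  = [600, 512, 256, 128, 64]  # FEAT_KB[s]: data sent when cutting before layer s
                                     # (FEAT_KB[0] is the raw input; unused when fully on edge)

def total_latency_ms(split, bandwidth_kb_per_ms):
    """End-to-end latency when layers [0, split) run on the edge and
    layers [split, n) run in the cloud, with the intermediate feature
    map at the cut point sent over the network."""
    n = len(EDGE_MS)
    edge = sum(EDGE_MS[:split])
    cloud = sum(CLOUD_MS[split:])
    tx = 0.0 if split == n else FEAT_KB[split] / bandwidth_kb_per_ms
    return edge + tx + cloud

def latency_aware_loss(task_loss, split, bandwidth_kb_per_ms, lam=0.01):
    """Task loss plus a weighted latency penalty, in the spirit of the
    latency-aware objective the abstract describes."""
    return task_loss + lam * total_latency_ms(split, bandwidth_kb_per_ms)

# The best cut depends on bandwidth: a slow link (1 KB/ms) favors keeping
# everything on the edge, while a fast link (100 KB/ms) favors the cloud.
best_slow = min(range(len(EDGE_MS) + 1),
                key=lambda s: total_latency_ms(s, bandwidth_kb_per_ms=1.0))    # → 4
best_fast = min(range(len(EDGE_MS) + 1),
                key=lambda s: total_latency_ms(s, bandwidth_kb_per_ms=100.0))  # → 0
```

This toy search enumerates cut points; the thesis instead relaxes the choice and optimizes it jointly with the branching structure via differentiable NAS, and further shrinks the transmitted features with an autoencoder, quantization, and entropy coding.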
| Appears in Collections: | Graduate Institute of Electronics Engineering |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-2.pdf (restricted access) | 4.04 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
