Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97003

Full metadata record (DC field: value [language])
dc.contributor.advisor: 莊永裕 [zh_TW]
dc.contributor.advisor: Yung-Yu Chuang [en]
dc.contributor.author: 黃秉茂 [zh_TW]
dc.contributor.author: Ping-Mao Huang [en]
dc.date.accessioned: 2025-02-25T16:26:14Z
dc.date.available: 2025-09-18
dc.date.copyright: 2025-02-25
dc.date.issued: 2025
dc.date.submitted: 2025-02-13
dc.identifier.citationS. Asgari Taghanaki, K. Abhishek, J. P. Cohen, J. Cohen-Adad, and G. Hamarneh. Deep semantic segmentation of natural and medical images: a review. Artificial Intelligence Review, 54:137–178, 2021.
V. Badrinarayanan, A. Kendall, and R. Cipolla. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE transactions on pattern analysis and machine intelligence, 39(12):2481–2495, 2017.
G. J. Brostow, J. Fauqueur, and R. Cipolla. Semantic object classes in video: A highdefinition ground truth database. Pattern recognition letters, 30(2):88–97, 2009.
Y. Cao, J. Xu, S. Lin, F. Wei, and H. Hu. Gcnet: Non-local networks meet squeezeexcitation networks and beyond. In Proceedings of the IEEE/CVF international conference on computer vision workshops, pages 0–0, 2019.
L.-C. Chen. Semantic image segmentation with deep convolutional nets and fully connected crfs. arXiv preprint arXiv:1412.7062, 2014.
L.-C. Chen. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587, 2017.
L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE transactions on pattern analysis and machine intelligence, 40(4):834–848, 2017.
L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European conference on computer vision (ECCV), pages 801–818, 2018.
Y. Chen, X. Dai, M. Liu, D. Chen, L. Yuan, and Z. Liu. Dynamic convolution: Attention over convolution kernels. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11030–11039, 2020.
F. Chollet. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1251–1258, 2017.
M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3213–3223, 2016.
J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. Deformable convolutional networks. In Proceedings of the IEEE international conference on computer vision, pages 764–773, 2017.
X. Ding, X. Zhang, J. Han, and G. Ding. Scaling up your kernels to 31x31: Revisiting large kernel design in cnns. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11963–11975, 2022.
A. Dosovitskiy. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
M. Fan, S. Lai, J. Huang, X. Wei, Z. Chai, J. Luo, and X. Wei. Rethinking bisenet for real-time semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9716–9725, 2021.
D. Feng, C. Haase-Schütz, L. Rosenbaum, H. Hertlein, C. Glaeser, F. Timm, W. Wiesbeck, and K. Dietmayer. Deep multi-modal object detection and semantic segmentation for autonomous driving: Datasets, methods, and challenges. IEEE Transactions on Intelligent Transportation Systems, 22(3):1341–1360, 2020.
M. Gamal, M. Siam, and M. Abdel-Razek. Shuffleseg: Real-time semantic segmentation network. arXiv preprint arXiv:1803.03816, 2018.
M.-H. Guo, C.-Z. Lu, Q. Hou, Z. Liu, M.-M. Cheng, and S.-M. Hu. Segnext: Rethinking convolutional attention design for semantic segmentation. Advances in Neural Information Processing Systems, 35:1140–1156, 2022.
M.-H. Guo, C.-Z. Lu, Z.-N. Liu, M.-M. Cheng, and S.-M. Hu. Visual attention network. Computational Visual Media, 9(4):733–752, 2023.
D. Haase and M. Amthor. Rethinking depthwise separable convolutions: How intrakernel correlations lead to improved mobilenets. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 14600–14609, 2020.
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
D. Hendrycks and K. Gimpel. Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415, 2016.
Y. Hong, H. Pan, W. Sun, and Y. Jia. Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes. arXiv preprint arXiv:2101.06085, 2021.
Q. Hou, C.-Z. Lu, M.-M. Cheng, and J. Feng. Conv2former: A simple transformerstyle convnet for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.
Y. Hou, Z. Ma, C. Liu, and C. C. Loy. Learning lightweight lane detection cnns by self attention distillation. In Proceedings of the IEEE/CVF international conference on computer vision, pages 1013–1021, 2019.
A. G. Howard. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.
H. Hu, Z. Zhang, Z. Xie, and S. Lin. Local relation networks for image recognition. In Proceedings of the IEEE/CVF international conference on computer vision, pages 3464–3473, 2019.
J. Hu, L. Shen, S. Albanie, G. Sun, and A. Vedaldi. Gather-excite: Exploiting feature context in convolutional neural networks. Advancesin neural information processing systems, 31, 2018.
J. Hu, L. Shen, and G. Sun. Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7132–7141, 2018.
A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25, 2012.
S. Kumaar, Y. Lyu, F. Nex, and M. Y. Yang. Cabinet: Efficient context aggregation network for low-latency semantic segmentation. In 2021 IEEE International Conference on Robotics and Automation (ICRA), pages 13517–13524. IEEE, 2021.
K. W. Lau, L.-M. Po, and Y. A. U. Rehman. Large separable kernel attention: Rethinking the large kernel attention design in cnn. Expert Systems with Applications, 236:121352, 2024.
H. Li, P. Xiong, H. Fan, and J. Sun. Dfanet: Deep feature aggregation for real-time semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9522–9531, 2019.
X. Li, W. Wang, X. Hu, and J. Yang. Selective kernel networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 510–519, 2019.
X. Li, A. You, Z. Zhu, H. Zhao, M. Yang, K. Yang, S. Tan, and Y. Tong. Semantic flow for fast and accurate scene parsing. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I 16, pages 775–793. Springer, 2020.
X. Li, Y. Zhou, Z. Pan, and J. Feng. Partial order pruning: for best speed/accuracy trade-off in neural architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9145–9153, 2019.
Y. Li, Q. Hou, Z. Zheng, M.-M. Cheng, J. Yang, and X. Li. Large selective kernel network for remote sensing object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 16794–16805, 2023.
A. Lin, B. Chen, J. Xu, Z. Zhang, G. Lu, and D. Zhang. Ds-transunet: Dual swin transformer u-net for medical image segmentation. IEEE Transactions on Instrumentation and Measurement, 71:1–15, 2022.
J.-J. Liu, Q. Hou, M.-M. Cheng, C. Wang, and J. Feng. Improving convolutional networks with self-calibrated convolutions. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10096–10105, 2020.
S. Liu, T. Chen, X. Chen, X. Chen, Q. Xiao, B. Wu, T. Kärkkäinen, M. Pechenizkiy, D. Mocanu, and Z. Wang. More convnets in the 2020s: Scaling up kernels beyond 51x51 using sparsity. arXiv preprint arXiv:2207.03620, 2022.
Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021.
Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie. A convnet for the 2020s. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11976–11986, 2022.
J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3431–3440, 2015.
Y. Nirkin, L. Wolf, and T. Hassner. Hyperseg: Patch-wise hypernetwork for real-time semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4061–4070, 2021.
M. Orsic, I. Kreso, P. Bevandic, and S. Segvic. In defense of pre-trained imagenet architectures for real-time semantic segmentation of road-driving images. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 12607–12616, 2019.
K. O’Shea. An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458, 2015.
J. Park. Bam: Bottleneck attention module. arXiv preprint arXiv:1807.06514, 2018.
A. Paszke, A. Chaurasia, S. Kim, and E. Culurciello. Enet: A deep neural network architecture for real-time semantic segmentation. arXiv preprint arXiv:1606.02147, 2016.
J. Peng, Y. Liu, S. Tang, Y. Hao, L. Chu, G. Chen, Z. Wu, Z. Chen, Z. Yu, Y. Du, et al. Pp-liteseg: A superior real-time semantic segmentation model. arXiv preprint arXiv:2204.02681, 2022.
E. Romera, J. M. Alvarez, L. M. Bergasa, and R. Arroyo. Erfnet: Efficient residual factorized convnet for real-time semantic segmentation. IEEE Transactions on Intelligent Transportation Systems, 19(1):263–272, 2017.
O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmentation. In Medical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, pages 234–241. Springer, 2015.
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. Imagenet large scale visual recognition challenge. International journal of computer vision, 115:211–252, 2015.
M. Saha and C. Chakraborty. Her2net: A deep framework for semantic segmentation and classification of cell membranes and nuclei in breast cancer evaluation. IEEE Transactions on Image Processing, 27(5):2189–2200, 2018.
M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4510–4520, 2018.
A. Shrivastava, A. Gupta, and R. Girshick. Training region-based object detectors with online hard example mining. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 761–769, 2016.
T.-H. Tsai and Y.-W. Tseng. Bisenet v3: Bilateral segmentation network with coordinate attention for real-time semantic segmentation. Neurocomputing, 532:33–42, 2023.
Q. Wan, Z. Huang, J. Lu, Y. Gang, and L. Zhang. Seaformer: Squeeze-enhanced axial transformer for mobile semantic segmentation. In The eleventh international conference on learning representations, 2023.
J. Wang, C. Gou, Q. Wu, H. Feng, J. Han, E. Ding, and J. Wang. Rtformer: Efficient design for real-time semantic segmentation with transformer. Advances in Neural Information Processing Systems, 35:7423–7436, 2022.
J. Wang, K. Sun, T. Cheng, B. Jiang, C. Deng, Y. Zhao, D. Liu, Y. Mu, M. Tan, X. Wang, et al. Deep high-resolution representation learning for visual recognition. IEEE transactions on pattern analysis and machine intelligence, 43(10):3349–3364, 2020.
W. Wang, E. Xie, X. Li, D.-P. Fan, K. Song, D. Liang, T. Lu, P. Luo, and L. Shao. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. In Proceedings of the IEEE/CVF international conference on computer vision, pages 568–578, 2021.
Z. Wang, X. Lin, N. Wu, L. Yu, K.-T. Cheng, and Z. Yan. Dtmformer: Dynamic token merging for boosting transformer-based medical image segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 5814–5822, 2024.
S. Woo, S. Debnath, R. Hu, X. Chen, Z. Liu, I. S. Kweon, and S. Xie. Convnext v2: Co-designing and scaling convnets with masked autoencoders. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16133–16142, 2023.
S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon. Cbam: Convolutional block attention module. In Proceedings of the European conference on computer vision (ECCV), pages 3–19, 2018.
H. Wu, B. Xiao, N. Codella, M. Liu, X. Dai, L. Yuan, and L. Zhang. Cvt: Introducing convolutions to vision transformers. In Proceedings of the IEEE/CVF international conference on computer vision, pages 22–31, 2021.
E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, and P. Luo. Segformer: Simple and efficient design for semantic segmentation with transformers. Advances in neural information processing systems, 34:12077–12090, 2021.
S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1492–1500, 2017.
J. Xu, Z. Xiong, and S. P. Bhattacharyya. Pidnet: A real-time semantic segmentation network inspired by pid controllers. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19529–19539, 2023.
B. Yang, G. Bender, Q. V. Le, and J. Ngiam. Condconv: Conditionally parameterized convolutions for efficient inference. Advances in neural information processing systems, 32, 2019.
M. Yang, K. Yu, C. Zhang, Z. Li, and K. Yang. Denseaspp for semantic segmentation in street scenes. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3684–3692, 2018.
C. Yu, C. Gao, J. Wang, G. Yu, C. Shen, and N. Sang. Bisenet v2: Bilateral network with guided aggregation for real-time semantic segmentation. International journal of computer vision, 129:3051–3068, 2021.
C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, and N. Sang. Bisenet: Bilateral segmentation network for real-time semantic segmentation. In Proceedings of the European conference on computer vision (ECCV), pages 325–341, 2018.
H. Zhang, C. Wu, Z. Zhang, Y. Zhu, H. Lin, Z. Zhang, Y. Sun, T. He, J. Mueller, R. Manmatha, et al. Resnest: Split-attention networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2736–2746, 2022.
W. Zhang, Z. Huang, G. Luo, T. Chen, X. Wang, W. Liu, G. Yu, and C. Shen. Topformer: Token pyramid transformer for mobile semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12083–12093, 2022.
X. Zhang, X. Zhou, M. Lin, and J. Sun. Shufflenet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 6848–6856, 2018.
H. Zhao, X. Qi, X. Shen, J. Shi, and J. Jia. Icnet for real-time semantic segmentation on high-resolution images. In Proceedings of the European conference on computer vision (ECCV), pages 405–420, 2018.
H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. Pyramid scene parsing network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2881–2890, 2017.
S. Zheng, J. Lu, H. Zhao, X. Zhu, Z. Luo, Y. Wang, Y. Fu, J. Feng, T. Xiang, P. H. Torr, et al. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6881–6890, 2021.
X. Zhu, H. Hu, S. Lin, and J. Dai. Deformable convnets v2: More deformable, better results. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9308–9316, 2019.
-
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97003
dc.description.abstract: The development of real-time semantic segmentation faces the challenge of designing efficient convolutional neural networks (CNNs) or reducing the computational cost of Vision Transformers (ViTs). Although ViTs benefit from long-range dependencies, their inference speed is limited. Large-kernel CNNs offer comparable receptive fields but struggle to adapt to multi-scale features and to integrate global information. To address these problems, we introduce the Large Kernel Attention (LKA) mechanism and propose the Bilateral Efficient Visual Attention Network (BEVAN). The Efficient Visual Attention (EVA) module uses Sparse Decomposed Large Separable Kernel Attentions (SDLSKA), combining regional and strip convolutions with multiple topologies to expand the receptive field and capture multi-scale global concepts as well as visual and structural features. The Comprehensive Kernel Selection (CKS) module dynamically adjusts the receptive field to further improve performance. The Deep Large Kernel Pyramid Pooling Module (DLKPPM) combines dilated convolution with large kernel attention to enrich contextual features. The bilateral architecture promotes frequent information exchange between branches, and the Boundary Guided Attention Fusion (BGAF) module adaptively fuses low-level spatial and high-level semantic features under boundary guidance, strengthening the recognition of blurred boundaries. Our model reaches 79.3% mIoU without pretraining, demonstrating low dependence on large pretraining datasets; after pretraining on ImageNet, mIoU rises to 81.0% while maintaining a real-time 32 FPS, setting a new standard for semantic segmentation. [zh_TW]
dc.description.abstract: The development of real-time semantic segmentation faces significant challenges in designing efficient convolutional neural network (CNN) architectures or minimizing the computational cost of vision transformers (ViTs) while maintaining real-time performance. Although ViTs excel at capturing long-range dependencies, their computational speed is often a bottleneck. Large-kernel CNNs offer similar receptive fields but struggle with multi-scale feature adaptation and global context integration. To overcome these limitations, we introduce the Large Kernel Attention mechanism. Our proposed Bilateral Efficient Visual Attention Network (BEVAN) integrates the Efficient Visual Attention (EVA) module, the Deep Large Kernel Pyramid Pooling Module (DLKPPM), and the Boundary Guided Attention Fusion (BGAF) module. The EVA module expands the receptive field to capture multi-scale contextual information and extracts visual and structural features using Sparse Decomposed Large Separable Kernel Attentions (SDLSKA), which combine regional and strip convolutions with diverse topological structures. The Comprehensive Kernel Selection (CKS) mechanism dynamically adapts the receptive field to further enhance performance. The DLKPPM enriches contextual features and extends the receptive field through a combination of dilated convolution and large kernel attention, balancing speed and accuracy by refining features and improving the capture of semantic concepts. The bilateral architecture facilitates frequent communication between branches, and the BGAF module uses boundary information to adaptively merge low-level spatial features with high-level semantic features, enhancing the network's ability to delineate blurred boundaries while retaining detailed contours and semantic context. Our model achieves 79.3% mIoU without pretraining, indicating a low dependency on extensive pretraining datasets. After pretraining on ImageNet, the model attains 81.0% mIoU, setting a new state of the art while maintaining real-time efficiency at 32 FPS. [en]
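As a back-of-the-envelope illustration of why large-kernel attention designs like those in the abstract are cheap, the sketch below compares parameter counts for a dense K x K convolution against the depthwise + dilated-depthwise + pointwise split used in LKA-style decompositions (following the Visual Attention Network formulation). The values K=21, dilation d=3, and C=64 channels are illustrative assumptions, not BEVANet's actual SDLSKA configuration.

```python
# Parameter-count sketch for LKA-style large-kernel decomposition.
# Assumed split: a K x K conv over C channels is approximated by
#   a (2d-1) x (2d-1) depthwise conv,
#   a ceil(K/d) x ceil(K/d) depthwise conv with dilation d,
#   and a 1 x 1 pointwise conv for channel mixing.
import math

def dense_params(c: int, k: int) -> int:
    """Parameters of a standard k x k conv with c input and c output channels."""
    return c * c * k * k

def lka_params(c: int, k: int, d: int) -> int:
    """Parameters of the decomposed depthwise + dilated-depthwise + pointwise form."""
    local = c * (2 * d - 1) ** 2         # depthwise local conv
    dilated = c * math.ceil(k / d) ** 2  # depthwise dilated conv
    pointwise = c * c                    # 1x1 channel mixing
    return local + dilated + pointwise

c, k, d = 64, 21, 3
print(dense_params(c, k))   # 1806336
print(lka_params(c, k, d))  # 8832
```

With these illustrative numbers, the decomposition needs roughly 200x fewer parameters than a dense 21 x 21 convolution while covering a similar receptive field, which is the efficiency argument behind large-kernel attention backbones.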
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-25T16:26:14Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2025-02-25T16:26:14Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee i
Acknowledgements iii
摘要 v
Abstract vii
Contents ix
List of Figures xiii
List of Tables xv
Denotation xvii
Chapter 1 Introduction 1
Chapter 2 Related Work 7
2.1 Generic Semantic Segmentation 7
2.2 Real-time Semantic Segmentation 8
2.3 Large Kernel Attention 10
2.4 Feature Fusion 12
2.5 Pyramid Pooling Module 13
Chapter 3 Methodology 15
3.1 Bilateral Architecture 15
3.2 Efficient Visual Attention Block 16
3.2.1 Sparse Decomposed Large Separable Kernel Attentions 18
3.2.2 Comprehensive Kernel Selection 19
3.3 Deep Large Kernel Pyramid Pooling Module 21
3.4 Boundary Guided Adaptive Fusion 23
Chapter 4 Experiments 25
4.1 Dataset 25
4.1.1 Cityscapes 25
4.1.2 CamVid 25
4.2 Experiment Settings 26
4.2.1 Pretraining 26
4.2.2 Training 26
4.2.3 Measurement 27
4.3 Comparison 27
4.3.1 Comparison without pretraining 27
4.3.2 Overall Comparison 28
4.4 Ablation Study 29
4.4.1 Architecture Efficiency 29
4.4.2 Large Kernel Attention 30
4.4.3 Selection Kernel 31
4.4.4 Branch Fusion 32
4.4.5 Multi-scale Fusion 32
4.4.6 Overall without pretraining 33
4.5 Visualization 34
4.5.1 Small Object 34
4.5.2 Completeness 36
Chapter 5 Conclusion 37
References 39
dc.language.iso: en
dc.subject: 自適應特徵融合 [zh_TW]
dc.subject: 大核注意力 [zh_TW]
dc.subject: 即時語義分割 [zh_TW]
dc.subject: 電腦視覺 [zh_TW]
dc.subject: Adaptive Feature Fusion [en]
dc.subject: Computer Vision [en]
dc.subject: Real-time Semantic Segmentation [en]
dc.subject: Large Kernel Attention [en]
dc.title: BEVANet: 雙分支高效視覺注意力網路於即時語義分割 [zh_TW]
dc.title: BEVANet: Bilateral Efficient Visual Attention Network for Real-time Semantic Segmentation [en]
dc.type: Thesis
dc.date.schoolyear: 113-1
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 葉正聖; 吳賦哲 [zh_TW]
dc.contributor.oralexamcommittee: Jeng-Sheng Yeh; Fu-Che Wu [en]
dc.subject.keyword: 電腦視覺, 即時語義分割, 大核注意力, 自適應特徵融合 [zh_TW]
dc.subject.keyword: Computer Vision, Real-time Semantic Segmentation, Large Kernel Attention, Adaptive Feature Fusion [en]
dc.relation.page: 49
dc.identifier.doi: 10.6342/NTU202500107
dc.rights.note: Authorized for open access (worldwide)
dc.date.accepted: 2025-02-14
dc.contributor.author-college: College of Electrical Engineering and Computer Science
dc.contributor.author-dept: Graduate Institute of Networking and Multimedia
dc.date.embargo-lift: 2025-09-18
Appears in collections: Graduate Institute of Networking and Multimedia

Files in this item:
ntu-113-1.pdf (1.84 MB, Adobe PDF)


Unless specific copyright terms are indicated, all items in this repository are protected by copyright, with all rights reserved.
