深度學習於視覺之異常偵測 (Deep Learning for Visual Anomaly Detection)

吳志強; Jhih-Ciang Wu

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87952

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	劉庭祿	zh_TW
dc.contributor.advisor	Tyng-Luh Liu	en
dc.contributor.author	吳志強	zh_TW
dc.contributor.author	Jhih-Ciang Wu	en
dc.date.accessioned	2023-07-31T16:30:32Z	-
dc.date.available	2023-11-09	-
dc.date.copyright	2023-07-31	-
dc.date.issued	2023	-
dc.date.submitted	2023-07-02	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87952	-
dc.description.abstract	異常檢測主要目的在於發掘空間中或時序上等各個領域的不規則模式，而異常之基本定義主要來自於與正常之間的差異部分。該任務前景廣闊，可廣泛應用於工業領域，近年來也吸引到電腦視覺領域的關注。因此，開發一種具備高準確度的自動驗證瑕疵或異常事件的演算法尤其重要。本論文包括四個部分：模擬單一類異常檢測、真實世界圖像異常檢測、影片異常檢測和醫學疾病檢測，其中每個部分著重在各式各樣的異常檢測並且所包含的開發方法是獨立的。每個部分都已被接受為期刊或會議論文。在論文的第一部分，我們解決了模擬單一類異常檢測的問題，它在訓練階段只給出了來自常見圖像分類資料集的特定一類。該任務之目的是在測試階段判斷輸入樣本是否屬於該類。根據問題設定的特色，我們將模擬單一類異常檢測歸類為類級分類。當前大多數的方法都有局限性，其中類別的標準僅依賴於重建誤差項。我們通過使用自動編碼器模型提出正規化項來打破這一限制。我們在訓練期間將正則化項與新穎性評分模塊配對，以確定給定圖像和給定一類之間的差異，從而提高我們模型的效率。在對模擬單一類異常檢測進行調查之後，我們尋求實際適用性並處理真實世界圖像異常檢測。與模擬單一類異常檢測不同，真實世界圖像異常檢測旨在檢測測試圖像是否包含缺陷。該檢測進一步分為兩類：異常分類和異常定位，其中前者屬於圖像級分類，後者則是像素級分類。我們提出了一種無監督的通用模型，稱為Metaformer，它利用元學習模型參數來實現高模型適應能力和實例感知變換器來強調定位異常區域的焦點區域，意即探索我們感興趣區域的重建落差。我們解決了真實世界圖像異常檢測問題當中基於重建的方法所常見的兩個關鍵問題：模型適應性和重建落差。前者概括了單一異常檢測模型用以處理廣泛的類別，而後者為定位異常區域提供了有用的線索。在第三部分中，我們解決了一個更具挑戰性的任務：影片異常檢測。該任務旨在捕獲異常事件並將它們定位在時間序列中。在此任務中，時間上的異常通常比空間上異常更為嚴重。我們為基於學習的模型引入了自監督式稀疏表示。我們提出了en-Normal和de-Normal模塊做為我們的核心組件，它們相對地利用了事先學習的的字典。前者用於獲取其重建正常事件特徵，而後者用於濾除正常事件特徵。該架構可同時解決單一類和弱監督式影片異常檢測，從而展現其高靈活性。在論文的最後一部分中，我們擴展了異常檢測的一般性並探索了我們提出的影片異常檢測方法的適用性。我們處理兩項疾病檢測任務：影片病理診斷和前列腺癌檢測。儘管任務的所給定資料型態不同，但我們將它們一起視為多實例學習問題。我們引入了一種方法，可以在每個實例中分離健康和不健康的特徵成分。有了這樣的分離特徵，我們可以突出顯示患病的特徵成分，以準確推理疾病評分。	zh_TW
dc.description.abstract	Anomaly Detection (AD) aims to discover irregular patterns in various domains, such as spatial or temporal ones, where an anomaly is fundamentally defined by its difference from the normal. The task is promising, widely applicable in industry, and has recently attracted interest from the computer vision community. Accordingly, developing an algorithm that automatically verifies defects or anomalous events with reliable accuracy is crucial. This dissertation comprises four parts: simulated one-class AD, real-world image AD, video AD, and medical disease detection. Each part addresses a different kind of AD, and the approaches developed in each are independent. Each part has been accepted as a journal or conference paper. In the first part of the dissertation, we address the problem of simulated one-class AD, in which only a particular class (representing the normal class here) from classic image classification benchmarks is given during the training stage. The objective is to determine whether the input samples belong to that class in the testing phase. Given the characteristics of the problem setup, we categorize simulated one-class AD as class-level classification. Most current methods are limited in that the criterion for the novel class relies solely on the reconstruction error term. We break this restriction by proposing a normalization term with an autoencoder model. We pair the regularization with an additive novelty scoring module during training to determine the difference between a given image and the anomaly-free class, improving our model's efficiency. Following the investigation of simulated one-class AD, we seek practical applicability and deal with real-world image AD. Unlike simulated one-class AD, real-world image AD aims to detect whether the testing image contains defects. 
The detection is further divided into two categories: anomaly classification and anomaly localization, where the former is image-level classification and the latter is pixel-level classification. We propose an unsupervised universal model, termed Metaformer, which leverages both meta-learned model parameters to achieve high model adaptation capability and an instance-aware transformer to emphasize the focal regions for localizing abnormal regions, i.e., to explore the reconstruction gap at those regions of interest. We address two pivotal issues of reconstruction-based approaches to real-world image AD: model adaptation and reconstruction gap. The former generalizes a single AD model to tackle a broad range of object categories, while the latter provides helpful clues for localizing abnormal regions. In the third part, we tackle a more challenging task, video AD, which aims to capture anomalous events and localize them in the temporal sequence. In this task, temporal anomalies are generally more critical than spatial anomalies. We introduce a self-supervised sparse representation for the learning-based model. In particular, we design the en-Normal and de-Normal modules as our core components, which use the learned task-specific dictionary in opposite ways. The former is used to obtain the reconstructed normal-event feature, while the latter is applied to filter out the normal-event feature. The proposed architecture is flexible in that it handles both one-class and weakly-supervised video AD. In the last part, we extend the generality of AD and explore the applicability of our proposed video AD method. We tackle two disease detection tasks: video pathological diagnosis and prostate cancer detection. Although the data modalities of the two tasks differ, we cast both as Multiple Instance Learning (MIL) problems. We introduce an approach that decouples healthy and unhealthy feature ingredients per instance. 
With such decoupled features, we can highlight the diseased feature ingredients to infer the disease score accurately.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-07-31T16:30:32Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2023-07-31T16:30:32Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Verification Letter from the Oral Examination Committee
Chinese Abstract (摘要)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preliminary
1.2 Overview
1.2.1 Simulated One-class Anomaly Detection
1.2.2 Real-world Image Anomaly Detection
1.2.3 Video Anomaly Detection
1.2.4 Disease Detection
1.3 Organization
Chapter 2 Simulated One-Class Anomaly Detection
2.1 Related Work
2.1.1 Unsupervised Learning and Anomaly Detection
2.1.2 Kernel-based One-class Anomaly Detection
2.1.3 Deep learning-based approaches
2.2 Proposed Method
2.2.1 Motivation
2.2.2 Proposed Strategy
2.3 Experiments
2.3.1 Datasets
2.3.2 Network Architecture
2.3.3 Results
2.3.4 Ablation Study
2.3.5 Multiclass Anomaly Detection
2.4 Summary
Chapter 3 Real-world Image Anomaly Detection
3.1 Related Work
3.2 Our Method
3.2.1 Model Learning Strategy
3.2.2 Instance-aware Metaformer
3.3 Experiments
3.3.1 Anomaly Classification
3.3.2 Anomaly Localization
3.3.3 Ablation Study
3.3.4 Visualization
3.4 Summary
Chapter 4 Video Anomaly Detection
4.1 Related Work
4.1.1 Various Anomaly Detection
4.1.2 Video Feature Extractors
4.1.3 Self-Supervised Sparse Dictionary Learning
4.2 Our Method
4.2.1 Sparse Representation for oVAD
4.2.2 A Dictionary with Two Modules
4.2.3 Dictionary-based Self-Supervised Learning
4.2.4 S3R: A Unified Video AD Framework
4.3 Experiments
4.3.1 Dataset and Metric
4.3.2 Implementation Details
4.3.3 Results of oVAD
4.3.4 Results of wVAD
4.3.5 Ablation Study
4.3.6 Visualization
4.4 Summary
Chapter 5 Disease Detection
5.1 Motivation
5.2 Related Work
5.2.1 Disease Detection
5.2.2 Contrastive Learning
5.3 Our Method
5.3.1 Memory Bank Construction
5.3.2 Contrastive Feature Decoupling
5.3.3 Regularization
5.4 Experiments
5.4.1 Dataset and Metric
5.4.2 Implementation Details
5.4.3 Comparison Results
5.4.4 Ablation Study
5.5 Summary
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References	-
dc.language.iso	en	-
dc.title	深度學習於視覺之異常偵測	zh_TW
dc.title	Deep Learning for Visual Anomaly Detection	en
dc.type	Thesis	-
dc.date.schoolyear	111-2	-
dc.description.degree	Doctoral (博士)	-
dc.contributor.coadvisor	傅楸善	zh_TW
dc.contributor.coadvisor	Chiou-Shann Fuh	en
dc.contributor.oralexamcommittee	陳祝嵩;許秋婷;李明穗;林彥宇	zh_TW
dc.contributor.oralexamcommittee	Chu-Song Chen;Chiou-Ting Hsu;Ming-Sui Lee;Yen-Yu Lin	en
dc.subject.keyword	深度學習,異常偵測,非監督式學習,自監督式學習	zh_TW
dc.subject.keyword	deep learning,anomaly detection,unsupervised learning,self-supervised learning	en
dc.relation.page	108	-
dc.identifier.doi	10.6342/NTU202301248	-
dc.rights.note	Authorization granted (access restricted to campus)	-
dc.date.accepted	2023-07-04	-
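dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Department of Computer Science and Information Engineering	-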
Appears in Collections:	Department of Computer Science and Information Engineering

Files in This Item:

File	Size	Format
ntu-111-2.pdf (not currently authorized for public access)	32.28 MB	Adobe PDF


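Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.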