Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98536

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 李家岩 | zh_TW |
| dc.contributor.advisor | Chia-Yen Lee | en |
| dc.contributor.author | 洪佑鑫 | zh_TW |
| dc.contributor.author | Yu-Hsin Hung | en |
| dc.date.accessioned | 2025-08-14T16:29:42Z | - |
| dc.date.available | 2025-08-15 | - |
| dc.date.copyright | 2025-08-14 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-08-04 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98536 | - |
| dc.description.abstract | 隨著機器學習模型在醫療、金融和刑事司法等高風險領域的廣泛應用,對於具有穩健性、可解釋性和可信度的模型解釋需求日益增加。然而,傳統的可解釋人工智慧方法,如局部可解釋通用模型適用解釋技術,假設決策邊界在局部範圍內呈線性,並依賴基於擾動的採樣方式,導致解釋結果的不穩定性、局部擬合度低,以及特徵貢獻評估的偏差,特別是在非線性決策邊界的情況下。為了解決這些挑戰,我們提出了一系列改進的可解釋人工智慧方法,以提升局部擬合度、穩定性及不確定性量化能力,同時促進機器遺忘技術的發展。
首先,我們提出一種新的非線性局部解釋方法,該方法結合加權多元自適應迴歸樣條與自助法聚合技術,不僅能更準確地捕捉非線性決策邊界,還能在貝氏框架內建模特徵貢獻的不確定性。我們的實驗結果顯示,該方法在局部擬合度和穩定性方面表現優越,並且在多種真實世界與模擬數據集上均展現出穩定的表現。 其次,我們提出改進版的局部可解釋方法,解決了傳統技術因為擾動樣本偏離真實數據分佈而導致的不穩定性問題。該方法透過基於近鄰與核密度建模的技術,使用內部分佈樣本進行擾動,從而提升局部擬合度和穩定性,並增強對抗攻擊的韌性。我們的實驗結果表明,這種改進方法能有效減少錯誤或偏差解釋,確保更可靠的模型可解釋性。 最後,我們將上述可解釋技術應用於機器遺忘,以實現刪除訓練後模型中錯誤數據影響之能力。本研究提出一個整合式框架,結合 反事實隱性不確定性解釋與認證刪除方法,並進一步以監督式變分自編碼器搭配預測重標註機制強化反事實生成品質。此架構可先透過不確定性感知的反事實樣本辨識訓練資料中有害樣本,並透過近似貝氏推論原則實現高效且穩健的資料移除與模型修復。更重要的是,該方法不僅能移除誘發不確定性的資料,亦能透過可信任的反事實樣本進行學習,從而同時提升模型之泛化能力與可解釋性。 通過提升局部擬合度、穩定性、對抗攻擊韌性,以及將不確定性感知技術應用於機器遺忘,我們的研究推動了更加可解釋且可靠的機器學習模型發展,促進其在關鍵應用領域的實際部署。 | zh_TW |
| dc.description.abstract | As machine learning (ML) models increasingly permeate high-stakes domains such as healthcare, finance, and criminal justice, the need for robust, interpretable, and trustworthy explanations intensifies. Traditional eXplainable Artificial Intelligence (XAI) methods, such as Local Interpretable Model-agnostic Explanations (LIME), assume local linearity and rely on perturbation-based sampling, which leads to instability, low local fidelity, and biased feature attributions, particularly near non-linear decision boundaries. To address these challenges, we propose novel XAI methods that improve local fidelity, stability, and uncertainty quantification while also facilitating machine unlearning.
First, we introduce BMB-LIME (Bootstrap aggregating Multivariate adaptive regression splines Bayesian LIME), a nonlinear local explainer leveraging weighted multivariate adaptive regression splines (MARS) with bootstrap aggregating. BMB-LIME not only captures non-linear decision boundaries more effectively than existing methods but also models the uncertainty of feature contributions within a Bayesian framework. Experimental results demonstrate its superior local fidelity and stability across diverse real-world and simulated datasets. Second, we present KDLIME (KNN-kernel Density-based perturbation LIME), an improved variant of LIME that addresses instability caused by out-of-distribution perturbations. By employing KNN-kernel density-based perturbation using in-distribution samples, KDLIME achieves higher local fidelity and stability while enhancing robustness against adversarial attacks. Our empirical evaluations confirm that KDLIME mitigates the risk of producing misleading or biased explanations, ensuring more reliable model interpretability. Lastly, we extend these advancements in XAI to the domain of machine unlearning—a critical process for eliminating adversarial or erroneous data influences from trained models. We propose a unified framework that integrates Counterfactual Latent Uncertainty Explanations (CLUE) with Certified Removal (CR), and further enhances CLUE via a supervised variational autoencoder (SVAE) equipped with predictive relabeling. In this framework, CLUE identifies harmful training instances through uncertainty-aware counterfactuals, while CR performs principled and efficient unlearning through approximate Bayesian inference. By simultaneously removing uncertainty-inducing data and learning from low-uncertainty, relabeled counterfactuals, the method improves both the model's generalization and interpretability—ultimately contributing to more transparent, robust, and trustworthy ML systems. By addressing local fidelity, stability, adversarial robustness, and unlearning within an uncertainty-aware XAI framework, our research contributes to the development of interpretable and reliable machine learning models, advancing their deployment in critical applications. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-14T16:29:42Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-14T16:29:42Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | 致謝 ... i
摘要 ... iii
Abstract ... v
Contents ... vii
List of Figures ... xi
List of Tables ... xv
Chapter 1: Introduction ... 1
1.1 Background and Motivation ... 1
1.2 Problem Description and Research Scope ... 3
1.3 Research Overview ... 7
Chapter 2: Literature Review ... 9
2.1 Evolution of XAI ... 9
2.2 Developing XAI-based ML Systems ... 12
2.3 Taxonomy ... 13
2.3.1 Why Taxonomy? ... 13
2.3.2 The State-of-the-Art Taxonomy ... 14
2.3.3 Choosing an Explainability Method ... 17
2.4 Methodologies ... 17
2.4.1 LIME ... 19
2.4.2 SHAP ... 20
2.4.3 Comparison and Discussion ... 21
2.5 LIME Variants ... 22
2.5.1 Nonlinear Explainer ... 23
2.5.2 Perturbation ... 23
2.5.3 Modeling Uncertainty ... 24
2.6 XAI with Uncertainty Quantification and Machine Unlearning ... 26
2.6.1 Uncertainty Quantification in XAI ... 26
2.6.2 Machine Unlearning in XAI ... 29
2.6.3 Uncertainty-Aware XAI with Machine Unlearning ... 31
Chapter 3: BMB-LIME: LIME with Modeling Local Nonlinearity and Uncertainty in Explainability ... 33
3.1 Introduction ... 33
3.2 Main Framework ... 36
3.2.1 LIME with Multivariate Adaptive Regression Spline ... 37
3.2.2 BM-LIME: LIME with Bootstrap Aggregating Multivariate Adaptive Regression Spline ... 40
3.2.3 BMB-LIME: BayesLIME with Bootstrap Aggregating Multivariate Adaptive Regression Spline ... 41
3.3 Evaluation and Analysis ... 45
3.3.1 Experimental Setup ... 45
3.3.2 Results ... 46
3.3.2.1 Local Accuracy Evaluation ... 46
3.3.2.2 Stability Evaluation ... 48
3.3.2.3 Credible Intervals Evaluation ... 52
3.4 Concluding Remarks ... 56
Chapter 4: KDLIME: KNN-Kernel Density-Based Perturbation for Local Interpretability ... 59
4.1 Introduction ... 59
4.2 Main Framework ... 64
4.2.1 KNN-Kernel Density Estimation ... 65
4.2.2 Sampling Perturbation via Markov Chain Monte Carlo ... 68
4.2.3 KDLIME ... 69
4.3 Experiments and Analysis ... 70
4.3.1 Experimental Setup ... 70
4.3.1.1 Evaluation with Adversarial Attacks ... 72
4.3.2 Results ... 75
4.3.2.1 Evaluation of Stability ... 75
4.3.2.2 Evaluation of Fidelity ... 76
4.4 Concluding Remarks ... 77
Chapter 5: Uncertainty-Aware XAI for Machine Unlearning ... 79
5.1 Introduction ... 79
5.2 Main Framework ... 83
5.2.1 Model Repair through Uncertainty Deletion and CLUE Relabeling ... 84
5.2.2 Uncertainty Identification via Counterfactual Latent Uncertainty Explanations ... 85
5.2.3 Certified Unlearning for Model Repair ... 92
5.3 Evaluation and Analysis ... 93
5.3.1 Experimental Setup ... 94
5.3.1.1 Dataset with Simulated Annotation and Input Noise ... 94
5.3.1.2 Experiment Details ... 95
5.3.1.3 Comparisons ... 97
5.3.2 Results ... 98
5.3.3 Discussion ... 109
5.4 Concluding Remarks ... 110
Chapter 6: Conclusions ... 113
6.1 Summary ... 113
6.2 Main Contributions ... 116
6.3 Discussion and Further Research ... 117
References ... 123 | - |
| dc.language.iso | en | - |
| dc.subject | 內部分佈擾動 | zh_TW |
| dc.subject | 反事實解釋 | zh_TW |
| dc.subject | 可解釋人工智慧 | zh_TW |
| dc.subject | 機器遺忘 | zh_TW |
| dc.subject | 局部擬合度 | zh_TW |
| dc.subject | 不確定性量化 | zh_TW |
| dc.subject | Explainable Artificial Intelligence | en |
| dc.subject | In-Distribution Perturbation | en |
| dc.subject | Machine Unlearning | en |
| dc.subject | Counterfactual Explanation | en |
| dc.subject | Local Fidelity | en |
| dc.subject | Uncertainty Quantification | en |
| dc.title | 穩定與不確定性感知的統計機器學習於局部通用模型適用可解釋人工智慧 | zh_TW |
| dc.title | Stable and Uncertainty-Aware Statistical Machine Learning for Local Model-Agnostic Explainable Artificial Intelligence | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | 博士 | - |
| dc.contributor.oralexamcommittee | 曾新穆;許嘉裕;陳建錦;盧信銘 | zh_TW |
| dc.contributor.oralexamcommittee | Vincent Shin-Mu Tseng;Chia-Yu Hsu;Chien Chin Chen;Hsin-Min Lu | en |
| dc.subject.keyword | 可解釋人工智慧,局部擬合度,不確定性量化,內部分佈擾動,機器遺忘,反事實解釋 | zh_TW |
| dc.subject.keyword | Explainable Artificial Intelligence, Local Fidelity, Uncertainty Quantification, In-Distribution Perturbation, Machine Unlearning, Counterfactual Explanation | en |
| dc.relation.page | 142 | - |
| dc.identifier.doi | 10.6342/NTU202502909 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2025-08-06 | - |
| dc.contributor.author-college | 管理學院 | - |
| dc.contributor.author-dept | 資訊管理學系 | - |
| dc.date.embargo-lift | N/A | - |
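The English abstract above credits KDLIME (Chapter 4 in the table of contents) with perturbing only with in-distribution samples obtained from KNN-kernel density estimation. The sketch below is a minimal illustration of that idea, not code from the thesis: the thesis describes MCMC-based sampling from the KNN-kernel density estimate, whereas this sketch simply fits a Gaussian KDE on an instance's k nearest training neighbors and samples from it; the function name and all parameter values (`k`, `bandwidth`, `n_samples`) are hypothetical choices for the example.

```python
# Illustrative sketch only (not the author's implementation): in-distribution
# perturbation in the spirit of KDLIME, as summarized in the abstract above.
import numpy as np
from sklearn.neighbors import KernelDensity, NearestNeighbors


def in_distribution_perturbations(X_train, x0, k=50, bandwidth=0.5,
                                  n_samples=500, seed=0):
    """Sample perturbations near x0 that stay close to the training-data manifold."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, idx = nn.kneighbors(np.asarray(x0).reshape(1, -1))
    local = X_train[idx[0]]                          # the k in-distribution neighbors of x0
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(local)
    return kde.sample(n_samples, random_state=seed)  # perturbed samples for the local surrogate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(1000, 5))             # toy training data
    Z = in_distribution_perturbations(X_train, X_train[0])
    print(Z.shape)                                   # (500, 5)
```

As in LIME, the perturbed samples `Z` would then be labeled by the black-box model and used to fit a distance-weighted local surrogate; keeping `Z` on the data manifold is what the abstract credits with improving stability and robustness to adversarial manipulation.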
Appears in collections: 資訊管理學系
Files in this item:
| File | Size | Format |
|---|---|---|
| ntu-113-2.pdf (not authorized for public access) | 14.51 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.