Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21970

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 歐陽彥正 | |
| dc.contributor.author | Lih-ching Chou | en |
| dc.contributor.author | 周立晴 | zh_TW |
| dc.date.accessioned | 2021-06-08T03:55:40Z | - |
| dc.date.copyright | 2018-08-21 | |
| dc.date.issued | 2018 | |
| dc.date.submitted | 2018-08-15 | |
| dc.identifier.citation | [1] H. Bozdogan, "Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions," Psychometrika, vol. 52, no. 3, pp. 345-370, 1987. [2] R. J. Steele and A. E. Raftery, "Performance of Bayesian model selection criteria for Gaussian mixture models," Frontiers of Statistical Decision Making and Bayesian Analysis, vol. 2, pp. 113-130, 2010. [3] D. W. Scott, Multivariate Density Estimation: Theory, Practice, and Visualization, John Wiley & Sons, 2015. [4] D. O. Loftsgaarden and C. P. Quesenberry, "A nonparametric estimate of a multivariate density function," The Annals of Mathematical Statistics, vol. 36, no. 3, pp. 1049-1051, 1965. [5] A. Bergstrom, "The estimation of nonparametric functions in a Hilbert space," Econometric Theory, vol. 1, no. 1, pp. 7-26, 1985. [6] C. J. Stone, "Consistent nonparametric regression," The Annals of Statistics, pp. 595-620, 1977. [7] B. W. Silverman, Density Estimation for Statistics and Data Analysis, Routledge, 1986. [8] E. Parzen, "On estimation of a probability density function and mode," The Annals of Mathematical Statistics, vol. 33, no. 3, pp. 1065-1076, 1962. [9] G. R. Terrell and D. W. Scott, "Variable kernel density estimation," The Annals of Statistics, pp. 1236-1265, 1992. [10] J. E. Chacón, T. Duong, and M. Wand, "Asymptotics for general multivariate kernel density derivative estimators," Statistica Sinica, pp. 807-840, 2011. [11] S.-T. Chiu, "Bandwidth selection for kernel density estimation," The Annals of Statistics, pp. 1883-1905, 1991. [12] Z. I. Botev, J. F. Grotowski, and D. P. Kroese, "Kernel density estimation via diffusion," The Annals of Statistics, vol. 38, no. 5, pp. 2916-2957, 2010. [13] M. Rudemo, "Empirical choice of histograms and kernel density estimators," Scandinavian Journal of Statistics, pp. 65-78, 1982. [14] C. R. Loader, "Bandwidth selection: classical or plug-in?," The Annals of Statistics, vol. 27, no. 2, pp. 415-438, 1999. [15] Q. Wang and B. G. Lindsay, "Improving cross-validated bandwidth selection using subsampling-extrapolation techniques," Computational Statistics & Data Analysis, vol. 89, pp. 51-71, 2015. [16] V. A. Epanechnikov, "Non-parametric estimation of a multivariate probability density," Theory of Probability & Its Applications, vol. 14, no. 1, pp. 153-158, 1969. [17] M. Jones and D. Signorini, "A comparison of higher-order bias kernel density estimators," Journal of the American Statistical Association, vol. 92, no. 439, pp. 1063-1073, 1997. [18] C. M. Van der Walt and E. Barnard, "Variable kernel density estimation in high-dimensional feature spaces," 2017. [19] J. M. Leiva-Murillo and A. Artés-Rodríguez, "Algorithms for maximum-likelihood bandwidth selection in kernel density estimators," Pattern Recognition Letters, vol. 33, no. 13, pp. 1717-1724, 2012. [20] L. Breiman, W. Meisel, and E. Purcell, "Variable kernel estimates of multivariate densities," Technometrics, vol. 19, no. 2, pp. 135-144, 1977. [21] Y.-J. Oyang, S.-C. Hwang, Y.-Y. Ou, C.-Y. Chen, and Z.-W. Chen, "Data classification with radial basis function networks based on a novel kernel density estimation algorithm," IEEE Transactions on Neural Networks, vol. 16, no. 1, pp. 225-236, 2005. [22] I. S. Abramson, "On bandwidth variation in kernel estimates - a square root law," The Annals of Statistics, pp. 1217-1223, 1982. [23] ESRI, "ArcGIS Desktop: Release 10.5," Redlands, CA: Environmental Systems Research Institute, 2011. [24] A. Okabe, T. Satoh, and K. Sugihara, "A kernel density estimation method for networks, its computational method and a GIS-based tool," International Journal of Geographical Information Science, vol. 23, no. 1, pp. 7-32, 2009. [25] D. W. Scott and S. R. Sain, "Multidimensional density estimation," Handbook of Statistics, vol. 24, pp. 229-261, 2005. [26] J. Johnson, M. Douze, and H. Jégou, "Billion-scale similarity search with GPUs," arXiv preprint arXiv:1702.08734, 2017. [27] J. Huang and C. X. Ling, "Using AUC and accuracy in evaluating learning algorithms," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 3, pp. 299-310, 2005. [28] J. Schmidhuber, "Deep learning in neural networks: An overview," Neural Networks, vol. 61, pp. 85-117, 2015. [29] C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger, "On calibration of modern neural networks," arXiv preprint arXiv:1706.04599, 2017. [30] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," pp. 2672-2680. [31] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," pp. 770-778. [32] A. Krizhevsky and G. Hinton, Learning Multiple Layers of Features from Tiny Images, Citeseer, 2009. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21970 | - |
| dc.description.abstract | 密度估計的研究發展出許多演算法,對於各樣科系的資料分析都有極大的影響力。但是密度估計對於高維度的資料卻表現不佳。這篇研究探討傳統的密度估計運用在高維度資料時遇到的問題:為什麼密度估計所得出的結果可能極低到會被雜訊影響,或極高到無法做適當的比較。這篇研究提出在高維資料中,若資料的維度不確定時,應該運用與最鄰近資料距離的負對數來做密度的估計。這篇研究也將所提出的演算法用在接近十萬維度的資料上,並且有良好的表現。 | zh_TW |
| dc.description.abstract | The study of density estimation has produced algorithms that have been used across many disciplines and have become a common fixture in data analysis. However, density estimation has not performed well on high-dimensional datasets. In this study, we discuss why traditional density estimation does not work well for high-dimensional data: it yields values that are uninterpretable, either so low that they are dominated by model or computational noise, or so high that comparing them amounts to taking the ratio of infinity over infinity. This study proposes using the negative log distance to the k nearest neighbors as the comparison metric when the dimension of the samples is not known. The resulting classifier, HDDE, was used to classify images in domains with close to 100k dimensions with reasonable results. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-08T03:55:40Z (GMT). No. of bitstreams: 1 ntu-107-D99922027-1.pdf: 2260900 bytes, checksum: 9869c97dbd38925a176c64417facdb07 (MD5) Previous issue date: 2018 | en |
| dc.description.tableofcontents | Thesis Committee Certification i; Acknowledgements ii; Chinese Abstract iii; Abstract iv; Table of Contents v; List of Figures viii; List of Tables x; Chapter 1 Density Estimation 1; 1.1 Density Estimation 1; 1.2 Evaluation Criteria 3; Chapter 2 Review of Density Estimation Techniques 5; 2.1 Histogram 5; 2.2 Naive Estimator 5; 2.3 K Nearest Neighbor Estimator 6; 2.4 Other Approaches 6; 2.5 Kernel Density Estimation 7; 2.5.1 Analysis of Performance 8; 2.5.2 Bandwidth selection 10; 2.5.3 Kernel selection 13; 2.5.4 Multivariate Kernels 14; 2.5.5 Variable bandwidth methods 15; 2.5.6 Recent approaches 19; Chapter 3 Density Estimation in High Dimensions 21; 3.1 Problems with PDF in high-dimensions 21; 3.1.1 The Diverging Density Problem 21; 3.1.2 The all-dimensions-are-significant problem 25; 3.1.3 The class-comparison problem 27; 3.2 Distance as a primary metric 29; 3.3 Finding the Dimensions of the Distribution 33; 3.3.1 Using PCA to find dimensions of the samples 33; 3.3.2 Using average distance to K nearest neighbors to find dimensions of the samples 35; 3.3.3 Discussion 37; Chapter 4 Application to Classification 39; 4.1 High Dimension Density Estimator 39; 4.1.1 KNN algorithm 40; 4.1.2 Choosing K 41; 4.2 Experiments on Neural Networks 42; 4.2.1 Convolutional Neural Network 43; 4.2.2 Dataset 43; 4.2.3 HDDE classification 44; 4.2.4 Results 44; 4.3 Discussion 47; Chapter 5 Discussion and Conclusion 49; 5.1 Importance of each dimension 49; 5.2 Distribution of the object of interest 50; 5.3 Limitations for HDDE 50; 5.4 Future work 51; Bibliography 52 | |
| dc.language.iso | en | |
| dc.title | 以最近鄰居距離作高維度空間密度估計 | zh_TW |
| dc.title | Density Estimation in High Dimensions Using Distance to K Nearest Neighbors | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 106-2 | |
| dc.description.degree | Ph.D. | |
| dc.contributor.oralexamcommittee | 韓謝忱,賴飛羆,陳倩瑜,黃乾綱 | |
| dc.subject.keyword | 核密度估計, 高維度, 分類器 | zh_TW |
| dc.subject.keyword | density estimation, high dimension, classifier | en |
| dc.relation.page | 53 | |
| dc.identifier.doi | 10.6342/NTU201803453 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2018-08-15 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
| Appears in Collections: | 資訊工程學系 | |
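The abstract above describes scoring each point by the negative log of its distance to the k nearest neighbors, then comparing those scores across classes. Below is a minimal illustrative sketch of that idea in Python; the function name, the choice of k, and the toy Gaussian data are assumptions for demonstration, not taken from the thesis itself.

```python
import numpy as np

def neg_log_knn_distance(train, query, k=5):
    """Score each query point by the negative log of its distance to the
    k-th nearest training sample (a larger score means a denser region)."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=2)
    # Distance to the k-th nearest neighbor for each query point.
    kth = np.sort(dists, axis=1)[:, k - 1]
    return -np.log(kth)

# Toy usage: classify a query point by which class yields the higher score.
rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(200, 50))  # 50-dimensional samples
class_b = rng.normal(3.0, 1.0, size=(200, 50))
query = rng.normal(0.0, 1.0, size=(1, 50))      # drawn from class A

score_a = neg_log_knn_distance(class_a, query)
score_b = neg_log_knn_distance(class_b, query)
pred = "A" if score_a[0] > score_b[0] else "B"
```

Because only the ordering of the scores matters, this sidesteps the diverging-density problem the abstract mentions: no volume normalization (which explodes or vanishes with dimension) is ever computed.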
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-107-1.pdf (Restricted access) | 2.21 MB | Adobe PDF | |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
