Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70852

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 李允中 | |
| dc.contributor.author | Che-An Lee | en |
| dc.contributor.author | 李哲安 | zh_TW |
| dc.date.accessioned | 2021-06-17T04:41:01Z | - |
| dc.date.available | 2019-08-08 | |
| dc.date.copyright | 2018-08-08 | |
| dc.date.issued | 2018 | |
| dc.date.submitted | 2018-08-06 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70852 | - |
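| dc.description.abstract | In text-based web service matchmaking approaches, web services are represented as plain-text descriptions, and text is used both to represent services and to match them, so the accuracy of text matching has a large effect on matchmaking performance. In this research we improve matchmaking accuracy through the following four steps: 1. extract keywords from web service description documents and convert them into vector representations with a pre-trained word vector model; 2. extract word relations from reference data; 3. use the word relations for vector combination to remedy shortcomings of the pre-trained word vectors; 4. compute the cosine similarity of the keyword word vectors to obtain the similarity between web services. In the experiments we use the web service matchmaking benchmark OWLS-TC V4 to evaluate the proposed method and use hypothesis testing to compare it with the existing service matchmaker iSeM; in the comparison our method (MAP=0.9242) outperforms iSeM (MAP=0.8529). (Translated from the Chinese abstract.) | zh_TW |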
| dc.description.abstract | In text-based service matchmaking approaches, a web service is treated as plain text and term tokens serve as the internal representation used to match services, so the accuracy of text comparison directly affects matchmaking performance. In this research, we improve the performance of service matchmaking through the following four steps: 1. extract keywords from WSDL and convert them into vector representations with a pre-trained word vector model; 2. extract word relations from reference data; 3. use the word relations for vector combination to improve the quality of the pre-trained word vectors; and 4. calculate the cosine similarity between keyword word vectors to obtain the similarity of two web services. An experiment on the OWLS-TC V4 service matchmaking benchmark, with hypothesis testing, compares the proposed approach with the iSeM approach. The results show that our approach (MAP=0.9242) outperforms iSeM (MAP=0.8529). | en |
| dc.description.provenance | Made available in DSpace on 2021-06-17T04:41:01Z (GMT). No. of bitstreams: 1 ntu-107-R05922096-1.pdf: 2665381 bytes, checksum: 2a35258a791315336eb56dd4025269fa (MD5) Previous issue date: 2018 | en |
| dc.description.tableofcontents | Acknowledgements; Abstract (Chinese); Abstract; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Related Work: 2.1 Corpus, 2.2 Word Representations, 2.3 Vector Combination, 2.4 Reference Data, 2.5 Clustering, 2.6 Service Matchmaking; Chapter 3 Vector Combination: 3.1 Entity Linking, 3.2 Vector Combination; Chapter 4 Service Matchmaker: 4.1 Keyword Extractor, 4.2 Vector Combiner, 4.3 Similarity Calculator; Chapter 5 Experiments: 5.1 Evaluation Benchmark, 5.2 Word Representation and Relational Information, 5.3 Experiment Results, 5.4 Performance Analysis, 5.5 Discussion; Chapter 6 Conclusion; Bibliography; A Service Matchmaking Example. List of Figures: 2.1 Dependency Graph, 2.2 Word2Vec Model, 2.3 LDA Model, 2.4 WordNet Structure, 3.1 Entity Linking Concept, 3.2 Entity Linker Module, 3.3 Clustering Process, 3.4 Wikipedia Information, 3.5 Vector Combiner Module, 3.6 Combiner Pipeline, 4.1 Service Matchmaker, 4.2 Keyword Extractor, 4.3 Keyword Extraction Example, 4.4 Vector Combiner, 4.5 Vector Combination Example, 4.6 Similarity Calculator, 4.7 Similarity Calculation Example, 5.1 Top-K Precision, 5.2 Top-K Recall, 5.3 R-Precision, 5.4 Average Precision, A.1 Request WSDL, A.2 Candidate WSDL, A.3 Convert Word Vectors, A.4 Calculate Service Similarity. List of Tables: 2.1 Compare Different Vector Combination, 5.1 Single Relation Results 1, 5.2 Single Relation Results 2, 5.3 Combine Relations Results, 5.4 Normality Test, 5.5 Friedman Test | |
| dc.language.iso | en | |
| dc.subject | 網路服務 (Web Service) | zh_TW |
| dc.subject | 服務比對 (Service Matchmaking) | zh_TW |
| dc.subject | 文字關係 (Word Relation) | zh_TW |
| dc.subject | 向量結合 (Vector Combination) | zh_TW |
| dc.subject | 文字向量 (Word Vector) | zh_TW |
| dc.subject | Vector Combination | en |
| dc.subject | Web Service | en |
| dc.subject | Word Vector | en |
| dc.subject | Word Relation | en |
| dc.subject | Service Matchmaking | en |
| dc.title | 利用向量組合方法改善網路服務匹配 (literally: Improving Web Service Matchmaking with a Vector Combination Method) | zh_TW |
| dc.title | Web Services Matchmaking with Vectors Combination | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 106-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 施吉昇,蘇木春,鄭有進,馬尚彬 | |
| dc.subject.keyword | 網路服務 (Web Service), 服務比對 (Service Matchmaking), 文字關係 (Word Relation), 文字向量 (Word Vector), 向量結合 (Vector Combination) | zh_TW |
| dc.subject.keyword | Web Service, Service Matchmaking, Word Relation, Word Vector, Vector Combination | en |
| dc.relation.page | 54 | |
| dc.identifier.doi | 10.6342/NTU201802568 | |
| dc.rights.note | Paid license (有償授權) | |
| dc.date.accepted | 2018-08-06 | |
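| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW |
| Appears in collections: | Department of Computer Science and Information Engineering | |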
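
The four steps listed in the abstract can be sketched roughly as follows. This is only an illustration under several assumptions that the record does not confirm: the word vectors and relations are made-up toy values, the `combine` step is a plain weighted average standing in for the thesis's vector combination, and a service is summarised by averaging its keyword vectors. All function and variable names are hypothetical.

```python
import math

# Toy 3-dimensional "pre-trained" word vectors (made-up values, illustration only).
PRETRAINED = {
    "book":    [0.9, 0.1, 0.0],
    "novel":   [0.7, 0.3, 0.1],
    "price":   [0.1, 0.9, 0.2],
    "cost":    [0.2, 0.8, 0.3],
    "weather": [0.0, 0.1, 0.9],
}

# Hypothetical word relations, as if extracted from some reference data.
RELATIONS = {"book": ["novel"], "price": ["cost"]}


def combine(word, alpha=0.5):
    """Blend a word's vector with the mean vector of its related words.

    Assumption: a simple weighted average, not the thesis's actual method.
    """
    base = PRETRAINED[word]
    related = [PRETRAINED[r] for r in RELATIONS.get(word, []) if r in PRETRAINED]
    if not related:
        return list(base)
    mean = [sum(col) / len(related) for col in zip(*related)]
    return [alpha * b + (1 - alpha) * m for b, m in zip(base, mean)]


def service_vector(keywords):
    """Average the combined vectors of the known keywords (an assumption)."""
    vectors = [combine(k) for k in keywords if k in PRETRAINED]
    if not vectors:
        raise ValueError("no keyword has a word vector")
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def service_similarity(request_keywords, candidate_keywords):
    """Similarity of two services described by their extracted keywords."""
    return cosine_similarity(
        service_vector(request_keywords), service_vector(candidate_keywords)
    )


if __name__ == "__main__":
    request = ["book", "price"]
    print(service_similarity(request, ["novel", "cost"]))
    print(service_similarity(request, ["weather"]))
```

With these toy values, the request `["book", "price"]` scores higher against `["novel", "cost"]` than against `["weather"]`, which is the ranking behaviour a matchmaker of this kind relies on.
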
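Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-107-1.pdf (not authorized for public access) | 2.6 MB | Adobe PDF | |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.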
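
The abstract reports results as MAP (mean average precision). As a reading aid, the sketch below shows the textbook definition of that metric on invented rankings. The exact variant used with OWLS-TC V4 in the thesis is not stated in this record, so treat the normalisation (dividing by the number of relevant services) as an assumption; the function names and sample data are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average of precision@k taken at each rank k where a relevant item appears.

    Normalised by the total number of relevant items (textbook definition).
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(queries):
    """Mean of average_precision over (ranked_list, relevant_set) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


if __name__ == "__main__":
    # Invented rankings: each request returns services ordered by similarity.
    queries = [
        (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
        (["x", "y"], {"y"}),                 # AP = (1/2) / 1
    ]
    print(mean_average_precision(queries))
```

A matchmaker that ranks every relevant service ahead of every irrelevant one scores 1.0 for that request, so a higher MAP means relevant services tend to appear earlier in the result lists.
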
