Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21379
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 林永松 | |
dc.contributor.author | Yu-Xuan Chen | en |
dc.contributor.author | 陳玉璇 | zh_TW |
dc.date.accessioned | 2021-06-08T03:32:29Z | - |
dc.date.copyright | 2019-08-16 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-08-09 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21379 | - |
dc.description.abstract | 隨著科技裝置的多樣化與快速發展,人們的生活中可能會接觸並同時擁有多個電子產品,包含:個人電腦、平板、行動裝置等。經常透過這些不同的設備進行網路搜索、瀏覽等行為早已成為日常現象。這樣的多裝置介入使得電子商務的發生管道更加多元化,但同時也提升了分析消費行為的複雜程度。究其原因,是因為消費行為的一系列流程不再像以往一般,由同一個終端設備所擁有,而是可能會分散於各個不同的裝置,而這些電子裝置在網路上並非實名制度,絕大多數因隱私考量,裝置在網路上擁有匿名特性,這樣的特性使同一擁有者的裝置們無法經由網路上的瀏覽行為被有效地歸屬於同一位使用者,為了實現精準行銷或其他客製化之應用,找出裝置之間的真實關係是必不可少的步驟。將同一人在數個電子設備上產生的瀏覽紀錄正確地連結,能將用戶在網絡上的所有行為資訊連結起來,從而找到完整的瀏覽歷程。本篇將以 CIKM Cup 2016 之競賽資料集作為實驗,藉由瀏覽紀錄提取之特徵作為該設備之屬性,利用潛在語義索引(Latent semantic indexing)表示式,結合監督式學習方法找出任一目標設備的候選設備集合,取代全數進行兩兩配對之法,達成計算量之減低,並透過非監督式的詞嵌入(word embedding),將文本資料轉化為詞向量,配合其他特徵轉化生成之向量,作為監督式分類法的輸入,藉此分類可找出候選集合中,任兩個設備屬同一位使用者之機率,利用上述流程建立出一個透過擷取瀏覽紀錄展現之偏好,跨裝置鏈結機制。 | zh_TW |
dc.description.abstract | With the rapid development and diversification of technology, people often use several electronic devices to connect to the Internet in daily life: personal computers, tablets, smartphones, and others. An owner may use these devices for browsing, searching, and other activities such as online purchasing. Switching between devices lets e-commerce take place across various platforms, but the complexity of consumer behavior analysis rises as the number of devices involved grows. Two factors cause the difficulty: a single purchasing process may be spread across different devices, and devices on the network are anonymous. As a result, the action records collected from devices cannot be effectively associated with their real owners. To achieve precision marketing or other customized applications, finding the real relations among devices is an indispensable step. Linking the separate browsing logs generated by the same user on several electronic devices recovers that user's complete browsing history. This research uses the dataset provided by the CIKM Cup 2016 challenge. Each device is represented by features extracted from its browsing logs. Computation cost is reduced by filtering a candidate set for each target device instead of comparing all devices in pairs; the filtering is accomplished with latent semantic indexing representations and supervised learning. Unsupervised word embedding then turns the textual data into word vectors, which, together with vectors derived from other engineered features, serve as input to a supervised classifier. The classifier outputs the probability that any two devices in a candidate set belong to the same user. Together, these steps form a cross-device linking mechanism that extracts preferences from browsing logs. | en |
dc.description.provenance | Made available in DSpace on 2021-06-08T03:32:29Z (GMT). No. of bitstreams: 1 ntu-108-R06725023-1.pdf: 2239218 bytes, checksum: 9a0cf41cf2348ed1d33a2c81555578fd (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | Acknowledgements i; Chinese Abstract ii; ABSTRACT iii; CONTENTS iv; LIST OF FIGURES vi; LIST OF TABLES vii; Chapter 1 Introduction 1; 1.1 Background 1; 1.2 Motivation 3; 1.3 Objective 6; 1.4 Paper Organization 7; Chapter 2 Literature Review 8; 2.1 Consumer Behavior Among Platforms 8; 2.2 Cross Device Matching 9; 2.3 Prediction from Literal Data 11; Chapter 3 Formulation and Solution Proposal 13; 3.1 Dataset Description 13; 3.2 Problem Description 15; 3.3 Sub Problems Description and Formulation 16; 3.3.1 Data Preprocessing 16; 3.3.2 Candidate Filtering 17; 3.3.2.1 Multi-Layered Representing 18; 3.3.2.2 Latent Semantic Indexing 19; 3.3.2.3 Combination of K-Means and KNN 22; 3.3.3 Word Embedding 24; 3.3.4 Feature Extraction 26; 3.3.5 Pairwise Classification 27; 3.4 Evaluation Indicators 28; Chapter 4 Experiments 31; 4.1 Data Preprocessing 31; 4.2 Candidate Filtering 33; 4.2.1 Multi-Layered Representing 33; 4.2.2 Latent Semantic Indexing 34; 4.2.3 Combination of K-Means and KNN 37; 4.3 Features in Classification 39; 4.4 Experimental Results 39; Chapter 5 Conclusions 40; REFERENCES 41 | |
dc.language.iso | en | |
dc.title | 利用歷史紀錄萃取瀏覽行為特徵建立跨裝置識別機制 | zh_TW |
dc.title | A Cross-Device Linking Mechanism Based on Extracting Browsing Behavioral Features from History Logs | en |
dc.type | Thesis | |
dc.date.schoolyear | 107-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 林宜隆,呂俊賢,鍾順平,溫演福 | |
dc.subject.keyword | 跨裝置,追蹤,潛在語義索引,詞嵌入,監督式學習 | zh_TW |
dc.subject.keyword | cross-device, tracking, latent semantic indexing, word embedding, supervised learning | en |
dc.relation.page | 43 | |
dc.identifier.doi | 10.6342/NTU201902936 | |
dc.rights.note | Not authorized | |
dc.date.accepted | 2019-08-12 | |
dc.contributor.author-college | 管理學院 (College of Management) | zh_TW |
dc.contributor.author-dept | 資訊管理學研究所 (Graduate Institute of Information Management) | zh_TW |
Appears in Collections: | 資訊管理學系 (Department of Information Management) |
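
The candidate-filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: a plain TF-IDF representation with top-k cosine similarity stands in for the latent semantic indexing and K-means/KNN steps, and the function names, device IDs, and URL tokens are all hypothetical. The sketch still scores every pair with a cheap similarity; the saving is that only each device's k nearest neighbours would be passed on to a more expensive pairwise classifier.

```python
import math
from collections import Counter


def tfidf_vectors(logs):
    """Map each device id to an L2-normalised sparse TF-IDF vector.

    `logs` is a dict: device id -> list of browsing tokens (hypothetical format).
    """
    n = len(logs)
    # Document frequency: number of devices whose log contains each token.
    df = Counter(tok for toks in logs.values() for tok in set(toks))
    vectors = {}
    for dev, toks in logs.items():
        tf = Counter(toks)
        # Smoothed IDF, so tokens seen on every device keep a small weight.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors[dev] = {t: w / norm for t, w in vec.items()}
    return vectors


def cosine(u, v):
    """Dot product of two normalised sparse vectors, i.e. their cosine similarity."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def candidate_pairs(logs, k=1):
    """Return the set of unordered device pairs kept after top-k filtering."""
    vectors = tfidf_vectors(logs)
    pairs = set()
    for dev, vec in vectors.items():
        scored = sorted(
            ((cosine(vec, other_vec), other)
             for other, other_vec in vectors.items() if other != dev),
            reverse=True,
        )
        for _score, other in scored[:k]:
            pairs.add(tuple(sorted((dev, other))))
    return pairs


# Toy, made-up browsing logs for four anonymous devices.
toy_logs = {
    "d1": ["news.example/sports", "shop.example/shoes", "shop.example/cart"],
    "d2": ["shop.example/shoes", "shop.example/cart", "mail.example"],
    "d3": ["video.example/cats", "video.example/dogs"],
    "d4": ["video.example/cats", "forum.example/pets", "video.example/dogs"],
}
kept = candidate_pairs(toy_logs, k=1)
```

With four devices there are six possible pairs; with `k=1` only the pairs that share browsing tokens survive, and a classifier would then be asked to score just those.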
Files in This Item:
File | Size | Format | |
---|---|---|---|
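ntu-108-1.pdf (currently not authorized for public access) | 2.19 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.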