Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93774

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 莊裕澤 | zh_TW |
| dc.contributor.advisor | Yuh-Jzer Joung | en |
| dc.contributor.author | 邱煒薪 | zh_TW |
| dc.contributor.author | Wei-Hsin Chiu | en |
| dc.date.accessioned | 2024-08-07T17:16:04Z | - |
| dc.date.available | 2024-08-08 | - |
| dc.date.copyright | 2024-08-07 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-08-02 | - |
| dc.identifier.citation | Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D. G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Wicke, M., Yu, Y., and Zheng, X. (2016). Tensorflow: a system for large-scale machine learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, OSDI’16, page 265–283, USA. USENIX Association.
Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al. (2023). Gpt-4 technical report. arXiv preprint arXiv:2303.08774. Ainslie, J., Lee-Thorp, J., de Jong, M., Zemlyanskiy, Y., Lebron, F., and Sanghai, S. (2023). GQA: Training generalized multi-query transformer models from multi-head checkpoints. In Bouamor, H., Pino, J., and Bali, K., editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 4895–4901, Singapore. Association for Computational Linguistics. Amazon (2023a). Amazon launches generative ai to help sellers write product descriptions. Amazon (2023b). Amazon rolls out ai-powered image generation to help advertisers deliver a better ad experience for customers. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901. Bucur, D. (2020). Top influencers can be identified universally by combining classical centralities. Scientific reports, 10(1):20550. Chopra, A., Avhad, V., and Jaju, S. (2021). Influencer marketing: An exploratory study to identify antecedents of consumer behavior of millennial. Business Perspectives and Research, 9(1):77–91. Dao, T., Fu, D., Ermon, S., Rudra, A., and Ré, C. (2022). Flashattention: Fast and memory-efficient exact attention with io-awareness. Advances in Neural Information Processing Systems, 35:16344–16359. Das, D., Mellempudi, N., Mudigere, D., Kalamkar, D., Avancha, S., Banerjee, K., Sridharan, S., Vaidyanathan, K., Kaul, B., Georganas, E., et al. (2018). Mixed precision training of convolutional neural networks using integer operations. arXiv preprint arXiv:1802.00930. Dean, J., Corrado, G., Monga, R., Chen, K., Devin, M., Mao, M., Ranzato, M., Senior, A., Tucker, P., Yang, K., et al. (2012). Large scale distributed deep networks. Advances in neural information processing systems, 25. Dettmers, T., Pagnoni, A., Holtzman, A., and Zettlemoyer, L. (2024). Qlora: Efficient finetuning of quantized llms. Advances in Neural Information Processing Systems, 36. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Burstein, J., Doran, C., and Solorio, T., editors, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929. Felix, R., Rauschnabel, P. A., and Hinsch, C. (2017). Elements of strategic social media marketing: A holistic framework. Journal of business research, 70:118–126. Gan, T., Wang, S., Liu, M., Song, X., Yao, Y., and Nie, L. (2019). Seeking micro-influencers for brand promotion. In Proceedings of the 27th ACM International Conference on Multimedia, pages 1933–1941. Gartner (2023). Gartner experts answer the top generative ai questions for your enterprise. Gross, J. and Von Wangenheim, F. (2022). 
Influencer marketing on instagram: empirical research on social media engagement with sponsored posts. Journal of Interactive Advertising, 22(3):289–310. Hartmann, J., Exner, Y., and Domdey, S. (2024). The power of generative marketing: Can generative ai create superhuman visual marketing content? Hendrycks, D., Burns, C., Basart, S., Zou, A., Mazeika, M., Song, D., and Steinhardt, J. (2021). Measuring massive multitask language understanding. In International Conference on Learning Representations. Hessel, J., Holtzman, A., Forbes, M., Le Bras, R., and Choi, Y. (2021). CLIPScore: A reference-free evaluation metric for image captioning. In Moens, M.-F., Huang, X., Specia, L., and Yih, S. W.-t., editors, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7514–7528, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics. Houlsby, N., Giurgiu, A., Jastrzebski, S., Morrone, B., De Laroussilhe, Q., Gesmundo, A., Attariyan, M., and Gelly, S. (2019). Parameter-efficient transfer learning for nlp. In International conference on machine learning, pages 2790–2799. PMLR. Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., and Chen, W. (2021). Lora: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685. Hughes, C., Swaminathan, V., and Brooks, G. (2019). Driving brand engagement through online social influencers: An empirical investigation of sponsored blogging campaigns. Journal of marketing, 83(5):78–96. Iris von Arnim (2023). We imagined how our v-neck sweater theo would travel and who would wear it where and how. then the ... Instagram post. [image of Photoshoot]. Kapitan, S. and Silvera, D. H. (2016). From digital media influencers to celebrity endorsers: attributions drive endorser effectiveness. Marketing letters, 27:553–567. Kim, S., Jiang, J.-Y., Nakada, M., Han, J., and Wang, W. (2020). Multimodal post attentive profiling for influencer marketing. In Proceedings of The Web Conference 2020, pages 2878–2884. Kim, S., Jiang, J.-Y., and Wang, W. (2021). Discovering undisclosed paid partnership on social media via aspect-attentive sponsored post learning. In Proceedings of the 14th ACM international conference on web search and data mining, pages 319–327. Kitaev, N., Kaiser, L., and Levskaya, A. (2020). Reformer: The efficient transformer. In International Conference on Learning Representations. Kshetri, N., Dwivedi, Y. K., Davenport, T. H., and Panteli, N. (2023). Generative artificial intelligence in marketing: Applications, opportunities, challenges, and research agenda. Levi Strauss & Co (2023). Ls&co. partners with lalaland.ai. Li, J., Li, D., Savarese, S., and Hoi, S. (2023). Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. In International conference on machine learning, pages 19730–19742. PMLR. Lin, Y.-T. and Chen, Y.-N. (2023). LLM-eval: Unified multi-dimensional automatic evaluation for open-domain conversations with large language models. In Chen, Y.-N. and Rastogi, A., editors, Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023), pages 47–58, Toronto, Canada. Association for Computational Linguistics. Maheshwari, H., Goswami, K., Saxena, A., and Srinivasan, B. V. (2024). Social media ready caption generation for brands. arXiv preprint arXiv:2401.01637. Martínez-López, F. J., Anaya-Sánchez, R., Fernández Giordano, M., and Lopez-Lopez, D. (2020). 
Behind influencer marketing: key marketing decisions and their effects on followers'responses. Journal of Marketing Management, 36(7-8):579–607. Matz, S., Teeny, J., Vaid, S. S., Peters, H., Harari, G., and Cerf, M. (2024). The potential of generative ai for personalized persuasion at scale. Scientific Reports, 14(1):4692. Mazloom, M., Rietveld, R., Rudinac, S., Worring, M., and Van Dolen, W. (2016). Multimodal popularity prediction of brand-related social media posts. In Proceedings of the 24th ACM international conference on Multimedia, pages 197–201. Mehta, S., Sekhavat, M. H., Cao, Q., Horton, M., Jin, Y., Sun, C., Mirzadeh, I., Najibi, M., Belenko, D., Zatloukal, P., et al. (2024). Openelm: An efficient language model family with open-source training and inference framework. arXiv preprint arXiv:2404.14619. Meta, A. (2024). Introducing meta llama 3: The most capable openly available llm to date, 2024. URL https://ai. meta. com/blog/meta-llama-3/. Accessed on April, 26. Micikevicius, P., Narang, S., Alben, J., Diamos, G., Elsen, E., Garcia, D., Ginsburg, B., Houston, M., Kuchaiev, O., Venkatesh, G., and Wu, H. (2018). Mixed precision training. In International Conference on Learning Representations. Noy, S. and Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381(6654):187–192. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al. (2021). Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR. Radford, A., Narasimhan, K., Salimans, T., Sutskever, I., et al. (2018). Improving language understanding by generative pre-training. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8):9. Ramachandran, K., Suma, S., Banerjee, D., Mathew, B., Cheepurupalli, N. R., and Mohan, C. R. (2024). The effectiveness of influencer marketing in the age of ai. Journal of Informatics Education and Research, 4(2). Reimers, N. and Gurevych, I. (2019). Sentence-bert: Sentence embeddings using siamese bert-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. Rein, D., Hou, B. L., Stickland, A. C., Petty, J., Pang, R. Y., Dirani, J., Michael, J., and Bowman, S. R. (2023). Gpqa: A graduate-level google-proof q&a benchmark. arXiv preprint arXiv:2311.12022. Reisenbichler, M., Reutterer, T., Schweidel, D. A., and Dan, D. (2022). Frontiers: Supporting content marketing with natural language generation. Marketing Science, 41(3):441–452. Rillig, M. C., Ågerstrand, M., Bi, M., Gould, K. A., and Sauerland, U. (2023). Risks and benefits of large language models for the environment. Environmental Science & Technology, 57(9):3464–3466. PMID: 36821477. Segev, N., Avigdor, N., and Avigdor, E. (2018). Measuring influence on instagram: a network-oblivious approach. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pages 1009–1012. Shazeer, N. (2020). Glu variants improve transformer. arXiv preprint arXiv:2002.05202. Soni, V. (2023). Adopting generative ai in digital marketing campaigns: An empirical study of drivers and barriers. Sage Science Review of Applied Machine Learning, 6(8):1–15. Statista (2024). 
Distribution of instagram users worldwide as of april 2024, by age and gender. In cooperation with Kepios. Su, J., Ahmed, M., Lu, Y., Pan, S., Bo, W., and Liu, Y. (2024). Roformer: Enhanced transformer with rotary position embedding. Neurocomputing, 568:127063. Taori, R., Gulrajani, I., Zhang, T., Dubois, Y., Li, X., Guestrin, C., Liang, P., and Hashimoto, T. B. (2023). Alpaca: A strong, replicable instruction-following model. Stanford Center for Research on Foundation Models. https://crfm. stanford. edu/2023/03/13/alpaca. html, 3(6):7. Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., et al. (2023a). Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971. Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al. (2023b). Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288. Tsugawa, S. and Watabe, K. (2023). Identifying influential brokers on social media from social network structure. In Proceedings of the International AAAI Conference on Web and Social Media, volume 17, pages 842–853. Tuten, T. L. and Solomon, M. R. (2017). Social media marketing-tracy l. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30. Vrontis, D., Makrides, A., Christofi, M., and Thrassou, A. (2021). Social media influencer marketing: A systematic review, integrative framework and future research agenda. International Journal of Consumer Studies, 45(4):617–644. Wang, S., Gan, T., Liu, Y., Wu, J., Cheng, Y., and Nie, L. (2022). Micro-influencer recommendation by multi-perspective account representation learning. IEEE Transactions on Multimedia, 25:2749–2760. Wang, W., Bell, J. J., Dotson, J. P., and Schweidel, D. A. (2023). What’s in a name? the impact of artist names in image generative ai prompts. In What’s in a Name? The Impact of Artist Names in Image Generative AI Prompts: Wang, Wen| uBell, J. Jason| uDotson, Jeffrey P.| uSchweidel, David A. [Sl]: SSRN. Weidinger, L., Uesato, J., Rauh, M., Griffin, C., Huang, P.-S., Mellor, J., Glaese, A., Cheng, M., Balle, B., Kasirzadeh, A., Biles, C., Brown, S., Kenton, Z., Hawkins, W., Stepleton, T., Birhane, A., Hendricks, L. A., Rimell, L., Isaac, W., Haas, J., Legassick, S., Irving, G., and Gabriel, I. (2022). Taxonomy of risks posed by language models. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’22, page 214–229, New York, NY, USA. Association for Computing Machinery. Ye, G., Hudders, L., De Jans, S., and De Veirman, M. (2021). The value of influencer marketing for business: A bibliometric analysis and managerial implications. Journal of advertising, 50(2):160–178. Young, A., Chen, B., Li, C., Huang, C., Zhang, G., Zhang, G., Li, H., Zhu, J., Chen, J., Chang, J., et al. (2024). Yi: Open foundation models by 01. ai. arXiv preprint arXiv:2403.04652. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93774 | - |
| dc.description.abstract | 本論文提出了一種通過微調大型語言模型生成業配貼文內容的方法論。我們利用英語Instagram貼文之公開資料集,並採用Quantized Low-Rank Adaptation(QLoRA) 方法微調了Llama3-8b模型。我們的方法包括向大型語言模型提供各種輸入特徵,包括關鍵意見領袖意即網紅的簡介、貼文標籤、標記的用戶名和由BLIP2生成的圖像描述,以評估這些特徵對生成內容的品質影響。
此外,我們與iKala公司合作,收集了一個繁體中文的Instagram業配文資料集,該資料集雖僅有基於文本的貼文資料,不包括圖片或網紅簡介,但我們仍透過在該資料集上微調Llama3-8b和yi-6b模型,有效評估了其在生成高價值的業配內容貼文方面的性能。 我們的微調方法涉及多種新技術。像是使用QLoRA來減少權重縮放,實現更高效的內存利用,並能夠在資源較少的情況下加快微調,或是通過xformers和Tri Dao的集成方法Flash Attention來更有效率地微調模型,並使用Causal Mask來加速訓練。此外,我們通過混合精度訓練優化了Cross Entropy Loss,這通過使用半精度進行計算和全精度進行權重更新來達到同時減少內存消耗與保持精度。 成效評估方面,我們使用了五種指標:CLIPScore,來衡量圖片和文本之間的適配度;LLM-Eval,一種基於GPT-4o的評估指標,用於評估內容的品質;正確的貼文標籤及標記用戶名之比例;Cosine Similarity,使用預訓練的sentence transformer計算文本的語意適配度;以及人為的綜合評估。我們在英語和中文任務中皆分別使用了GPT-3.5和GPT-4o模型作為比較的基準。 我們的研究結果表明,微調的Llama3-8b和yi-6b模型儘管參數僅有幾十億,但在業配貼文內容方面的表現對比具有約一兆參數量的GPT-4o等明顯更大的模型有著相當具競爭力的表現。這是在使用單個一般消費者級別之GPU的情況下完成的,突顯了這些模型在微調和推理中的高效性。我們還發現,在微調過程中變化輸入內容,例如包括或排除圖像描述和KOL簡介,會顯著影響生成標題的質量和一致性。本研究提供了一種可行且資源利用高效的生成高品質業配貼文內容的方法論,並為未來在此領域的研究建立了一個標竿。 | zh_TW |
| dc.description.abstract | This thesis presents a novel approach to generating sponsored post captions by fine-tuning large language models (LLMs). Leveraging an open dataset of English Instagram posts labeled as sponsored content, we fine-tuned the Llama3-8b model using Quantized Low-Rank Adaptation (QLoRA). Our methodology involved providing the LLM with various input features, including Key Opinion Leader (KOL) biographies, post hashtags, tagged usernames, and image descriptions generated by BLIP2, to assess their impact on the quality and effectiveness of the generated captions.
Additionally, in collaboration with iKala, we collected a Traditional Chinese (zh-tw) sponsored post dataset, which contains only text-based post information, without image descriptions or KOL biographies. We fine-tuned both the Llama3-8b and Yi-6B models on this dataset to evaluate their performance in generating engaging sponsored post captions. Our fine-tuning strategy combined several efficiency-oriented techniques. We employed QLoRA to quantize the frozen base weights to lower precision, achieving a more efficient memory footprint and enabling faster fine-tuning with fewer resources. We integrated Flash Attention via xformers and Tri Dao's implementation to optimize the transformer computation, and utilized a causal mask to speed up training. Additionally, we optimized the Cross Entropy Loss through mixed precision training, which reduced memory consumption while preserving accuracy by performing computation in half precision and weight updates in full precision. The generated captions were evaluated with five metrics: CLIPScore, to measure the alignment between images and text; LLM-Eval, a GPT-4o-based metric assessing content quality; the percentage of correctly reproduced hashtags and tagged usernames; cosine similarity computed with a pre-trained sentence transformer to measure semantic alignment; and a comprehensive manual evaluation involving human raters. For baseline comparisons, we employed the GPT-3.5 and GPT-4o models in both the English and Chinese settings. Our findings demonstrate that the fine-tuned Llama3-8b and Yi-6B models, despite their relatively small size of a few billion parameters, achieve results competitive with significantly larger models such as GPT-4o, which reportedly has around one trillion parameters. This was accomplished using a single consumer-grade GPU, highlighting these models' efficiency in both fine-tuning and batched inference. We also identified that varying the input content during fine-tuning, such as including or excluding image descriptions and KOL biographies, significantly affects the quality and consistency of the generated captions. This research contributes a viable and resource-efficient pipeline for generating high-quality sponsored post captions and establishes a benchmark for future studies in this domain. (Illustrative code sketches of the QLoRA setup and of the automatic evaluation metrics appear after the metadata table below.) | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-07T17:16:03Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-08-07T17:16:04Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | 致謝 i
摘要 ii
Abstract iv
Table of Contents vii
List of Figures x
List of Tables xi
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Object and Main Contribution 4
1.3 Thesis Organization 5
Chapter 2 Literature Review 7
2.1 Landscape of Influencer Marketing and the Role of GenAI 7
2.2 Evolution and Development of LLMs 8
2.3 Machine Learning Applications in Influencer Marketing 11
2.4 Summary 15
Chapter 3 Methodology 18
3.1 Problem Definition 18
3.2 Proposed Architecture 19
3.2.1 Methodological Overview 19
3.2.2 Image Caption Model 21
3.2.3 Caption Generator 23
3.3 Fine-tune Strategy 24
3.3.1 Quantized Low-Rank Adaptation (QLoRA) 25
3.3.2 bfloat16 Precision 25
3.3.3 Flash Attention 26
3.3.4 Mixed Precision Training 27
3.3.5 Instruction Tuning 28
3.4 Dataset 29
3.5 Evaluation Procedure 31
3.6 Fine-Tuning Process, Hyperparameters, and Hardware Configuration 31
3.7 Benchmarks 33
3.8 Evaluation Metrics 36
3.8.1 CLIPScore (Hessel et al., 2021) 36
3.8.2 LLM-Eval (Lin and Chen, 2023) 38
3.8.3 Percentage of Hashtags and Percentage of Tagged Usernames 40
3.8.4 Semantic Similarity 41
3.8.5 Human Evaluation 43
3.8.5.1 Human Evaluation Procedure 44
3.8.5.2 Rationale for Criteria and Model Selection 45
Chapter 4 Evaluation and Analysis 47
4.1 Evaluation Metrics 47
4.1.1 English Dataset 47
4.1.2 Traditional Chinese Dataset 50
4.2 Results Analysis 52
Chapter 5 Conclusion 60
5.1 Conclusion 60
5.2 Limitations 61
5.3 Future Research Directions 62
References 64 | - |
| dc.language.iso | en | - |
| dc.subject | 微調 | zh_TW |
| dc.subject | 大型語言模型 | zh_TW |
| dc.subject | 社群媒體 | zh_TW |
| dc.subject | 網紅行銷 | zh_TW |
| dc.subject | 業配 | zh_TW |
| dc.subject | Sponsorship | en |
| dc.subject | Influencer Marketing | en |
| dc.subject | Social Media | en |
| dc.subject | Finetuning | en |
| dc.subject | Large Language Model | en |
| dc.title | 微調大型語言模型的業配文貼文生成之應用:基於英文與繁體中文數據集的洞悉 | zh_TW |
| dc.title | Exploring Fine-Tuned LLMs for Effective Sponsored Post Generation: Insights from English and Traditional Chinese Data | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 陳以錚;楊立偉;陳建錦 | zh_TW |
| dc.contributor.oralexamcommittee | Yi-Cheng Chen;Li-wei Yang;Chien-Chin Chen | en |
| dc.subject.keyword | 大型語言模型, 微調, 業配, 網紅行銷, 社群媒體 | zh_TW |
| dc.subject.keyword | Large Language Model, Finetuning, Sponsorship, Influencer Marketing, Social Media | en |
| dc.relation.page | 73 | - |
| dc.identifier.doi | 10.6342/NTU202403184 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2024-08-06 | - |
| dc.contributor.author-college | 管理學院 | - |
| dc.contributor.author-dept | 資訊管理學系 | - |
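
The abstract above describes fine-tuning Llama3-8b with QLoRA on a single consumer-grade GPU. The snippet below is a minimal illustrative sketch of such a setup, not part of the thesis record, assuming the Hugging Face transformers, peft, and bitsandbytes libraries; the checkpoint name, LoRA rank, target modules, and other hyperparameters are placeholder assumptions rather than the author's actual configuration.

```python
# Illustrative QLoRA setup (assumed libraries: transformers, peft, bitsandbytes).
# All hyperparameters below are placeholders, not the thesis's configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Meta-Llama-3-8B"  # assumed checkpoint name

# 4-bit NF4 quantization of the frozen base weights; compute runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapter matrices are the only trainable parameters.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Because the quantized base weights stay frozen and only the small adapters are trained in bfloat16, memory use can fit on a single consumer GPU, which is the efficiency property the abstract emphasizes.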
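
The evaluation described in the abstract includes a cosine-similarity score computed with a pre-trained sentence transformer and the percentage of hashtags and tagged usernames that a generated caption reproduces. The sketch below shows one plausible way to compute such metrics, again not taken from the thesis, assuming the sentence-transformers package; the encoder name and the example captions are hypothetical.

```python
# Illustrative caption metrics (assumed library: sentence-transformers).
import re
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder

def semantic_similarity(generated: str, reference: str) -> float:
    """Cosine similarity between sentence embeddings of two captions."""
    emb = encoder.encode([generated, reference], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

def token_recall(generated: str, reference: str, pattern: str) -> float:
    """Share of reference hashtags/@mentions that reappear in the generated caption."""
    expected = set(re.findall(pattern, reference))
    if not expected:
        return 1.0
    found = set(re.findall(pattern, generated))
    return len(expected & found) / len(expected)

gen = "Loving my new sneakers! #ad #sponsored @brandname"
ref = "Check out these sneakers #sponsored @brandname"
print(semantic_similarity(gen, ref))
print(token_recall(gen, ref, r"#\w+"))   # hashtag percentage
print(token_recall(gen, ref, r"@\w+"))   # tagged-username percentage
```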
| Appears in Collections: | 資訊管理學系 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-112-2.pdf (restricted, not publicly accessible) | 1.89 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
