Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18332
Full metadata record
DC field, value, language
dc.contributor.advisor: 項潔 (Jieh Hsiang)
dc.contributor.author: Kuan-Hung Chen [en]
dc.contributor.author: 陳冠宏 [zh_TW]
dc.date.accessioned: 2021-06-08T01:00:09Z
dc.date.copyright: 2020-08-21
dc.date.issued: 2020
dc.date.submitted: 2020-08-13
dc.identifier.citation:
[1] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I. (2018). Language Models are Unsupervised Multitask Learners.
[2] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proc. 2019 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Vol. 1 (Long and Short Papers), pp. 4171–4186. https://aclweb.org/anthology/papers/N/N19/N19-1423/
[3] Lupu, M. (2017). Information retrieval, machine learning, and Natural Language Processing for intellectual property information. World Patent Information, 49, A1–A3. https://doi.org/10.1016/j.wpi.2017.06.002
[4] Sun, Y., Wang, S., Li, Y., Feng, S., Chen, X., Zhang, H., Tian, X., Zhu, D., Tian, H., Wu, H. (2019). Ernie: Enhanced representation through knowledge integration. arXiv:1904.09223 [cs]. http://arxiv.org/abs/1904.09223
[5] Sennrich, R., Haddow, B., Birch, A. (2016). Improving neural machine translation models with monolingual data. arXiv:1511.06709 [cs]. http://arxiv.org/abs/1511.06709
[6] Lee, J.-S., Hsiang, J. (2020). Patent classification by fine-tuning BERT language model. World Patent Information, 61, 101965. https://doi.org/10.1016/j.wpi.2020.101965
[7] Lee, J.-S., Hsiang, J. (2019). Patent Claim Generation by Fine-Tuning OpenAI GPT-2. arXiv:1907.02052 [cs, stat]. http://arxiv.org/abs/1907.02052
[8] Lee, J.-S. (2020). PatentTransformer: A Framework for Personalized Patent Claim Generation.
[9] Lee, J.-S., Hsiang, J. (2020). Measuring and Controlling Text Generation by Semantic Search: A Textual Similarity Approach for PatentTransformer. https://doi.org/10.1145/3366424.3382086
[10] Sutskever, I., Vinyals, O., Le, Q. V. (2014). Sequence to sequence learning with neural networks. arXiv:1409.3215 [cs]. http://arxiv.org/abs/1409.3215
[11] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., Polosukhin, I. (2017). Attention is all you need. arXiv:1706.03762 [cs]. http://arxiv.org/abs/1706.03762
[12] Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., Artzi, Y. (2020). Bertscore: Evaluating text generation with bert. arXiv:1904.09675 [cs]. http://arxiv.org/abs/1904.09675
[13] Du, Z. (2019). GPT2-Chinese: Tools for training GPT2 model in Chinese language. https://github.com/Morizeyao/GPT2-Chinese
[14] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Brew, J. (2019). HuggingFace's Transformers: State-of-the-art Natural Language Processing. ArXiv, abs/1910.03771. https://github.com/huggingface/pytorchpretrained-BERT (accessed June 3, 2019).
[15] Sun, Y., Wang, S., Li, Y., Feng, S., Chen, X., Zhang, H., Tian, X., Zhu, D., Tian, H., Wu, H. (2019). Ernie: Enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223.
[16] Sellam, T., Das, D., Parikh, A. P. (2020). Bleurt: Learning robust metrics for text generation. arXiv:2004.04696 [cs]. http://arxiv.org/abs/2004.04696
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/18332
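dc.description.abstract [zh_TW, translated]: Applying for a patent is a process that every inventor must go through; only through a patent application can an inventor's legitimate rights be protected. Each inventor must file an application with the Intellectual Property Office of the Ministry of Economic Affairs. Once the Office approves the application, the patent holder enjoys, for a fixed period, the exclusive right to prevent others from using the method, or from using or offering to sell the invention, without consent. This legal right is a major safeguard for inventors.
However, in the application process the content of the patent claims is the key factor in whether a patent is granted, so many inventors rely on professional agents to draft their claims. To help inventors complete their applications quickly, and in the hope of stimulating innovation in Taiwan, this study uses machine learning, generating patent claims with GPT-2, to address inventors' difficulty in mastering how claims should be written. The generated results reach a BERTScore of 0.68.
This study also proposes an evaluation model that scores the relevance and quality of the relationship between a ‘patent abstract’ and its ‘patent claims’. Besides serving as a reference measure for the output of the GPT-2 generation model, it can also gauge the quality of ‘patent claims’ written by inventors.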
dc.description.abstract [en]: Filing a patent application is an essential process for an inventor. Through legal guarantees, it protects the rights that an inventor holds in an invention. Each inventor files a patent application with the Intellectual Property Office, Ministry of Economic Affairs. Once the application has been approved, the inventor may exclude unauthorized uses of the invention during the patent’s period of validity. This legal protection helps establish a sound environment for innovation and creation.
A core factor that affects the outcome of a patent application is the patent claims. Therefore, many inventors seek help from a professional agency to compose viable patent claims. To assist inventors and improve the efficiency of patent applications, this research uses machine learning, specifically the GPT-2 model, to generate patent claims. The generated claims achieve a BERTScore of 0.68.
We also propose an evaluation method to assess the relevance and quality of the relationship between a ‘patent abstract’ and its ‘patent claims’. In addition to providing a metric for evaluating patent claims generated by the GPT-2 model, our method also enables inventors to evaluate the quality of their own claims.
dc.description.provenance [en]: Made available in DSpace on 2021-06-08T01:00:09Z (GMT). No. of bitstreams: 1
U0001-1208202023032400.pdf: 1343401 bytes, checksum: f661d3aefbffcd4a70e0e2d4f22368ec (MD5)
Previous issue date: 2020
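dc.description.tableofcontents:
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Table of Contents v
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
1.1 Research Motivation 1
1.2 Research Objectives 3
1.3 Thesis Organization 3
Chapter 2 Literature Review 4
2.1 Overview 4
2.2 Natural Language Processing 4
2.3 Data Augmentation 5
2.4 Patent Claim Generation 6
Chapter 3 Patent Data Types and Characteristics 7
3.1. Overview 7
3.2. Patent Specification 7
3.3. Patent Classification 7
3.4. Dataset Description 8
3.5. Fields of the Patent Specification 8
3.6. Data Examples 9
Chapter 4 Methodology 10
4.1 Overview 10
4.2 Notation 10
4.3 Module Architecture 11
4.4 Data Preprocessing 11
4.5 Patent Claim Generation 15
4.5.1 LSTM seq2seq 15
4.5.2 GPT-2 16
4.6 Evaluation and Analysis 19
4.6.1 Relevance and Quality of Abstracts and Patent Claims 19
4.6.2 Semantic Similarity between "Generated Patent Claims" and "Actual Patent Claims" 22
4.7 Data Augmentation 22
Chapter 5 Experimental Results and Evaluation 23
5.1 Overview 23
5.2 Dataset Split 23
5.3 Model Parameters 23
5.4 Experimental Details 24
5.4.1 Word Segmentation Tool 25
5.4.2 Training Process 25
5.4.3 Generation and Post-processing 25
5.5 Examples of Generated Results 26
5.6 Semantic Similarity between "Generated Patent Claims" and "Actual Patent Claims" 27
5.7 Relevance and Quality of "Patent Abstracts" and "Patent Claims" 28
Chapter 6 Conclusion and Future Work 31
6.1. Conclusion 31
6.2. Future Work 31
References 33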
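dc.language.iso: zh-TW
dc.title: 以深度學習協助發明人生成專利範圍 [zh_TW]
dc.title: Assisting Patent Claim Generation with Deep Learning [en]
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 李界昇 (Jieh-Sheng Lee), 蔡宗翰 (Tzong-Han Tsai), 謝育平 (Yu-Ping Hsieh)
dc.subject.keyword [zh_TW, translated]: Patent Claim Generation, Natural Language Generation, Evaluation Metrics, Deep Learning, GPT-2
dc.subject.keyword [en]: Patent Claim Generation, Text Generation, Evaluation, Deep Learning, GPT-2
dc.relation.page: 34
dc.identifier.doi: 10.6342/NTU202003167
dc.rights.note: Not authorized
dc.date.accepted: 2020-08-13
dc.contributor.author-college: College of Electrical Engineering and Computer Science [zh_TW, translated]
dc.contributor.author-dept: Graduate Institute of Computer Science and Information Engineering [zh_TW, translated]
Appears in collections: Department of Computer Science and Information Engineering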

Files in this item:
File: U0001-1208202023032400.pdf (not currently authorized for public access)
Size: 1.31 MB
Format: Adobe PDF