A few-shot learning method based on knowledge graph in large language models

Wang, FeiLong; Shi, Donghui; Aguilar, Jose; Cui, Xinyi

doi:10.1007/s41060-024-00699-3

Files

original version (885.0Kb)

Identifiers

URI: https://hdl.handle.net/20.500.12761/1886

ISSN: 2364-415X

DOI: 10.1007/s41060-024-00699-3

Metadata

Author(s)

Wang, FeiLong; Shi, Donghui; Aguilar, Jose; Cui, Xinyi

Date

2024-12-15

Abstract

The emergence of large language models has significantly transformed natural language processing and text generation. Fine-tuning these models for specific domains enables them to generate answers tailored to the unique requirements of those fields, such as in legal or medical domains. However, these models often perform poorly in few-shot scenarios. Herein, the challenges of data scarcity in fine-tuning large language models in low-sample scenarios were addressed by proposing three different KDGI (Knowledge-Driven Dialog Generation Instances) generation strategies, including entity-based KDGI generation, relation-based KDGI generation, and semantic-based multi-level KDGI generation. These strategies aimed to enhance few-shot datasets to address the issue of low fine-tuning metrics caused by insufficient data. Specifically, knowledge graphs were utilized to define the distinct KDGI generation strategies for enhancing few-shot data. Subsequently, these KDGI data were employed to fine-tune the large language model using the P-tuning v2 approach. Through multiple experiments, the effectiveness of the three KDGI generation strategies was validated using BLEU and ROUGE metrics, and the fine-tuning benefits of few-shot learning on large language models were confirmed. To further evaluate the effectiveness of KDGI, additional experiments were conducted, including LoRA-based fine-tuning in the medical domain and comparative studies leveraging Mask Language Model augmentation, back-translation, and noise injection methods. Consequently, the paper proposes a reference method for leveraging knowledge graphs in prompt data engineering, which shows potential in facilitating few-shot learning for fine-tuning large language models.
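
As a rough illustration of how knowledge-graph triples might be turned into dialog-style fine-tuning instances, the sketch below builds prompt/response pairs in an entity-based and a relation-based manner. It is not the authors' implementation: the triples, the prompt templates, and the function names `entity_based_instances` and `relation_based_instances` are all assumptions made for this example, and the semantic-based multi-level strategy is not covered.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples.
# All entities and relations here are illustrative assumptions.
TRIPLES = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "has_side_effect", "stomach irritation"),
    ("ibuprofen", "treats", "headache"),
]


def entity_based_instances(triples):
    """Build one instance per head entity, summarizing every fact about it."""
    facts = defaultdict(list)
    for head, relation, tail in triples:
        facts[head].append(f"{relation.replace('_', ' ')} {tail}")
    return [
        {
            "prompt": f"What is known about {head}?",
            "response": f"{head} " + "; ".join(items) + ".",
        }
        for head, items in facts.items()
    ]


def relation_based_instances(triples):
    """Build one instance per triple, asking for the tail of a given relation."""
    return [
        {
            "prompt": f"For {head}, what is the value of the relation '{relation}'?",
            "response": tail,
        }
        for head, relation, tail in triples
    ]


if __name__ == "__main__":
    for instance in entity_based_instances(TRIPLES) + relation_based_instances(TRIPLES):
        print(instance)
```

Under these assumptions, the generated pairs would be appended to the original few-shot dataset before fine-tuning, so that each graph fact contributes at least one additional training example.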