Contents of the Special Issue on Large Language Models and Knowledge Graphs

    Comparative Research on Representation Learning in Entity Alignment Based on Graph Neural Networks
    PENG Huang, ZENG Weixin, ZHOU Jie, TANG Jiuyang, ZHAO Xiang
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2343-2357.   DOI: 10.3778/j.issn.1673-9418.2307053
    Entity alignment is an important step in knowledge fusion; it aims to identify equivalent entities across different knowledge graphs. To determine equivalence accurately, existing methods first perform representation learning to map entities into a low-dimensional vector space, and then infer entity equivalence from the similarity between vectors. Recent work on entity alignment focuses on improving representation learning methods. To better understand the mechanisms of these models, mine valuable design directions, and provide a reference for subsequent optimization and improvement, this paper reviews research on representation learning methods for entity alignment. Firstly, a general representation learning framework is distilled from existing methods, and several representative works are summarized and analyzed within it. Then, these works are compared experimentally, and the common methods for each module of the framework are compared against one another. Based on the results, the advantages and disadvantages of the various methods are summarized and usage recommendations are offered. Finally, the feasibility of aligning and fusing large language models with knowledge graphs is preliminarily discussed, and the outstanding problems and challenges are analyzed.
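The embedding-and-similarity paradigm this survey covers can be illustrated with a minimal sketch (all names and toy vectors below are hypothetical, not from any surveyed system): entities from two knowledge graphs are assumed to already have learned embeddings, and equivalence is inferred greedily from cosine similarity.

```python
import math

def cosine(u, v):
    # Cosine similarity between two entity embeddings.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def align(entities_g1, entities_g2):
    """Greedy alignment: for each entity in KG1, pick the most
    similar entity in KG2 by embedding cosine similarity."""
    pairs = {}
    for e1, v1 in entities_g1.items():
        best = max(entities_g2, key=lambda e2: cosine(v1, entities_g2[e2]))
        pairs[e1] = best
    return pairs

# Toy 2-dimensional embeddings: equivalent entities lie close together.
kg1 = {"Paris_fr": [0.9, 0.1], "Berlin_fr": [0.1, 0.9]}
kg2 = {"Paris_en": [0.85, 0.15], "Berlin_en": [0.2, 0.8]}
print(align(kg1, kg2))
```

Real systems replace the toy vectors with embeddings learned by a graph neural network and replace the greedy argmax with stable matching or nearest-neighbour search, but the inference step is the same similarity comparison.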
    Survey of Causal Inference for Knowledge Graphs and Large Language Models
    LI Yuan, MA Xinyu, YANG Guoli, ZHAO Huiqun, SONG Wei
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2358-2376.   DOI: 10.3778/j.issn.1673-9418.2307065
    In recent decades, causal inference has been a significant research topic in various fields, including statistics, computer science, education, public policy, and economics. Most causal inference methods focus on the analysis of sample observational data and text corpora. However, with the emergence of various knowledge graphs and large language models, causal inference tailored to knowledge graphs and large models has gradually become a research hotspot. In this paper, causal inference methods are classified by their orientation towards sample observational data, text data, knowledge graphs, and large language models. Within each class, this paper provides a detailed analysis of classical research works, including their problem definitions, solution methods, contributions, and limitations. Particular emphasis is placed on recent advances in integrating causal inference methods with knowledge graphs and large language models. The various causal inference methods are analyzed and compared from the perspectives of efficiency and cost, and specific applications of knowledge graphs and large language models in causal inference tasks are summarized. Finally, future development directions of causal inference in combination with knowledge graphs and large models are discussed.
    Research on Question Answering System on Joint of Knowledge Graph and Large Language Models
    ZHANG Heyi, WANG Xin, HAN Lifan, LI Zhao, CHEN Zirui, CHEN Zhe
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2377-2388.   DOI: 10.3778/j.issn.1673-9418.2308070
    Large language models (LLMs) such as ChatGPT have shown outstanding performance in understanding and responding to human instructions, and have had a profound impact on natural language question answering (Q&A). However, for lack of training on vertical-domain data, LLM performance in vertical domains is not ideal; in addition, their high hardware requirements make training and deploying LLMs difficult. To address these challenges, this paper takes traditional Chinese medicine formulas as an example application, collects and preprocesses the domain-related data, and designs a vertical-domain Q&A system based on an LLM and a knowledge graph. The system has the following capabilities. (1) Information filtering: filter out vertical-domain questions and pass them to the LLM to answer. (2) Professional Q&A: generate answers with more professional knowledge based on the LLM and a self-built knowledge base; compared with fine-tuning on professional data, this technique deploys vertical-domain large models without retraining. (3) Extraction and conversion: by strengthening the LLM's information extraction ability, structured knowledge is extracted from its generated natural language responses and matched against a professional knowledge graph for professional verification; conversely, structured knowledge can be converted into readable natural language, achieving a deep integration of large models and knowledge graphs. Finally, the system is demonstrated and its performance is verified from both subjective and objective perspectives, through two experiments: subjective evaluation by experts and objective evaluation on multiple-choice questions.
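The filter-then-answer pipeline described in capabilities (1) and (2) can be sketched as follows. This is a minimal illustration under assumptions of my own: the keyword filter, the substring-based retrieval, and the stub LLM are all hypothetical stand-ins, not the paper's components.

```python
def is_domain_question(q, domain_keywords):
    # Step 1 (information filtering): keep only vertical-domain questions.
    return any(k in q for k in domain_keywords)

def answer(q, llm, kb):
    """Hypothetical pipeline: filter the question, retrieve supporting
    facts from a self-built knowledge base, then let the LLM answer
    with that context prepended (no retraining involved)."""
    keywords = {"formula", "herb", "decoction"}
    if not is_domain_question(q, keywords):
        return "Out of domain."
    facts = [f for f in kb if any(w in f for w in q.split())]
    prompt = "Context: " + "; ".join(facts) + "\nQuestion: " + q
    return llm(prompt)

# Stub LLM for illustration: echoes the retrieved context line.
kb = ["Guizhi decoction contains cinnamon twig",
      "Ephedra decoction treats colds"]
stub_llm = lambda p: p.split("Context: ")[1].split("\n")[0]
print(answer("What does Guizhi decoction contain?", stub_llm, kb))
```

The design point is that domain knowledge enters through the prompt context rather than through fine-tuning, which is what lets a general LLM be deployed in a vertical domain without retraining.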
    Named Entity Recognition Method of Large Language Model for Medical Question Answering System
    YANG Bo, SUN Xiaohu, DANG Jiayi, ZHAO Haiyan, JIN Zhi
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2389-2402.   DOI: 10.3778/j.issn.1673-9418.2307061
    In medical question answering systems, entity recognition plays a major role, and deep-learning-based entity recognition has received increasing attention. However, for lack of annotated training data, deep learning methods cannot reliably identify discontinuous and nested entities in medical text. Therefore, an entity recognition method based on a large language model is proposed and applied to a medical question answering system. Firstly, the medical Q&A dataset is processed into text that the large language model can analyze. Secondly, the output of the large language model is classified, and each class is processed accordingly. Then, intent recognition is performed on the input text, and finally the results of entity recognition and intent recognition are used to query the medical knowledge graph, yielding the answer to the medical question. Experiments on three typical datasets, compared with several representative related methods, show that the proposed method performs better.
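The three post-processing stages described above (classify the LLM output, recognize the intent, query the knowledge graph) can be sketched as below. The line format, intent keywords, and toy triple store are assumptions for illustration, not the paper's actual formats.

```python
def classify_llm_output(raw):
    """Hypothetical post-processing of the LLM's entity-recognition
    output: each line is 'TYPE: mention'; unknown types are discarded."""
    valid = {"disease", "drug", "symptom"}
    entities = []
    for line in raw.strip().splitlines():
        etype, _, mention = line.partition(":")
        etype, mention = etype.strip().lower(), mention.strip()
        if etype in valid and mention:
            entities.append((etype, mention))
    return entities

def recognize_intent(question):
    # Minimal keyword-based intent recognition (an assumption,
    # not the paper's intent model).
    if "treat" in question or "cure" in question:
        return "treatment"
    if "symptom" in question:
        return "symptom_query"
    return "definition"

def query_kg(kg, entities, intent):
    # Look up (entity mention, intent) pairs in a toy triple store.
    return [kg[(m, intent)] for _, m in entities if (m, intent) in kg]

kg = {("diabetes", "treatment"): "metformin"}
ents = classify_llm_output("disease: diabetes\nnoise: xyz")
print(query_kg(kg, ents, recognize_intent("How to treat diabetes?")))
```

Note how the classification step acts as a guard: LLM output lines that do not parse into a known entity type are dropped before the knowledge graph is ever queried.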
    Differentiable Rule Extraction with Large Language Model for Knowledge Graph Reasoning
    PAN Yudai, ZHANG Lingling, CAI Zhongmin, ZHAO Tianzhe, WEI Bifan, LIU Jun
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2403-2412.   DOI: 10.3778/j.issn.1673-9418.2306049
    Knowledge graph (KG) reasoning predicts missing entities or relations in incomplete triples, completes structured knowledge, and serves different downstream tasks. Unlike the widely studied black-box methods, such as those based on representation learning, rule-extraction-based methods achieve an interpretable reasoning paradigm by generalizing first-order logic rules from the KG. To bridge the gap between the discrete symbolic space and the continuous embedding space, a differentiable rule extraction method based on a large pre-trained language model (DRaM) is proposed, which integrates discrete first-order logic rules with a continuous vector space. Given the influence of the order of atoms in first-order logic rules on the reasoning process, a large pre-trained language model is introduced to encode that process. DRaM achieves good results on link prediction over three knowledge graph datasets, Family, Kinship, and UMLS, especially on the Hits@10 metric. Comprehensive experimental results show that DRaM effectively addresses differentiable reasoning over KGs and can extract first-order logic rules with confidence scores from the reasoning process. DRaM not only improves reasoning performance with the help of first-order logic rules, but also enhances the interpretability of the method.
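For intuition about the symbolic side of such methods, here is a minimal sketch (mine, not DRaM's algorithm) of applying one chain rule with a confidence score over a KG of weighted facts, in the style of rules like grandparent(x, z) <- parent(x, y) AND parent(y, z):

```python
def apply_rule(facts, body, head, rule_conf):
    """Apply a length-2 chain rule  head(x,z) <- body0(x,y) ^ body1(y,z),
    propagating confidence as the product of rule and fact confidences.
    `facts` maps (relation, subject, object) triples to confidence scores."""
    derived = {}
    r1, r2 = body
    for (p, x, y), c1 in facts.items():
        if p != r1:
            continue
        for (q, y2, z), c2 in facts.items():
            if q == r2 and y2 == y:
                key = (head, x, z)
                score = rule_conf * c1 * c2
                # Keep the best derivation if a triple is derived twice.
                derived[key] = max(derived.get(key, 0.0), score)
    return derived

facts = {("parent", "ann", "bob"): 1.0, ("parent", "bob", "carl"): 0.9}
print(apply_rule(facts, ("parent", "parent"), "grandparent", 0.8))
# derives grandparent(ann, carl) with confidence 0.8 * 1.0 * 0.9
```

Differentiable approaches soften exactly this step: the hard joins over triples become tensor operations over embeddings so that rule confidences can be learned by gradient descent, while the extracted rules remain readable.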
    Influence Evaluation of Telecom Fraud Case Types Based on ChatGPT
    PEI Bingsen, LI Xin, WU Yue
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2413-2425.   DOI: 10.3778/j.issn.1673-9418.2306044
    At present, telecommunications fraud crimes are on the rise, posing a serious threat to the safety of people's property. To optimize anti-fraud strategies, objectively and accurately analyze the trends and characteristics of different types of telecommunications fraud cases, and determine the most influential criminal methods, a ChatGPT-based method for assessing the impact of telecommunications fraud case types is proposed. A knowledge graph is used to structure the content of case texts, and fraud methods are quantified by taking the time of the incident, the amount involved, and the number of individuals involved as factors for evaluating a case's impact. Firstly, ChatGPT is used, through multiple rounds of Q&A, to preprocess and extract knowledge from the text corpus of telecommunications fraud cases, so that a case knowledge graph for the telecommunications fraud domain can be constructed quickly and with few resources. On top of the knowledge graph, factors such as incident time, amount involved, and number of involved parties are analyzed statistically, and the impact of different case types is abstracted into influencing factors, which are used to characterize incident trends and features for comprehensive analysis and judgment. This paper analyzes existing case data, calculates the impact factors of the case types, obtains how the impact factors of different case types change, verifies that the impact factor calculation is scientific and effective, and provides a new method for evaluating telecommunications fraud types. Combining the advantages of ChatGPT and knowledge graphs helps to grasp trends in case development in a timely manner, provides strong support and guidance for combating telecommunications fraud, and is of great significance for protecting public property safety and social stability.
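A toy version of an impact factor built from the three factors named above (case count, amount involved, number of involved parties) might look like the following. The weights and the max-normalization are assumptions of this sketch; the paper's actual formula is not given in the abstract.

```python
def impact_factor(cases, weights=(0.3, 0.4, 0.3)):
    """Toy impact factor per case type: weighted sum of max-normalized
    case count, total amount involved, and number of victims.
    `cases` is a list of (case_type, amount, victims) records."""
    w_n, w_amt, w_vic = weights
    totals = {}
    for ctype, amount, victims in cases:
        n, a, v = totals.get(ctype, (0, 0.0, 0))
        totals[ctype] = (n + 1, a + amount, v + victims)
    # Normalize each factor by its maximum across case types.
    max_n = max(t[0] for t in totals.values())
    max_a = max(t[1] for t in totals.values())
    max_v = max(t[2] for t in totals.values())
    return {c: round(w_n * n / max_n + w_amt * a / max_a + w_vic * v / max_v, 3)
            for c, (n, a, v) in totals.items()}

cases = [("fake_customer_service", 50000, 1),
         ("fake_customer_service", 30000, 2),
         ("investment_scam", 200000, 1)]
print(impact_factor(cases))
```

Computing such a score per time window (e.g. per month) would yield the kind of trend curves of impact factors per case type that the paper uses for analysis and judgment.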