Most Downloaded Articles


    Published in last 1 year
    Survey of Multimodal Data Fusion Research
    ZHANG Hucheng, LI Leixiao, LIU Dongjiang
    Journal of Frontiers of Computer Science and Technology    2024, 18 (10): 2501-2520.   DOI: 10.3778/j.issn.1673-9418.2403083
    Although deep learning has achieved excellent results in single-modal applications, the feature representation of a single modality can rarely capture the complete information of a phenomenon. To overcome this limitation and make fuller use of the value contained in multiple modalities, researchers have proposed multimodal fusion to improve model learning performance. Multimodal fusion exploits the correlation and complementarity between modalities such as text, speech, image and video to produce a better fused feature representation, which provides a basis for model training. Research on multimodal fusion is still at an early stage of development. Starting from the areas of multimodal fusion that have attracted the most attention in recent years, this paper reviews multimodal fusion methods and the multimodal alignment techniques used during fusion. Firstly, the applications, advantages and disadvantages of joint fusion, cooperative fusion, encoder fusion and split fusion methods are analyzed. The problem of multimodal alignment in the fusion process, covering explicit and implicit alignment, is then discussed together with its applications, advantages and disadvantages. Secondly, the use of popular datasets for multimodal fusion in different fields in recent years is described. Finally, the challenges and research prospects of multimodal fusion are discussed to further promote its development and application.
    Abstract2045
    PDF1299
    Deep Learning-Based Infrared and Visible Image Fusion: A Survey
    WANG Enlong, LI Jiawei, LEI Jia, ZHOU Shihua
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 899-915.   DOI: 10.3778/j.issn.1673-9418.2306061
    How to preserve the complementary information in multiple images to represent the scene in one image is a challenging topic. Based on this topic, various image fusion methods have been proposed. As an important branch of image fusion, infrared and visible image fusion (IVIF) has a wide range of applications in segmentation, target detection and military reconnaissance fields. In recent years, deep learning has led the development direction of image fusion. Researchers have explored the field of IVIF using deep learning. Relevant experimental work has proven that applying deep learning to achieving IVIF has significant advantages compared with traditional methods. This paper provides a detailed analysis on the advanced algorithms for IVIF based on deep learning. Firstly, this paper reports on the current research status from the aspects of network architecture, method innovation, and limitations. Secondly, this paper introduces the commonly used datasets in IVIF methods and provides the definition of commonly used evaluation metrics in quantitative experiments. Qualitative and quantitative evaluation experiments of fusion and segmentation and fusion efficiency analysis experiments are conducted on some representative methods mentioned in the paper to comprehensively evaluate the performance of the methods. Finally, this paper provides conclusions and prospects for possible future research directions in the field.
    Abstract797
    PDF653
    Review of Attention Mechanisms in Reinforcement Learning
    XIA Qingfeng, XU Ke'er, LI Mingyang, HU Kai, SONG Lipeng, SONG Zhiqiang, SUN Ning
    Journal of Frontiers of Computer Science and Technology    2024, 18 (6): 1457-1475.   DOI: 10.3778/j.issn.1673-9418.2312006
    In recent years, the combination of reinforcement learning and attention mechanisms has attracted increasing attention in algorithm research. Attention mechanisms play an important role in improving the performance of reinforcement learning algorithms. This paper focuses on the development of attention mechanisms in deep reinforcement learning and examines their applications in multi-agent reinforcement learning. Firstly, the background and development of attention mechanisms and reinforcement learning are introduced, and relevant experimental platforms in this field are presented. Secondly, classical reinforcement learning algorithms and attention mechanisms are reviewed, and attention mechanisms are categorized from different perspectives. Thirdly, practical applications of attention mechanisms in reinforcement learning are organized by three types of tasks (fully cooperative, fully competitive and mixed), with a focus on multi-agent applications. Finally, the improvements that attention mechanisms bring to reinforcement learning algorithms are summarized, and the challenges and future prospects in this field are discussed.
    Abstract686
    PDF638
    Review of Research on 3D Reconstruction of Dynamic Scenes
    SUN Shuifa, TANG Yongheng, WANG Ben, DONG Fangmin, LI Xiaolong, CAI Jiacheng, WU Yirong
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 831-860.   DOI: 10.3778/j.issn.1673-9418.2305016
    As static scene 3D reconstruction algorithms become more mature, dynamic scene 3D reconstruction has become a hot and challenging research topic in recent years. Existing static scene 3D reconstruction algorithms have good reconstruction results for stationary objects. However, when objects in the scene undergo deformation or relative motion, their reconstruction results are not ideal. Therefore, developing research on 3D reconstruction of dynamic scenes is essential. This paper first introduces the related concepts and basic knowledge of 3D reconstruction, as well as the research classification and current status of static and dynamic scene 3D reconstruction. Then, the latest research progress on dynamic scene 3D reconstruction is comprehensively summarized, and the reconstruction algorithms are classified into dynamic 3D reconstruction based on RGB data sources and dynamic 3D reconstruction based on RGB-D data sources. RGB data sources can be further divided into template based dynamic 3D reconstruction, non rigid motion recovery structure based dynamic 3D reconstruction, and learning based dynamic 3D reconstruction under RGB data sources. The RGB-D data source mainly summarizes dynamic 3D reconstruction based on learning, with typical examples introduced and compared. The applications of dynamic scene 3D reconstruction in medical, intelligent manufacturing, virtual reality and augmented reality, and transportation fields are also discussed. Finally, future research directions for dynamic scene 3D reconstruction are proposed, and an outlook on the research progress in this rapidly developing field is presented.
    Abstract672
    PDF542
    Research Progress in Application of Deep Learning in Animal Behavior Analysis
    SHEN Tong, WANG Shuo, LI Meng, QIN Lunming
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 612-626.   DOI: 10.3778/j.issn.1673-9418.2306033
    In recent years, animal behavior analysis has become one of the most important methods in neuroscience and artificial intelligence. Taking advantage of powerful deep-learning-based image analysis, researchers have developed state-of-the-art automatic animal behavior analysis methods with complex functions. Compared with traditional methods, these methods require no special labeling and can estimate and track animal pose efficiently. They can also be applied in settings close to a natural environment, which gives them potential for complex animal behavior experiments. This paper therefore reviews the application of deep learning in animal behavior analysis. Firstly, it analyzes the tasks and current status of animal behavior analysis. Then, it highlights and compares existing deep learning-based animal behavior analysis tools. According to the dimension of experimental analysis, these tools are divided into two-dimensional and three-dimensional animal behavior analysis tools, and their functions, performance and scope of application are discussed. Furthermore, existing animal datasets and evaluation metrics are introduced, and the algorithmic mechanisms used in existing tools are summarized in terms of advantages, limitations and applicable scenarios. Finally, the outlook for deep learning-based animal behavior analysis tools is discussed in terms of datasets, experimental paradigms and low latency.
    Abstract620
    PDF529
    Survey on Deep Learning in Oriented Object Detection in Remote Sensing Images
    LAN Xin, WU Song, FU Boyi, QIN Xiaolin
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 861-877.   DOI: 10.3778/j.issn.1673-9418.2308031
    Objects in remote sensing images are arbitrarily oriented and densely arranged, so inclined bounding boxes allow them to be located and separated more precisely in object detection tasks. Oriented object detection in remote sensing images is now widely applied in both civil and military defense fields, is of great significance for research and application, and has gradually become a research hotspot. This paper provides a systematic summary of oriented object detection methods in remote sensing images. Firstly, three widely used representations of inclined bounding boxes are summarized. Then, the main challenges faced in supervised learning are elaborated from four aspects: feature misalignment, boundary discontinuity, inconsistency between metric and loss, and oriented object location. Next, according to the motivations and improvement strategies of different methods, the main ideas, advantages and disadvantages of each algorithm are analyzed in detail, and the overall framework of oriented object detection in remote sensing images is summarized. Furthermore, the commonly used oriented object detection datasets in the remote sensing field are introduced. Experimental results of classical methods on different datasets are given, and the performance of different methods is evaluated. Finally, based on the challenges of applying deep learning to oriented object detection in remote sensing images, future research trends in this direction are discussed.
    Abstract491
    PDF508
    Survey of Development of YOLO Object Detection Algorithms
    XU Yanwei, LI Jun, DONG Yuanfang, ZHANG Xiaoli
    Journal of Frontiers of Computer Science and Technology    2024, 18 (9): 2221-2238.   DOI: 10.3778/j.issn.1673-9418.2402044
    In recent years, deep learning-based object detection algorithms have been a hot topic in computer vision research, with the YOLO (you only look once) algorithm standing out as an excellent object detection algorithm. The evolution of its network architecture has played a crucial role in improving detection speed and accuracy. This paper conducts a comprehensive horizontal analysis of the overall frameworks of YOLOv1 to YOLOv9, comparing their network architectures (backbone network, neck layers and head layers) and loss functions. The strengths and limitations of different improvement methods are thoroughly discussed, with a specific evaluation of the impact of these improvements on model accuracy. This paper also discusses dataset selection and construction methods, the rationale behind choosing different evaluation metrics, and their applicability and limitations in various application scenarios. It further explores specific improvement methods for the YOLO algorithm in five application domains (industrial, transportation, remote sensing, agriculture, biology), and discusses the balance among detection speed, accuracy, and complexity in these domains. Finally, this paper analyzes the current development status of YOLO in various fields, summarizes existing issues in YOLO algorithm research through specific examples, and, in conjunction with trends in the application domains, provides an outlook on the future of the YOLO algorithm. It also offers detailed explanations of four future research directions for YOLO (multi-task learning, edge computing, multimodal integration, virtual and augmented reality technology).
    Abstract499
    PDF480
    Construction and Application of Knowledge Graph for Water Engineering Scheduling Based on Large Language Model
    FENG Jun, CHANG Yanghong, LU Jiamin, TANG Hailin, LYU Zhipeng, QIU Yuchun
    Journal of Frontiers of Computer Science and Technology    2024, 18 (6): 1637-1647.   DOI: 10.3778/j.issn.1673-9418.2311098
    With the growth of water conservancy and the increasing demand for information, handling and representing large volumes of water-related data has become complex. In particular, scheduling text data often exists in natural language form and lacks clear structure and standardization. Processing and utilizing such diverse data requires extensive domain knowledge and professional expertise. To tackle this challenge, a method based on large language models is proposed to construct a knowledge graph for water engineering scheduling. The approach collects and preprocesses scheduling rule data at the data layer, leverages large language models to extract embedded knowledge and construct the ontology at the conceptual layer, and applies a “three-step” prompt strategy for extraction at the instance layer. Through the interaction of the data, conceptual, and instance layers, high-performance extraction from rule texts is achieved, and the construction of the dataset and knowledge graph is completed. Experimental results show that the F1 value of the proposed extraction method reaches 85.5%, and the effectiveness and rationality of the modules of the large language model are validated through ablation experiments. The graph integrates dispersed water conservancy rule information, effectively handles unstructured textual data, and offers visualization querying and functionality tracing. It aids professionals in assessing water conditions and selecting appropriate scheduling schemes, providing valuable support for conservancy decision-making and intelligent reasoning.
    Abstract457
    PDF466
    Review of Application of Generative Adversarial Networks in Image Restoration
    GONG Ying, XU Wentao, ZHAO Ce, WANG Binjun
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 553-573.   DOI: 10.3778/j.issn.1673-9418.2307073
    With the rapid development of generative adversarial networks (GANs), many image restoration problems that are difficult to solve with traditional methods have gained new research approaches. With their powerful generative ability, GANs can restore intact images from damaged ones, so they are widely used in image restoration. To summarize recent theory and research on using GANs to repair damaged images, the applications of image restoration are divided, according to the categories of damaged images and the repair methods suited to them, into three main aspects: image inpainting, image deblurring, and image denoising. For each aspect, the applications are further subdivided by technical principles, application objects and other dimensions. For image inpainting, different GAN-based image completion methods are discussed from the perspectives of conditional guidance and latent coding. For image deblurring, the essential differences between motion-blurred and static-blurred images and their repair methods are explained. For image denoising, denoising methods tailored to different categories of images are summarized. For each type of application, the characteristics of the specific GAN models employed are summarized. Finally, the advantages and disadvantages of GANs applied to image restoration are summarized, and future application scenarios are discussed.
    Abstract447
    PDF446
    Overview of Cross-Chain Identity Authentication Based on DID
    BAI Yirui, TIAN Ning, LEI Hong, LIU Xuefeng, LU Xiang, ZHOU Yong
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 597-611.   DOI: 10.3778/j.issn.1673-9418.2304003
    With the emergence of concepts such as the metaverse and Web3.0, blockchain plays a very important role in many fields. Cross-chain technology is an important technical means of achieving inter-chain interconnection and value transfer. At this stage, traditional cross-chain technologies such as notary schemes and sidechains have trust issues. At the same time, in cross-chain identity authentication, identities are not unified across chains and users do not have control over their own identities. Firstly, this paper systematically summarizes the development process and technical solutions of digital identity and cross-chain technology, and analyzes and compares four digital identity models and nine mainstream cross-chain projects. Secondly, by analyzing the main research results on cross-chain identity authentication in recent years, a general model of cross-chain identity authentication is designed, and the shortcomings of existing solutions are summarized. Then, the paper focuses on DID-based cross-chain identity authentication schemes, and analyzes the technical characteristics, advantages and disadvantages of different solutions. On this basis, three DID-based cross-chain identity authentication models are summarized, their main implementation steps are functionally described, and their advantages, limitations and efficiency are analyzed. Finally, in view of the shortcomings of current DID-based cross-chain identity authentication models, development difficulties are discussed and five possible future research directions are given.
    Abstract501
    PDF428
    Survey of Transformer-Based Single Image Dehazing Methods
    ZHANG Kaili, WANG Anzhi, XIONG Yawei, LIU Yun
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1182-1196.   DOI: 10.3778/j.issn.1673-9418.2307103
    As a fundamental computer vision task, image dehazing aims to preprocess degraded images by restoring color contrast and texture information to improve visibility and image quality, so that clear images can be recovered for subsequent high-level visual tasks such as object detection, tracking, and object segmentation. In recent years, neural network-based dehazing methods have achieved notable success, with a growing number of Transformer-based dehazing approaches being proposed. So far, however, no comprehensive review has thoroughly analyzed Transformer-based image dehazing algorithms. To fill this gap, this paper comprehensively surveys Transformer-based daytime, nighttime and remote sensing image dehazing algorithms, covering the fundamental principles of the various types of dehazing algorithms and exploring their applicability and performance in different scenarios. In addition, the commonly used datasets and evaluation metrics in image dehazing tasks are introduced. On this basis, the performance of existing representative dehazing algorithms is analyzed from both quantitative and qualitative perspectives, and typical dehazing algorithms are compared in terms of dehazing effect, running speed and resource consumption. Finally, the application scenarios of image dehazing technology are summarized, and the challenges and future development directions in the field of image dehazing are analyzed and discussed.
    Abstract465
    PDF428
    Critical Review of Multi-focus Image Fusion Based on Deep Learning Method
    LI Ziqi, SU Yuxuan, SUN Jun, ZHANG Yonghong, XIA Qingfeng, YIN Hefeng
    Journal of Frontiers of Computer Science and Technology    2024, 18 (9): 2276-2292.   DOI: 10.3778/j.issn.1673-9418.2306058
    Multi-focus image fusion is an effective image fusion technology that aims to combine source images of the same scene taken at different focal planes into a single well-fused result. The fused image is in focus across all focal planes and therefore contains richer scene information. The development of deep learning has driven great progress in image fusion, and the powerful feature extraction and reconstruction ability of neural networks makes the fusion results promising. In recent years, more and more multi-focus image fusion methods based on deep learning have been proposed, using models such as the convolutional neural network (CNN), generative adversarial network (GAN) and autoencoder. To provide an effective reference for relevant researchers and technicians, this paper first introduces the concept of multi-focus image fusion and some evaluation indicators. Then, it analyzes more than ten advanced deep learning-based multi-focus image fusion methods from recent years, discusses the characteristics and innovations of each, and summarizes their advantages and disadvantages. In addition, it reviews the application of multi-focus image fusion technology in various scenarios, including photographic visualization, medical diagnosis, remote sensing detection and other fields. Finally, it outlines some challenges currently faced in multi-focus image fusion and looks ahead to possible future research trends.
    Abstract533
    PDF418
    Survey of AI Painting
    ZHANG Zeyu, WANG Tiejun, GUO Xiaoran, LONG Zhilei, XU Kui
    Journal of Frontiers of Computer Science and Technology    2024, 18 (6): 1404-1420.   DOI: 10.3778/j.issn.1673-9418.2401075
    AI painting, as a popular research direction in the field of computer vision, is expanding its application boundaries in the fields of art creation, film and media, industrial design, and art education through natural language processing, graphic pre-training models, and diffusion models. Two types of AI painting, namely, image-to-image and text-to-image, are taken as the main lines, and the representative models and their key technologies and methods are analyzed in depth. For the image-to-image, the development lineage, generation principle, and advantages and disadvantages of each model are explored from two types of models based on AE and GAN, and their effects on the public dataset are summarized. For the text-to-image, the structural differences of the three types of models based on diffusion model and other models, as well as the generation effects of various types of models on three datasets are summarized. It is pointed out that the text-to-image utilizing the diffusion model has become a hot topic nowadays, which predicts the diversified development of image generation in the future. And the current mainstream AI painting platforms are compared and summarized from the perspectives of usage and generation speed. Finally, on the basis of summarizing the problems and controversies faced by AI painting at the technical and social levels, future trends such as the complementary development of AI painting and human artists, the increased interactivity of the painting process, and the emergence of new professions and industries are envisioned.
    Abstract508
    PDF404
    Survey on Natural Scene Text Recognition Methods of Deep Learning
    ZENG Fanzhi, FENG Wenjie, ZHOU Yan
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1160-1181.   DOI: 10.3778/j.issn.1673-9418.2306024
    Natural scene text recognition holds significant value in both academic research and practical applications, making it one of the research hotspots in the field of computer vision. However, the recognition process faces challenges such as diverse text styles and complex background environments, leading to unsatisfactory efficiency and accuracy. Traditional text recognition methods based on manually designed features have limited representation capabilities, which are insufficient for effectively handling complex tasks in natural scene text recognition. In recent years, significant progress has been made in natural scene text recognition by adopting deep learning methods. This paper systematically reviews the recent research work in this area. Firstly, the natural scene text recognition methods are categorized into segmentation-based and non-segmentation-based approaches based on character segmentation required or not. The non-segmentation-based methods are further subdivided according to their technical implementation characteristics, and the working principles of the most representative methods in each category are described. Next, commonly used datasets and evaluation metrics are introduced, and the performance of various methods is compared on these datasets. The advantages and limitations of different approaches are discussed from multiple perspectives. Finally, the shortcomings and challenges are given, and the future development trends are also put forward.
    Abstract383
    PDF386
    Review of Self-supervised Learning Methods in Field of ECG
    HAN Han, HUANG Xunhua, CHANG Huihui, FAN Haoyi, CHEN Peng, CHEN Jijia
    Journal of Frontiers of Computer Science and Technology    2024, 18 (7): 1683-1704.   DOI: 10.3778/j.issn.1673-9418.2310043
    Deep learning has been widely applied in the field of electrocardiogram (ECG) signal analysis due to its powerful data representation capability. However, supervised methods require a large amount of labeled data, and ECG data annotation is typically time-consuming and costly. Additionally, supervised methods are limited by the finite data types in the training set, resulting in limited generalization performance. Therefore, how to leverage massive unlabeled ECG signals for data mining and universal feature representation has become an urgent problem to be addressed. Self-supervised learning (SSL) is an effective approach to address the issue of missing annotated ECG data and improve the transfer ability of the model by learning generalized features from unlabeled data using pre-defined proxy tasks. However, existing surveys on self-supervised learning mostly focus on the domains of images or temporal signals, and there is a relative lack of comprehensive reviews on self-supervised learning in the ECG domain. To fill this gap, this paper provides a comprehensive review of advanced self-supervised learning methods used in the field of ECG. Firstly, a systematic summary and classification of self-supervised learning methods for ECG are presented, starting from two learning paradigms: contrastive and predictive. The basic principles of different categories of methods are elaborated, and the characteristics of each method are analyzed in detail, highlighting the advantages and limitations of each approach. Subsequently, a summary is provided for the commonly used datasets and application scenarios in ECG self-supervised learning, along with a review of data augmentation methods frequently applied in the ECG domain, offering a systematic reference for subsequent research. Finally, an in-depth discussion is presented on the current challenges of self-supervised learning within the ECG field, and future directions for the development of ECG self-supervised learning are explored, providing guidance for subsequent research in the field.
    Abstract268
    PDF382
    Review of Research on Rolling Bearing Health Intelligent Monitoring and Fault Diagnosis Mechanism
    WANG Jing, XU Zhiwei, LIU Wenjing, WANG Yongsheng, LIU Limin
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 878-898.   DOI: 10.3778/j.issn.1673-9418.2307005
    Bearings are among the most critical and failure-prone parts of the mechanical systems of industrial equipment and are subjected to high loads for long periods. When they fail or wear irreversibly, they may cause accidents or even huge economic losses. Effective health monitoring and fault diagnosis are therefore of great significance for ensuring the safe and stable operation of industrial equipment. To further promote the development of bearing health monitoring and fault diagnosis technology, existing models and methods are analyzed, summarized, classified and compared. Starting from the distribution of the vibration signal data used, firstly, methods for uniformly distributed data are reviewed; the current research status is classified, analyzed and summarized mainly along signal-analysis-based and data-driven lines, and the shortcomings of fault detection methods in this setting are outlined. Secondly, considering the problem of uneven data acquisition under actual working conditions, detection methods for such cases are summarized; the processing techniques used in existing research are classified, according to their focus, into data processing methods, feature extraction methods and model improvement methods, and the remaining problems are analyzed and summarized. Finally, the challenges and future development directions of bearing fault detection in industrial equipment are summarized and discussed.
    Abstract750
    PDF367
    Survey of Research on SMOTE Type Algorithms
    WANG Xiaoxia, LI Leixiao, LIN Hao
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1135-1159.   DOI: 10.3778/j.issn.1673-9418.2309079
    Synthetic minority oversampling technique (SMOTE) has become one of the mainstream methods for dealing with unbalanced data because it handles minority samples effectively. Many improved SMOTE algorithms have been proposed, but very little existing research considers the popular algorithmic-level improvement methods. This paper therefore provides a more comprehensive analysis of existing SMOTE-type algorithms. Firstly, the basic principles of the SMOTE method are elaborated in detail. Then, SMOTE-type algorithms are systematically analyzed at two levels, the data level and the algorithmic level, and new ideas for hybrid improvements combining both levels are introduced. Data-level improvement balances the data distribution by deleting or adding data through different operations during preprocessing; algorithmic-level improvement does not change the data distribution, and mainly strengthens the focus on minority samples by modifying or creating algorithms. A comparison of the two kinds of methods shows that data-level methods are less restricted in their application, while algorithmic-level improvements generally have higher robustness. To provide more comprehensive basic research material on SMOTE-type algorithms, this paper finally lists the commonly used datasets and evaluation metrics, and gives ideas for future research to better cope with the unbalanced data problem.
    Abstract: 512
    PDF: 365
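    The basic SMOTE principle mentioned in the abstract above (a synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours) can be sketched as follows. This is a minimal illustration, not the survey's code; the function name `smote_sample` and the parameters `k`, `n_new` and `seed` are assumptions chosen for this example.

```python
import numpy as np

def smote_sample(minority, k=5, n_new=10, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances between minority samples.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    synthetic = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        base = rng.integers(n)                  # random minority sample
        nb = neighbours[base, rng.integers(k)]  # one of its k neighbours
        gap = rng.random()                      # interpolation factor in [0, 1)
        synthetic[i] = minority[base] + gap * (minority[nb] - minority[base])
    return synthetic
```

    Because every synthetic point lies between two existing minority samples, the new data stays inside the region already covered by the minority class; many of the data-level variants discussed in the survey change how `base` and `nb` are chosen.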
    Review of Research on Multi-agent Reinforcement Learning Algorithms
    LI Mingyang, XU Ke’er, SONG Zhiqiang, XIA Qingfeng, ZHOU Peng
    Journal of Frontiers of Computer Science and Technology    2024, 18 (8): 1979-1997.   DOI: 10.3778/j.issn.1673-9418.2401020
    In recent years, multi-agent reinforcement learning algorithms have been widely used in the field of artificial intelligence. This paper systematically analyzes multi-agent reinforcement learning algorithms, examines their application and progress in multi-agent systems, and explores the relevant research results in depth. Firstly, it introduces the research background and development history of multi-agent reinforcement learning and summarizes the existing relevant research results. Secondly, it briefly reviews the application of traditional reinforcement learning algorithms under different tasks. Then, it highlights the classification of multi-agent reinforcement learning algorithms and their application in multi-agent systems according to the three main types of tasks (path planning, pursuit and escape game, task allocation), together with the associated challenges and solutions. Finally, it explores the existing algorithm training environments in the multi-agent field, summarizes the improvements that deep learning brings to multi-agent reinforcement learning algorithms, identifies challenges, and looks forward to future research directions in this field.
    Abstract: 412
    PDF: 355
    Multi-strategy Improved Dung Beetle Optimizer and Its Application
    GUO Qin, ZHENG Qiaoxian
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 930-946.   DOI: 10.3778/j.issn.1673-9418.2308020
    Dung beetle optimizer (DBO) is an intelligent optimization algorithm proposed in recent years. Like other optimization algorithms, DBO has disadvantages such as low convergence accuracy and a tendency to fall into local optima. A multi-strategy improved dung beetle optimizer (MIDBO) is proposed. Firstly, it improves the acceptance of local and global optimal solutions by brood balls and thieves, so that the beetles can change dynamically according to their own searching ability, which not only improves the population quality but also maintains the good searching ability of individuals with high fitness. Secondly, the follower position updating mechanism of the sparrow search algorithm is integrated to perturb the algorithm, and a greedy strategy is used to update the location, which improves the convergence accuracy of the algorithm. Finally, when the algorithm stagnates, a Cauchy-Gaussian variation strategy is introduced to improve the ability of the algorithm to jump out of local optimal solutions. Simulation experiments on 20 benchmark test functions and the CEC2019 test functions verify the effectiveness of the three improvement strategies. Convergence analysis of the optimization results of the improved algorithm and the comparison algorithms, together with the Wilcoxon rank sum test, shows that MIDBO has good optimization performance and robustness. The validity and reliability of MIDBO in solving practical engineering problems are further verified by applying it to an automobile collision optimization problem.
    Abstract: 461
    PDF: 350
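    The abstract above combines a Cauchy-Gaussian variation (mutation) with greedy acceptance but does not give the formula. The sketch below uses one form that is common in the metaheuristics literature: the current best position is perturbed by a mix of Cauchy and Gaussian noise whose weights shift over the iterations, and the candidate is kept only if it improves the fitness. The weighting schedule, the function names and the `sphere` test function are assumptions, not MIDBO's actual definition.

```python
import numpy as np

def sphere(x):
    """Toy objective (assumed for illustration): minimum 0 at the origin."""
    return float(np.sum(x ** 2))

def cauchy_gaussian_step(best, fitness, t, t_max, rng):
    """One mutation step with greedy acceptance.

    Early iterations favour heavy-tailed Cauchy noise (large jumps);
    later iterations favour Gaussian noise (local refinement).
    """
    w_gauss = (t / t_max) ** 2
    w_cauchy = 1.0 - w_gauss
    noise = (w_cauchy * rng.standard_cauchy(best.shape)
             + w_gauss * rng.standard_normal(best.shape))
    candidate = best * (1.0 + noise)
    # Greedy strategy: keep the candidate only if it is strictly better.
    return candidate if fitness(candidate) < fitness(best) else best

rng = np.random.default_rng(0)
x = np.array([1.0, -2.0, 3.0])
for t in range(1, 51):
    x = cauchy_gaussian_step(x, sphere, t, 50, rng)
```

    With greedy acceptance the fitness of the retained solution can never get worse, which is why such a step is safe to apply whenever the search stagnates.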
    Advances of Adversarial Attacks and Robustness Evaluation for Graph Neural Networks
    WU Tao, CAO Xinwen, XIAN Xingping, YUAN Lin, ZHANG Shu, CUI Canyixing, TIAN Kan
    Journal of Frontiers of Computer Science and Technology    2024, 18 (8): 1935-1959.   DOI: 10.3778/j.issn.1673-9418.2311117
    In recent years, graph neural networks (GNNs) have gradually become an important research direction in artificial intelligence. However, the adversarial vulnerability of GNNs poses severe challenges to their practical applications. To gain a comprehensive understanding of adversarial attacks and robustness evaluation on GNNs, related state-of-the-art advancements are reviewed and discussed. Firstly, this paper introduces the research background of adversarial attacks on GNNs, provides a formal definition of these attacks, and elucidates the basic concepts and framework for research on adversarial attacks and robustness evaluation in GNNs. Following this, this paper gives an overview of the specific methods proposed in the field of adversarial attacks on GNNs, and details the foremost methods while categorizing them based on the type of adversarial attack and range of attack targets. Their operating mechanisms, principles, and pros and cons are also analyzed. Additionally, considering that model robustness evaluation depends on the adversarial attack method and the degree of adversarial perturbation, this paper focuses on direct evaluation indicators. To aid in designing and evaluating adversarial attack methods and robust GNN models, this paper compares representative methods in terms of implementation ease, accuracy, and execution time. Finally, ongoing challenges and future research areas are discussed. Current research on GNNs' adversarial robustness is experiment-oriented and lacks a guiding theoretical framework, so further systematic theoretical research is needed to ensure the trustworthiness of GNN-based systems.
    Abstract: 485
    PDF: 339
    Survey of Neural Machine Translation Based on Knowledge Distillation
    MA Chang, TIAN Yonghong, ZHENG Xiaoli, SUN Kangkang
    Journal of Frontiers of Computer Science and Technology    2024, 18 (7): 1725-1747.   DOI: 10.3778/j.issn.1673-9418.2311027
    Machine translation (MT) is the process of using a computer to convert one language into another language with the same semantics. With the introduction of neural networks, neural machine translation (NMT), as a powerful machine translation technology, has achieved remarkable success in the field of automatic translation and artificial intelligence. Because traditional neural translation models have redundant parameters and structures, knowledge distillation (KD) has been proposed to compress the model and accelerate the inference of neural machine translation, which has attracted wide attention in the fields of machine learning and natural language processing. This paper systematically investigates and compares various translation models that introduce knowledge distillation from the perspectives of evaluation indicators and technical innovations. Firstly, this paper briefly reviews the development process, mainstream frameworks and evaluation indicators of machine translation. Secondly, knowledge distillation technology is introduced in detail. Thirdly, the development direction of neural machine translation based on knowledge distillation is detailed from four perspectives: multi-language models, multi-modal translation, low-resource languages, and autoregressive and non-autoregressive models, and the research status of other fields is briefly introduced. Finally, the problems of existing large language models, zero-resource languages and multi-modal machine translation are analyzed, and the development trend of neural machine translation is discussed.
    Abstract: 226
    PDF: 328
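    To make the idea of knowledge distillation in the abstract above concrete, the sketch below shows the generic soft-target loss: a small student model is trained to match the temperature-softened output distribution of a larger teacher, in addition to the usual cross-entropy on the reference tokens. This is the textbook word-level formulation, not a method taken from the survey; `temperature`, `alpha` and the function names are assumptions for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """alpha * cross-entropy(student, labels)
    + (1 - alpha) * T^2 * KL(teacher_T || student_T), averaged over positions."""
    eps = 1e-12
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher + eps)
                             - np.log(p_student_soft + eps)), axis=-1)

    p_student = softmax(student_logits)
    ce = -np.log(p_student[np.arange(len(labels)), labels] + eps)
    return float(np.mean(alpha * ce + (1.0 - alpha) * temperature ** 2 * kl))
```

    The `temperature ** 2` factor keeps the gradient scale of the soft-target term comparable to the hard-label term when the temperature is changed.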
    Knowledge Graph Completion Algorithm with Multi-view Contrastive Learning
    QIAO Zifeng, QIN Hongchao, HU Jingjing, LI Ronghua, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 1001-1009.   DOI: 10.3778/j.issn.1673-9418.2301038
    Knowledge graph completion is the process of inferring new triples based on the existing entities and relations in a knowledge graph. Existing methods usually use an encoder-decoder framework. The encoder uses a graph convolutional neural network to obtain the embeddings of entities and relations. The decoder calculates the score of each tail entity according to these embeddings, and the tail entity with the highest score is the inference result. However, the decoder infers triples independently, without considering graph information. Therefore, this paper proposes a graph completion algorithm based on contrastive learning. A multi-view contrastive learning framework is added to the model to constrain the embedded information at the graph level. The comparison of multiple views in the model constructs different distribution spaces for relations. Different distributions of relations fit each other, which is more suitable for completion tasks. Contrastive learning constrains the embedding vectors of entities and subgraphs and enhances performance on the task. Experiments are carried out on two datasets. The results show that MRR is improved by 12.6% over A2N and 0.8% over InteractE on the FB15k-237 dataset, and by 7.3% over A2N and 4.3% over InteractE on the WN18RR dataset. Experimental results demonstrate that this model outperforms other completion methods.
    Abstract: 330
    PDF: 326
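    The decoder step described in the abstract above (score every candidate tail entity from the embeddings and return the highest-scoring one) can be sketched as below. A DistMult-style trilinear score is used here purely as a simple stand-in; the abstract does not say which scoring function the proposed model uses, and the embeddings in the usage lines are hypothetical values.

```python
import numpy as np

def rank_tails(entity_emb, relation_emb, head, relation):
    """Score all entities as candidate tails for the query (head, relation, ?).

    DistMult-style score (an assumption): score(h, r, t) = sum_i h_i * r_i * t_i.
    Returns entity indices sorted from best to worst, plus the raw scores.
    """
    query = entity_emb[head] * relation_emb[relation]  # element-wise h * r
    scores = entity_emb @ query                        # one score per candidate tail
    return np.argsort(-scores), scores

# Hypothetical 2-dimensional embeddings for three entities and one relation.
entity_emb = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
relation_emb = np.array([[1.0, 0.0]])
order, scores = rank_tails(entity_emb, relation_emb, head=0, relation=0)
predicted_tail = int(order[0])
```

    Each query is scored in isolation here, which is exactly the limitation the abstract points out: nothing in this step looks at the surrounding graph structure.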
    Survey of AIGC Large Model Evaluation: Enabling Technologies, Vulnerabilities and Mitigation
    XU Zhiwei, LI Hailong, LI Bo, LI Tao, WANG Jiatai, XIE Xueshuo, DONG Zehui
    Journal of Frontiers of Computer Science and Technology    2024, 18 (9): 2293-2325.   DOI: 10.3778/j.issn.1673-9418.2402023
    Artificial intelligence generated content (AIGC) models have attracted widespread attention and application worldwide due to their excellent content generation capabilities. However, the rapid development of AIGC large models also brings a series of hidden dangers, such as concerns about the interpretability, fairness, security, and privacy preservation of model-generated content. To reduce unknown risks and their harms, it is becoming more and more important to carry out a comprehensive measurement and evaluation of AIGC large models. Academics have initiated AIGC large model evaluation studies aiming to effectively address the related challenges and avoid potential risks. This paper summarizes and analyzes these studies. Firstly, an overview of the model evaluation process is provided, covering evaluation preparation and the corresponding measurement indicators, and existing measurement benchmarks are systematically organized. Secondly, the representative applications of AIGC large models in finance, politics and healthcare and their problems are discussed. Then, the measurement methods are studied in depth from different perspectives, such as interpretability, fairness, robustness, security and privacy; the new issues that need attention in AIGC large model evaluation are analyzed, and ways to cope with the new challenges of large model evaluation are proposed. Finally, the future challenges of AIGC large model evaluation are discussed, and its future development direction is envisioned.
    Abstract: 335
    PDF: 324
    Survey on Solving Cold Start Problem in Recommendation Systems
    MAO Qian, XIE Weicheng, QIAO Yitian, HUANG Xiaolong, DONG Gang
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1197-1210.   DOI: 10.3778/j.issn.1673-9418.2308044
    Recommender systems provide important functions in areas such as dealing with data overload, providing personalized consulting services, and assisting clients in investment decisions. However, the cold start problem in recommender systems has long been in urgent need of solution and optimization. This paper therefore classifies the traditional and cutting-edge methods for solving the cold start problem, and expounds the research progress and notable methods of recent years. Firstly, three traditional solutions to the cold start problem are summarized: recommendation based on content filtering, recommendation based on collaborative filtering, and hybrid recommendation. Secondly, the current cutting-edge recommendation algorithms for solving the cold start problem are summarized and classified into data-driven strategies and method-driven strategies. The method-driven strategies are divided into algorithms based on meta-learning, algorithms based on context information and session strategy, algorithms based on random walk, algorithms based on heterogeneous graph information and attribute graphs, and algorithms based on adversarial mechanisms. According to the type of cold start problem, the algorithms are divided into two categories: new users and new items. Then, according to the particularity of the recommendation field, the cold start problem of recommendation in the multimedia information field and the online e-commerce platform field is expounded. Finally, possible future research directions for solving the cold start problem are summarized.
    Abstract: 468
    PDF: 312
    MFFNet: Image Semantic Segmentation Network of Multi-level Feature Fusion
    WANG Yan, NAN Peiqi
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 707-717.   DOI: 10.3778/j.issn.1673-9418.2209110
    In the task of image semantic segmentation, most methods do not make full use of features of different scales and levels but upsample directly, which causes some effective information to be discarded as redundant and thus reduces the accuracy and sensitivity of segmentation for some small categories and similar categories. Therefore, a multi-level feature fusion network (MFFNet) is proposed. MFFNet uses an encoder-decoder structure. During the encoding stage, the context information and spatial detail information are obtained through the context information extraction path and the spatial information extraction path, respectively, to enhance the inter-pixel correlation and boundary accuracy. During the decoding stage, a multi-level feature fusion path is designed, and the context information is fused by the mixed bilateral fusion module. Deep information and spatial information are fused by the high-low feature fusion module. The global channel-attention fusion module is used to obtain the connections between different channels and realize global fusion of information at different scales. The MIoU (mean intersection over union) of MFFNet on the PASCAL VOC 2012 and Cityscapes validation sets is 80.70% and 76.33%, respectively, achieving better segmentation results.
    Abstract: 490
    PDF: 303
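    The MIoU metric reported in the abstract above is the per-class intersection over union averaged over classes. A minimal sketch of the standard confusion-matrix computation follows; the function name is an assumption, and the handling of ignored "void" pixels used by benchmark evaluation scripts is deliberately left out.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection over union from flat integer label arrays."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # conf[i, j] = number of pixels with true class i predicted as class j.
    conf = np.bincount(target * num_classes + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    valid = union > 0  # skip classes absent from both prediction and target
    return float(np.mean(intersection[valid] / union[valid]))

# Four pixels, two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

    Because every class contributes equally to the mean, errors on small categories lower MIoU as much as errors on large ones, which is why the metric is sensitive to the small-category problem the abstract describes.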
    Survey on Popularity Based Recommendation
    LEI Qinlan, TIAN Xuan
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1109-1134.   DOI: 10.3778/j.issn.1673-9418.2309016
    Currently, popularity-based recommendation has become a research hotspot. The use of popularity considerably improves recommendation performance, while the Matthew effect caused by popularity bias has also garnered extensive attention among researchers. Some researchers consider combining both aspects to produce hybrid popularity-based recommendation. Adopting the concept of popularity, a unified representation of popularity, popularity bias, and hybrid popularity is provided in this paper. Firstly, the background of popularity in the field of recommendation is introduced. Then, based on different perspectives, a comprehensive survey of popularity-enhanced recommendation methods, popularity-debiasing recommendation methods, and hybrid popularity-based recommendation methods is provided. Each type of method is further subdivided by specific modeling subtasks or concrete strategies. The representative models of each method are introduced and analyzed, and their advantages and limitations are evaluated. The mechanisms and applicable scenarios of each method are also summarized in detail. Furthermore, the commonly used datasets, performance evaluation indicators and baselines are introduced, and a comparative analysis of the performance of representative methods is provided. Finally, some opinions on the trends of popularity-based recommendation are presented, and the technical difficulties and hotspots for future development are analyzed and predicted from multiple perspectives.
    Abstract: 230
    PDF: 301
    Survey of Entity Relationship Extraction Methods in Knowledge Graphs
    ZHANG Xishuo, LIU Lin, WANG Hailong, SU Guibin, LIU Jing
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 574-596.   DOI: 10.3778/j.issn.1673-9418.2305019
    Entity-relationship extraction has gained more and more attention from researchers as a basis for knowledge graph construction. Entity-relationship extraction can automatically and accurately obtain knowledge from a large amount of data, and represent and store it in a structured form. Therefore, the correctness of entity-relationship extraction directly affects the accuracy of knowledge graph construction and the effect of subsequent knowledge graph applications. However, for different research hotspots such as complex structure, open domain, multi-language, multi-modal, small-sample data, and joint extraction of entity-relationships, the existing entity-relationship extraction methods still have some limitations. Based on these current research hotspots, this paper categorizes entity-relationship extraction into six aspects: complex structure, open domain, multilingual, multimodal, small-sample data, and joint entity-relationship extraction. Each aspect is further categorized according to its specific problems, and some solutions are listed. The current problems and solutions of each category are systematically sorted out, the research results of each category are summarized, and the advantages and disadvantages of each method are analyzed in detail both quantitatively and qualitatively. Finally, the problems to be solved in the current hot areas are summarized, and the future development trend of entity-relationship extraction methods in knowledge graphs is discussed.
    Abstract: 366
    PDF: 294
    Few-Shot Knowledge Graph Completion Based on Selective Attention
    LIN Sui, LU Chaohai, JIANG Wenchao, LIN Xiaoshan, ZHOU Weilin
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 646-658.   DOI: 10.3778/j.issn.1673-9418.2212076
    Most few-shot knowledge graph completion models have problems such as a low ability to learn relation representations and little attention to the relative location and interaction between the entities of a query pair when the relation between entities is complex or the neighborhood of a triple is sparse. A few-shot knowledge graph completion algorithm based on a selective attention mechanism and interaction awareness (SAIA) is proposed. Firstly, by introducing a selective attention mechanism in the process of aggregating neighbor information, the neighbor encoder pays more attention to important neighbors to reduce the adverse effects of noisy neighbors. Secondly, SAIA utilizes the information related to the task relation in the background knowledge graph to learn a more accurate relation embedding in the process of relation representation learning. Finally, in order to mine the interaction information and location information between entities in the knowledge graph, a common interaction rate (CIR) index of an entity pair is designed to measure the degree of association between entities within a 3-hop path. Then, SAIA combines the semantic information of the entity pair to predict new facts. Experimental results show that SAIA outperforms the state-of-the-art few-shot knowledge graph completion methods. Compared with the best results of the baseline models, the proposed method improves 5-shot link prediction performance by 0.038, 0.011, 0.028 and 0.052 on the NELL-one dataset and by 0.034, 0.037, 0.029 and 0.027 on the Wiki-one dataset in terms of MRR, Hits@10, Hits@5 and Hits@1, respectively, which verifies the effectiveness and feasibility of SAIA.
    Abstract: 226
    PDF: 291
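    The link prediction metrics named in the abstract above, MRR and Hits@k, are both computed from the rank that the model assigns to the correct entity for each test query. A minimal sketch with hypothetical ranks (not taken from the paper):

```python
def mrr(ranks):
    """Mean reciprocal rank; ranks are 1-based positions of the correct entity."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of queries whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the correct tail entity for five test queries.
ranks = [1, 2, 4, 10, 20]
summary = {
    "MRR": mrr(ranks),
    "Hits@10": hits_at_k(ranks, 10),
    "Hits@5": hits_at_k(ranks, 5),
    "Hits@1": hits_at_k(ranks, 1),
}
```

    All four values lie in [0, 1], so an absolute gain such as 0.038 in MRR is on the same scale as the metric itself rather than a relative percentage.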
    Review of U-Net-Based Convolutional Neural Networks for Breast Medical Image Segmentation
    PU Qiumei, YIN Shuai, LI Zhengmao, ZHAO Lina
    Journal of Frontiers of Computer Science and Technology    2024, 18 (6): 1383-1403.   DOI: 10.3778/j.issn.1673-9418.2307069
    U-Net and its variants have showcased exceptional performance in the domain of breast medical image segmentation. By employing a fully convolutional network (FCN) structure for semantic segmentation, the symmetrical structure of U-Net offers remarkable flexibility and adaptability. It can be tailored to diverse image segmentation tasks and challenges by adjusting network depth and incorporating new modules, leaving a significant impact on subsequent network designs. This paper aims to delve into the application of U-shaped convolutional networks in breast medical image segmentation, categorizing and summarizing U-shaped convolutional networks used for this purpose in recent years. It outlines the widely used breast medical image datasets and evaluation metrics, discusses common data augmentation techniques, and provides a detailed introduction to the network structure of the U-Net model along with traditional segmentation methods for breast medical images. Furthermore, it summarizes the improvements made to the U-Net network structure for breast medical image segmentation, including modifications like residual structures, multi-scale features, dilation mechanisms, attention mechanisms, skip connection mechanisms, and integration with Transformers. Finally, it addresses the current challenges and problems encountered in breast medical image segmentation and offers insights into future research directions.
    Abstract: 261
    PDF: 281
    Research on Construction and Application of Knowledge Graph Based on Large Language Model
    ZHANG Caike, LI Xiaolong, ZHENG Sheng, CAI Jiajun, YE Xiaozhou, LUO Jing
    Journal of Frontiers of Computer Science and Technology    2024, 18 (10): 2656-2667.   DOI: 10.3778/j.issn.1673-9418.2406013
    Massive amounts of operational and maintenance (O&M) data from nuclear power distributed control system (DCS) contain rich operational experience and expert knowledge. Effectively extracting DCS alarm response information and forming knowledge service is a current hotspot and frontier research area in rapid DCS response. Due to the lack of clear structure and standards in multi-source heterogeneous data of nuclear power DCS, previous knowledge extraction primarily relied on manual annotation and deep learning methods, which require extensive domain knowledge and information processing capabilities and are constrained by the heavy workload of data annotation. Therefore, this study proposes a knowledge extraction method using large language model (LLM) with a step-by-step prompting strategy, constructing a DCS O&M knowledge graph (KG). Based on large language model technology and secondary intent recognition methods, intelligent question and answer (Q&A) and other knowledge services are developed utilizing the knowledge graph. Using O&M data from a nuclear power plant’s DCS as a case study, the research focuses on knowledge extraction, knowledge graph construction, and intelligent Q&A. The results show that the model achieves an overall precision (P) of 91.24%, recall (R) of 85.85%, and F1-score of 88.43%. The proposed method can comprehensively capture key entities and attribute information from multi-source heterogeneous DCS O&M data, guiding domain knowledge Q&A, assisting O&M personnel in timely responding to DCS alarm anomalies, analyzing fault causes and response strategies, and providing guidance for DCS O&M training and maintenance in power plants.
    Abstract: 250
    PDF: 278
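    The precision, recall and F1-score quoted in the abstract above follow the usual definitions for extraction tasks. The sketch below computes them from counts of true positives, false positives and false negatives; the counts in the usage line are hypothetical, and the abstract does not state how its overall figures are averaged across entity types.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from extraction counts.

    tp: correctly extracted items, fp: spurious extractions,
    fn: gold items that were missed.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from_pr(precision, recall)
    return precision, recall, f1

def f1_from_pr(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 8 correct, 2 spurious, 2 missed.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
```

    Applying `f1_from_pr` directly to the reported overall precision and recall gives roughly 88.46%, close to the reported 88.43%; the small gap may come from averaging per-type scores, which is an assumption since the abstract does not specify it.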