Most Downloaded Articles


    Published in last 1 year
    Overview of Knowledge Graph Construction and Reasoning Enhanced by Large Language Models
    ZHANG Jing, HUANG Wenfeng, WU Chunjiang, TAN Hao
    Journal of Frontiers of Computer Science and Technology    2025, 19 (11): 2855-2872.   DOI: 10.3778/j.issn.1673-9418.2503034
    With the widespread application of knowledge graphs (KGs) in fields such as intelligent question answering and recommender systems, the technical bottlenecks in large-scale construction and efficient reasoning have become increasingly prominent. Traditional manual or semi-automated construction approaches are costly, while issues such as entity disambiguation and relation extraction accuracy continue to hinder the quality of the resulting graphs. Furthermore, knowledge sparsity and the complexity of reasoning rules limit the generalization capability of KG reasoning. Large language models (LLMs), with their powerful semantic understanding and contextual modeling capabilities, offer promising new avenues to address these challenges. However, current research in this area lacks a systematic review, and the applicability and performance boundaries of various methods remain unclear. To bridge this gap, this paper provides a comprehensive survey of LLM-enhanced knowledge graph construction and reasoning methods. Firstly, this paper introduces the foundational theories of knowledge graphs and large language models. The survey then focuses on four core tasks: knowledge extraction, automated construction, knowledge completion, and reasoning. For knowledge extraction, this paper compares zero-shot extraction methods based on LLMs with domain-adapted extraction through fine-tuning. In terms of automated construction, this paper reviews techniques for LLM-driven ontology generation and iterative graph updates. For knowledge completion, this paper summarizes methods involving pseudo-triple generation via LLMs, prompt-based context planning, and the integration of external retrieval mechanisms. Regarding reasoning tasks, this paper analyzes both static LLM-augmented reasoning and actively planned reasoning approaches. 
    This paper further presents typical application scenarios in domains such as healthcare and education, and compiles a list of general-purpose and domain-specific knowledge graph datasets in both English and Chinese that support research in this area. Finally, this paper highlights the current limitations of existing methods and proposes several future research directions.
    Abstract: 447
    PDF: 635
    Integrated Sensing, Communication and Computing: Key Technologies, Challenges, and Future Trends
    LIU Zhuang, WU Yuhe, CHEN Yuran, LIU Ruitong, DONG Yanning, ZHAO Jun
    Journal of Frontiers of Computer Science and Technology    2025, 19 (9): 2273-2301.   DOI: 10.3778/j.issn.1673-9418.2412035
    In the construction of a future highly integrated physical and digital world, the deep integration of communication, sensing, and computing has become a key technology for next-generation intelligent networks. Focusing on integrated sensing, communication and computing (ISCC) technology, this paper systematically analyzes its theoretical and practical value. Starting from technological evolution and emerging requirements, the paper clarifies the key role of ISCC in enhancing system intelligence, reducing latency, and optimizing resource utilization, particularly its necessity in meeting emerging business requirements such as immersive extended reality (XR), holographic communication, and autonomous driving. The paper explores in depth the core technical architecture of ISCC, including wireless sensing, multimodal sensing, mobile edge computing, and the deep fusion mechanisms of sensing and communication; reveals its innovative application scenarios in digital twin networks, computing power networks, and space-air-ground integrated networks; and demonstrates its advantages in high-precision sensing, efficient data processing, and real-time communication. The paper systematically examines the multi-dimensional challenges faced by ISCC technology in actual deployment, such as the complexity of system architecture design, optimization difficulties in air interface protocols, the dynamic nature of resource management and control, the severity of data security and privacy protection, and the complexity of multi-source interference management. The paper also provides a forward-looking perspective on future research directions, and emphasizes the importance of interdisciplinary theoretical innovation, standardization advancement, and systematic simulation validation.
    Abstract: 344
    PDF: 473
    Survey of Model Compression for Large Language Model
    GUO Jinyang, HE Changyi, YANG Ge, LIU Xianglong
    Journal of Frontiers of Computer Science and Technology    2026, 20 (1): 1-20.   DOI: 10.3778/j.issn.1673-9418.2504069
    Large language models (LLMs) have attracted considerable attention in recent years due to their strong cognitive capabilities and widespread applications in various fields. However, their tremendous demand for computation and memory makes it difficult to deploy them in resource-constrained scenarios. Model compression and acceleration techniques have thus emerged as critical approaches to reduce computational complexity and memory usage while maintaining model performance. This paper presents a comprehensive survey of recent advances in LLM compression and acceleration methods, aiming to grasp the current development status and future trends of the entire field, promote the advancement of LLM compression and acceleration technologies, and facilitate their application and implementation in both industry and academia. It begins by outlining the challenges LLMs face in terms of computational and storage overhead. Then, it categorizes and reviews the main technical approaches, including model pruning, quantization, knowledge distillation, and low-rank decomposition, highlighting their core principles, representative methods, and cutting-edge developments. In addition, the paper provides a detailed discussion on evaluation metrics such as inference latency, accuracy retention, and deployment cost, establishing a multidimensional evaluation framework. Finally, it explores the promising future directions of LLM compression methods, aiming to guide future research and industrial deployment of compressed LLMs.
    Abstract: 337
    PDF: 460
    Automatic Prompt Engineering Technology for Large Language Models: a Survey
    BA Zezhi, ZHANG Hui, XIE Zhenghan, ZUO Xiaodong, HOU Jianwei
    Journal of Frontiers of Computer Science and Technology    2025, 19 (12): 3131-3152.   DOI: 10.3778/j.issn.1673-9418.2502027
    Prompt engineering (PE) based on prompt learning is crucial for improving the technical accessibility of LLMs and accelerating their adoption, diffusion, and application development. Compared with traditional PE, which heavily relies on the domain knowledge and experience of prompt designers and is less adaptable to tasks with large prompt spaces, automatic prompt engineering (APE) can generate or optimize prompts in an automatic or semi-automatic way. This enables the exploration of large-scale prompt combinations and enhances the stability of prompt generation through automated optimization techniques. However, there is currently a lack of systematic reviews on APE, which hinders subsequent researchers from quickly grasping the state of the field. Therefore, this paper follows the latest research developments, systematically reviews the implementation forms of automatic prompt engineering, and proposes future research directions. Based on the trade-offs between logical reasoning and performance orientation in the implementation of APE, this paper categorizes it into four main types: APE based on chain-of-thought, APE based on machine learning models, APE based on evolutionary algorithms, and plug-and-play auto-prompt systems. Subsequently, this paper conducts a comprehensive evaluation of APE techniques, constructing a theoretical explanatory framework for their working principles and assessing the applicability and limitations of each implementation form. Finally, this paper looks ahead to the development trends of APE in multimodal large models, advanced reasoning models, and AI agents.
    Abstract: 468
    PDF: 455
    Research Survey on Emotion Recognition Large Language Models
    WU Ao, WANG Hailong, LIU Lin, SHI Wentao
    Journal of Frontiers of Computer Science and Technology    2026, 20 (3): 625-649.   DOI: 10.3778/j.issn.1673-9418.2509014
    The rapid development of large language models offers a new paradigm for emotion recognition, demonstrating significant advantages over conventional methods in complex scenario understanding, zero/few-shot generalization, and multimodal collaborative representation. This paper provides a systematic review of large-scale emotion recognition models. It first summarizes advances in unimodal emotion recognition across text, speech, visual, and physiological signals. The study then focuses on multimodal approaches, categorizing them into unified encoder architectures, hierarchical fusion architectures, and generative model architectures based on fusion strategies, and compares their design principles, key techniques, and applicable scenarios. Further analysis highlights improvements in recognition accuracy, generalization capability, and multimodal fusion, underscoring the important role and potential of large language models in this field. Finally, this paper identifies challenges such as insufficient temporal emotion modeling and cross-cultural cognitive biases, and suggests future directions including enhancing long-range temporal modeling and establishing multimodal cultural emotion benchmarks.
    Abstract: 223
    PDF: 415
    Survey of Entity Relation Extraction Based on Large Language Models
    XIA Jianglan, LI Yanling, GE Fengpei
    Journal of Frontiers of Computer Science and Technology    2025, 19 (7): 1681-1698.   DOI: 10.3778/j.issn.1673-9418.2409086
    Entity relation extraction aims to identify entity pairs and their relationships from unstructured text, serving as the foundation for many downstream tasks in natural language processing. With the development of big data and deep learning technologies, significant progress has been made in entity relation extraction research. In recent years, applying large language models to this task has become a new research trend. Large language models, with their ability to automatically extract features and strong generalization capabilities, can significantly enhance the performance of the task. This paper provides a comprehensive review of entity relation extraction methods, categorizing them into two main types based on the evolution of techniques and models. Firstly, the definitions of named entity recognition and relation extraction tasks are introduced. Next, a systematic review of the development of entity relation extraction methods is presented, with an in-depth analysis of the advantages and disadvantages of the corresponding models. On this basis, this paper focuses on the unique advantages of large language model-based methods in addressing entity relation extraction tasks. Furthermore, the characteristics of current mainstream datasets are summarized, along with common evaluation metrics for entity relation extraction, such as precision, recall, and F1 score. Finally, the challenges in current research are analyzed, and future research directions are discussed.
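    The evaluation metrics named in this abstract (precision, recall, and F1 score) can be illustrated with a minimal sketch. It assumes that extracted relations are compared with gold annotations as exact-match (head, relation, tail) triples; the function name and the sample triples are hypothetical and are not taken from the surveyed paper.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall, and F1 over sets of triples.

    Assumption: a prediction counts as correct only if the whole
    (head, relation, tail) triple appears in the gold set.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sample data: 4 gold triples, 3 predictions, 2 of them correct.
gold_triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "field", "physics"),
    ("Marie Curie", "award", "Nobel Prize"),
    ("Pierre Curie", "spouse", "Marie Curie"),
]
predicted_triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "award", "Nobel Prize"),
    ("Marie Curie", "born_in", "Paris"),  # wrong tail entity
]

p, r, f1 = precision_recall_f1(predicted_triples, gold_triples)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

    Exact matching is only one possible convention; benchmarks may also use relaxed or boundary-only matching, which would change the counts.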
    Abstract: 408
    PDF: 411
    Survey of Deep Learning-Based UAV Single Object Tracking
    CHEN Long, SHI Lei, LI Zhihui, DING Meng, PAN Yilun
    Journal of Frontiers of Computer Science and Technology    2026, 20 (1): 40-65.   DOI: 10.3778/j.issn.1673-9418.2506046
    Deep learning-based UAV (unmanned aerial vehicle) single object tracking has emerged as a critical research area in computer vision, aiming to accurately track designated targets in aerial video sequences. UAV tracking presents unique challenges, including drastic perspective changes, variable target scales, and computational constraints. This survey systematically categorizes recent methods into three technical approaches: traditional Siamese networks, CNN-Transformer hybrid architectures, and full Transformer methods, focusing on advances from 2022 to 2025. This paper proposes sub-classification frameworks for two of these approaches: module replacement, feature post-fusion, and collaborative modeling for CNN-Transformer hybrid architectures; and static computation, hybrid mechanisms, and dynamic computation for single-stream Transformer methods. These frameworks reveal the evolution from performance-oriented optimization to optimization that balances efficiency and performance. Comprehensive evaluations on the UAV123, DTB70, UAVDT, and VisDrone2018 datasets validate the advantages and limitations of different approaches. This paper also identifies key challenges and offers future directions and engineering deployment guidance.
    Abstract: 268
    PDF: 410
    Overview of Research on Knowledge Graph Completion
    Anggeluma, WANG Siriguleng, SI Qintu
    Journal of Frontiers of Computer Science and Technology    2025, 19 (9): 2302-2318.   DOI: 10.3778/j.issn.1673-9418.2502028
    Knowledge graphs have been widely applied in various fields and have significantly advanced the development of artificial intelligence tasks. However, knowledge graphs still face the challenge of incompleteness, which severely limits their effectiveness in downstream applications. Knowledge graph completion tasks aim to predict missing links in the graph to address this issue of incompleteness. This paper provides a systematic review of the research background of knowledge graphs and their completion techniques, highlighting their pivotal role in artificial intelligence and natural language processing. Based on different information sources, existing completion methods are categorized into three types: structure-based, text-based, and hybrid methods. It introduces representative results for each category, compares their advantages and disadvantages, and summarizes their applicable scenarios, revealing the current technological development and evolutionary trends. This paper also explores advances in multilingual knowledge graph completion, focusing on key techniques such as cross-lingual entity alignment, and emphasizes the importance of cross-lingual knowledge sharing and unified modeling. Finally, it analyzes the challenges of knowledge graph completion in areas such as knowledge fusion and mining, and outlines future research directions.
    Abstract: 267
    PDF: 400
    Research Review of Deep Learning in Skin Lesions Image Segmentation
    MENG Xiangfu, LI Jiaxun, YU Chunlin, LU Yunxuan
    Journal of Frontiers of Computer Science and Technology    2026, 20 (1): 21-39.   DOI: 10.3778/j.issn.1673-9418.2504019
    Skin lesions exhibit a wide variety of types and complex clinical manifestations, ranging from benign conditions to malignant melanomas. Early detection and accurate segmentation of these lesions are critical for the diagnosis and treatment of skin cancer, particularly in the early identification and localization of high-risk lesions such as malignant melanoma, which can significantly improve patient survival rates. In recent years, deep learning techniques have achieved remarkable progress in skin lesion image segmentation, greatly enhancing both accuracy and efficiency. This paper presents a comprehensive review of deep learning research in the field of skin lesion image segmentation. First, various skin lesion imaging modalities and commonly used public datasets are introduced, along with a summary of standard evaluation metrics. Then, addressing the prevalent issues of noise and artifacts in images, a detailed discussion on various image preprocessing and augmentation techniques is provided. Subsequently, deep learning-based segmentation methods are elaborated, covering U-Net, Transformer, SAM (segment anything model), Mamba, and multi-network fusion models. The main architectural designs, advantages, limitations, and segmentation performance of these models are comparatively analyzed. Finally, the current challenges and issues in this field are examined, and future research directions are proposed, aiming to provide valuable insights for the continued development of skin lesion image segmentation.
    Abstract: 151
    PDF: 376
    Advances in Text Clustering Models Based on Deep Learning Approaches
    SHI Dongyan, MA Lerong, DING Cangfeng, NING Qinwei, CAO Jiangjiang
    Journal of Frontiers of Computer Science and Technology    2025, 19 (11): 2873-2894.   DOI: 10.3778/j.issn.1673-9418.2502024
    Text clustering is one of the core techniques in unsupervised learning, aiming to automatically partition large text datasets into clusters with high semantic similarity. In recent years, deep learning-based text clustering has flourished, with research focus shifting towards utilizing advanced deep learning architectures to efficiently extract text features, thereby improving clustering accuracy. Particularly, clustering strategies relying on large pre-trained language models like RoBERTa and GPT have demonstrated exceptional performance due to their powerful pre-trained feature representations. Through examples and data, this paper comprehensively reviews the development, current progress, and task characteristics of text clustering, aiming to present its latest trends and significant impact in data mining. An innovative classification method for text clustering models based on deep learning architecture features is proposed. This classification method divides models based on their core mechanisms and feature extraction paths in clustering tasks, covering a comprehensive introduction to methods ranging from traditional clustering algorithms to advanced technologies, including K-means, spectral clustering, autoencoders, generative models, graph convolutional networks, and large language models, with detailed analysis of their specific implementations. Finally, the advantages and limitations of existing methods are analyzed, and potential future research directions are discussed.
    Abstract: 195
    PDF: 364
    Research Status and Prospects of Omnidirectional Image Quality Assessment Based on Deep Learning
    TIAN Yingzhe, DONG Wu, LU Likun, MA Qian, ZHOU Ziyi, ZHANG Erqing
    Journal of Frontiers of Computer Science and Technology    2026, 20 (3): 650-670.   DOI: 10.3778/j.issn.1673-9418.2503014
    In recent years, with the rapid development of virtual reality technology, omnidirectional images have gradually gained widespread attention due to their significant role in providing immersive experiences. Existing objective image quality assessment methods have not effectively evaluated the quality of omnidirectional images, making it particularly necessary to design specialized objective quality assessment methods for omnidirectional images. This paper provides a comprehensive summary of the research progress in deep learning-based objective quality assessment methods for omnidirectional images. The characteristics of omnidirectional images are analyzed. Based on the input image type of the assessment methods, objective quality assessment methods for omnidirectional images are divided into four categories: equirectangular projection format quality assessment methods, segmented spherical projection format quality assessment methods, cubic projection format quality assessment methods, and viewport image quality assessment methods. The principles, characteristics, and performance of these methods are compared. The datasets and evaluation metrics used in objective omnidirectional image quality assessment are summarized. The future development directions of objective omnidirectional image quality assessment are discussed, and practical research ideas for subsequent studies are provided.
    Abstract: 93
    PDF: 363
    Recent Advances in Speech-Driven Gesture Generation
    ZHANG Yayu, WEN Yuhui, ZHANG Xinyu, JING Liping
    Journal of Frontiers of Computer Science and Technology    2026, 20 (3): 611-624.   DOI: 10.3778/j.issn.1673-9418.2505081
    In interpersonal communication, gestures enrich verbal information and facilitate information delivery. Speech-driven gesture generation aims to automatically synthesize natural, realistic, and contextually appropriate sequences of gestures conditioned on speech input. This research direction has attracted widespread attention in fields such as computer graphics and computer vision, holding significant application value in domains including film animation production, human-computer interaction, and virtual reality. Early rule-based methods suffer from inefficiency, while regression methods, despite improving generation efficiency, often result in gestures with repetitive motion patterns and limited expressiveness. In recent years, generative models have further advanced this field, effectively enhancing the quality and diversity of generated gestures. Regarding speech-driven gesture generation methods based on generative models, this work summarizes and categorizes relevant research on generative adversarial networks, variational autoencoders, and diffusion models, analyzing their respective applications, advantages, and disadvantages in gesture generation. It further explores the controllability of speech-driven gesture generation in emotion expression, semantic consistency, and style transfer. Moreover, collaborative generation research combining facial expressions and gestures is discussed. Additionally, commonly used datasets and evaluation metrics are introduced, followed by experimental comparative analysis of representative methods. Finally, this paper concludes by summarizing the challenges in the field of speech-driven gesture generation and outlining future research trends.
    Abstract: 143
    PDF: 358
    Review of Access Control Research for Blockchain Data Sharing
    FENG Xinhao, LI Leixiao, LIU Dongjiang, DU Jinze, LIN Hao
    Journal of Frontiers of Computer Science and Technology    2025, 19 (8): 1981-2000.   DOI: 10.3778/j.issn.1673-9418.2410079
    With the advent of the digital age, data sharing plays a crucial role in promoting social and economic development as well as technological progress. How to effectively control access to the data sharing process while ensuring data security and privacy is an urgent problem to be solved. Firstly, the applications of access control technology in the existing blockchain projects are presented. Secondly, the research issues of access control for blockchain data sharing are formally defined. Then, the data sharing process is divided into the preparation stage, the data upload stage, and the authorization and access stage. In these three stages, the research status of blockchain data sharing access control technology is systematically organized and the advantages and limitations of related technologies are summarized. Among them, the attribute encryption, searchable encryption, homomorphic encryption, and proxy re-encryption technologies in the data upload stage of blockchain access control are analyzed in detail. Finally, the deficiencies in the existing research on blockchain data sharing access control are summarized, including the insufficiency of the dynamicity of access control policies, the difficulty in converting legal provisions into policies, the poor security of smart contracts, the insufficiently lightweight encryption algorithms, and the coarse data classification granularity. Prospects are proposed from five aspects: dynamic policy generation based on large language models, automatic conversion of legal policies using natural language processing technology, establishment of security standards for smart contracts, development of lightweight encryption algorithms, and realization of automatic data classification with the aid of machine learning technology.
    Abstract: 176
    PDF: 356
    Review of Hierarchical Research on Malicious Transactions in Blockchain
    LI Jiale, LI Leixiao, LIN Hao, DU Jinze, SHI Jianping, LIU Zhexu
    Journal of Frontiers of Computer Science and Technology    2025, 19 (10): 2559-2586.   DOI: 10.3778/j.issn.1673-9418.2411013
    Although blockchain technology has significant advantages in decentralization and security, the threat of malicious transactions latent in its layered architecture is increasingly complex. Existing research mostly focuses on the security analysis of a single layer and lacks systematic exploration of cross-layer attack conduction mechanisms. This paper proposes a hierarchical malicious transaction analysis framework comprising the basic protocol layer, the basic chain layer, the extended solution layer, and the application layer. The framework is used to analyze the hierarchical problem of malicious transactions in blockchain technology and to comprehensively summarize the research progress of existing methods for detecting and defending against malicious attacks. Firstly, the malicious attacks in the above four layers are reviewed and analyzed, and the definitions and attack forms of 35 types of malicious attacks are outlined. There is a significant conduction effect between the attacks in each layer: key leakage in the protocol layer can expand the loss of a DeFi protocol in the application layer several-fold. Secondly, the detection and defense methods for each type of attack are introduced, and the relevant technologies that can be used to defend against each type are summarized. Finally, the existing security problems in each layer of the blockchain are analyzed: high power consumption of post-quantum cryptography algorithms in blockchain devices, confirmation delays and low block exit speeds, complexity and security risks of the proxy contract model, and the state growth risks of Rollups. Accordingly, four directions are proposed for future research: low-power design of post-quantum cryptography, dynamic block time and adaptive block exit speeds, enhancing the security and efficiency of the proxy contract model, and Verkle tree constant-size proof schemes for stateless clients.
    Abstract: 179
    PDF: 355
    Survey on Deep Learning Applications in Point-of-Interest Recommendation
    HUANG Ping, WANG Feng, LIU Guangteng, WU Zhongbo, LI Xiaoli, HUANG Jinzhou
    Journal of Frontiers of Computer Science and Technology    2026, 20 (3): 671-710.   DOI: 10.3778/j.issn.1673-9418.2503022
    With the proliferation of mobile devices and location-based services, massive user check-in data from location-based social networks have generated widespread attention for point-of-interest (POI) recommendation as an important location service. Addressing challenges of data sparsity, complex spatiotemporal factors, dynamic user interest changes, privacy protection, and insufficient interpretability in traditional POI recommendation methods, this paper comprehensively reviews deep learning-based POI recommendation techniques. The formal definition of POI recommendation systems is introduced, and a general framework comprising data layer, feature engineering layer, deep learning model layer, and application layer is constructed. Deep learning techniques including recurrent neural networks, long short-term memory networks, gated recurrent units, attention mechanisms, transformers, and graph neural networks are systematically analyzed for their application principles and core algorithms in POI recommendation. Mainstream dataset characteristics and evaluation metric applicability are thoroughly analyzed. Detailed classification and performance comparison are conducted for POI recommendation models based on sequence modeling, attention mechanisms, graph structures, multimodal fusion, and specific task orientations. Through practical application case analysis, the effectiveness of deep learning-driven POI recommendation systems is validated in tourist attraction recommendation, dining recommendation, urban service point recommendation, cross-city recommendation, and industrial-scale applications. 
    Current research challenges are systematically analyzed, including technical challenges, data challenges, and application challenges, covering key issues such as computational complexity and efficiency optimization, dynamic user preference modeling, interpretability and user trust, data sparsity and cold start problems, multimodal data fusion, privacy protection and fairness. Future development trends toward computational efficiency optimization, dynamic preference modeling, intrinsic interpretability, multimodal fusion, and privacy protection are outlined.
    Abstract: 82
    PDF: 344
    Advancements in Deep Learning-Based Vehicle Trajectory Prediction Research
    FANG Jinfeng, ZHANG Zhenwei, MENG Xiangfu
    Journal of Frontiers of Computer Science and Technology    2026, 20 (2): 346-366.   DOI: 10.3778/j.issn.1673-9418.2504029
    Vehicle trajectory prediction involves using artificial intelligence methods to forecast a vehicle's future path and behavior over a given time period. In recent years, with the continuous growth in the number of vehicles, traffic-related issues have become more prevalent, making the ability to automatically perceive, understand, and predict the next route of a vehicle increasingly vital. Additionally, the widespread adoption of various traffic data collectors has led to the generation of vast amounts of vehicle trajectory data, making trajectory prediction highly valuable in several fields, including autonomous driving. This paper aims to provide a systematic review of vehicle trajectory prediction algorithms based on deep learning. First, it summarizes the key factors influencing prediction performance, such as dataset quality and driver intent. Then, traditional trajectory prediction approaches are reviewed. Building upon this foundation, this paper focuses on deep learning-based methods, including those based on recurrent neural networks, graph convolutional networks, graph attention networks, Transformers, and other deep generative models such as generative adversarial networks and variational auto-encoders. Subsequently, commonly used datasets and evaluation metrics in the field are introduced, and various deep learning methods are compared in terms of predictive performance and generalization capability. Finally, this paper discusses major challenges in vehicle trajectory prediction, such as environmental uncertainty and behavioral variability, and provides insights into potential future research directions.
    Abstract: 117
    PDF: 343
    Review of Multi-person Abnormal Behavior Detection Based on Deep Learning
    WANG Yanjie, WANG Xiaoqiang, ZHAO Liurui, ZHUANG Xufei
    Journal of Frontiers of Computer Science and Technology    2026, 20 (2): 326-345.   DOI: 10.3778/j.issn.1673-9418.2504014
    With the continuous advancement of deep learning technology, abnormal behavior detection has shifted from traditional machine learning to deep learning methods, and its research focus has shifted from single-person to multi-person abnormal behavior. Multi-person abnormal behavior detection based on deep learning has become a research hotspot in the field of computer vision. For multi-person abnormal behavior detection, appropriate feature extraction methods and abnormal behavior detection methods must be selected according to the scenario. To give researchers a clear and systematic understanding of existing deep learning-based feature extraction methods and abnormal behavior detection methods in multi-person scenarios, this paper systematically analyzes and summarizes both groups of methods, and looks ahead to future development directions in view of the shortcomings of existing methods. First, the definition, characteristics, and classification of multi-person abnormal behavior are given. Then, existing multi-person abnormal behavior detection methods are organized and summarized according to their deep learning-based feature extraction methods and detection methods. Next, commonly used public abnormal behavior detection datasets are introduced, and the performance of some models on these datasets is compared. Finally, future research directions in this field are discussed.
    Abstract: 121 | PDF: 341
    Research on Application of U-Net and Its Variants in Automatic Segmentation of Retinal Vessels
    LIU Yanyan, DONG Yanru, ZHANG Kai, WANG Xiaoyan, WANG Xu
    Journal of Frontiers of Computer Science and Technology    2025, 19 (11): 2935-2949.   DOI: 10.3778/j.issn.1673-9418.2412031
    Research on retinal vessel segmentation aims to facilitate the early diagnosis and pathological analysis of fundus diseases, providing crucial support for doctors to assess patients' ocular health. The rapid advancement of deep learning technologies has introduced novel approaches and breakthroughs in the segmentation performance of retinal vessel images. Among these, U-Net has emerged as a mainstream segmentation model in this field due to its outstanding performance. This paper comprehensively reviews recent progress in the application of U-Net and its improved models in retinal vessel segmentation. It first introduces commonly used datasets and evaluation metrics for retinal vessel segmentation, then gives an overview of the U-Net model and its primary structural enhancement strategies. Furthermore, the paper categorizes U-Net variants into single-network models and multi-network models. From the perspective of single-network models, it elaborates on improvements such as attention mechanisms, residual structures, multi-scale feature modules, and convolutional modules. For multi-network models, it examines enhancements like cascaded U-Net, dual-path U-Net, the integration of generative adversarial networks (GANs), and the incorporation of Transformer and Mamba models. A comparative analysis is conducted to summarize the improvements and limitations of various studies in terms of model architecture, feature extraction, performance optimization, and experimental results on public datasets. Based on this analysis, the paper discusses current challenges and future prospects in the field.
    Abstract: 174 | PDF: 339
    Review of YOLO Algorithm Research for 2D Medical Image Detection
    GUO Zhen, LIU Jing, QIU Dawei, LI Yuhao
    Journal of Frontiers of Computer Science and Technology    2026, 20 (1): 79-98.   DOI: 10.3778/j.issn.1673-9418.2502055
    In recent years, breakthroughs in artificial intelligence technology have driven a paradigm change in the interdisciplinary field of medicine and engineering, where object detection algorithms based on deep learning have shown significant advantages in medical image analysis. As a typical representative of the single-stage detection framework, the YOLO (you only look once) series of algorithms has demonstrated the advantages of high real-time performance, strong generalization ability and precise localization in medical image analysis through the “end-to-end” detection paradigm, and has gradually become a mainstream research method for lesion detection, cell recognition and other tasks. This paper reviews research on improved YOLO algorithms for medical object detection. Firstly, from the perspective of algorithm architecture innovation, the core evolution path of 12 generations of basic algorithms from YOLOv1 to YOLOv11 is traced, and the improvements, advantages and limitations, and medical scene performance of each YOLO version are compared and analyzed. Secondly, the classic open-source datasets in the field of medical object detection are summarized, and the commonly used evaluation metrics in object detection are explained. The literature on improved YOLO algorithms for detecting cervical cells, blood cells, pulmonary nodules and diabetic retinopathy in 2D medical images is then reviewed, and the different improvement methods are comprehensively compared and analyzed. Finally, this paper summarizes the medical scenarios suited to different YOLO improvement ideas, and discusses the challenges and future development directions in this field.
    Abstract: 155 | PDF: 337
    Advances in Graph Neural Networks for Combinatorial Optimization Problems
    ZHU Ye, DING Cangfeng, CAO Bohao, CHEN Kexin
    Journal of Frontiers of Computer Science and Technology    2026, 20 (2): 367-385.   DOI: 10.3778/j.issn.1673-9418.2502065
    Combinatorial optimization, an important branch of mathematical optimization, focuses on finding the optimal solution within a finite discrete solution space. It is widely applied in fields such as computer science, mathematics, and economics. However, as problem scales expand, traditional solution methods face significant challenges. In recent years, the rapid development of machine learning technologies has brought new opportunities to combinatorial optimization research. In particular, graph neural networks, by virtue of their powerful structural modeling capabilities and feature learning advantages, have become a popular research direction for solving combinatorial optimization problems. This paper therefore systematically studies the application of graph neural networks to various combinatorial optimization problems. Starting from the graph representation of combinatorial optimization problems, it comprehensively introduces models and algorithms related to ordinary graph neural networks, bipartite graph neural networks, tripartite graph neural networks, and hypergraph neural networks, and analyzes in depth their application strategies and practical effects in solving specific combinatorial optimization problems. Furthermore, this paper summarizes existing research, providing an objective evaluation of the strengths and limitations of various methods in practical applications. Lastly, addressing problems such as insufficient model generalization and poor interpretability when graph neural networks solve combinatorial optimization problems, it proposes possible future research directions, with the aim of providing new ideas for the further development of this field.
    Abstract: 60 | PDF: 331
    Visual Mamba: Structure, Practice, and Prospects
    ZHANG Xin, ZHI Min, Sarula, Arimuzha
    Journal of Frontiers of Computer Science and Technology    2026, 20 (1): 66-78.   DOI: 10.3778/j.issn.1673-9418.2503061
    Traditional convolutional neural networks (CNNs) struggle to model global features due to their limited receptive field. Though vision Transformers (ViTs) possess the advantage of sequence modeling, they face the issue of quadratic computational complexity, posing severe computational challenges for image processing. In response, researchers have begun exploring new architectures that combine efficient computation with global perception capabilities. The visual Mamba model, based on state space models (SSMs), enables global context modeling under linear computational complexity while retaining sequence modeling capabilities, marking a new stage in vision modeling based on state space models. This paper elaborates on the basic framework of the visual Mamba block, including its dual residual structure composed of residual modules, 2D selective scan (SS2D) modules, and feed-forward networks (FFN). It analyzes the working mechanisms of cross-scanning, S6 block processing, and cross-fusion within the SS2D module. The visual Mamba model is explored from three aspects: scanning methods, stacking methods, and hybrid architectures. Scanning methods include sequential scanning and dynamic scanning, with a comparative analysis of the advantages and disadvantages of different scanning strategies. Stacking methods are categorized into serial Mamba, parallel Mamba, U-shaped Mamba, and graph Mamba, with a detailed analysis on the network construction logic of each stacking structure and its adaptability in multi-scale feature extraction and long-range dependency modeling. Hybrid architectures focus on fusion forms with CNNs, Transformers, and attention mechanisms, including single-module fusion and multi-module collaborative architectures, along with an analysis of the strengths and weaknesses of each model. 
    The analysis indicates that the visual Mamba model overcomes the local perception limitation of CNNs and the quadratic computational complexity of Transformers. It outperforms mainstream backbone architectures in visual tasks and demonstrates tremendous potential to become a fundamental visual backbone.
    Abstract: 205 | PDF: 327
    Review of Research on Multimodal Data Fusion Methods in Meteorology
    WU Ruofei, FANG Wei, JIANG Hongru, BAO Yansong
    Journal of Frontiers of Computer Science and Technology    2026, 20 (4): 905-922.   DOI: 10.3778/j.issn.1673-9418.2508007
    Advancements in multi-source observation technologies have ushered in the era of multimodal meteorological data. However, single-modal data is inherently limited in its ability to characterize the complex and dynamic atmospheric system, failing to meet the demand for more precise predictions. Consequently, integrating multimodal data to leverage complementary information and enhance performance has become a key research frontier in meteorology. This paper addresses the core challenge of effectively integrating meteorological data from different modalities by providing a systematic review of multimodal data fusion methods. First, this paper traces the evolution of multimodal fusion techniques, with a particular focus on deep learning-based strategies. It elaborates on the core concepts, architectural features, and advantages of mainstream models, such as encoder-decoder architectures, attention mechanisms, graph neural networks, and generative adversarial networks in the context of multimodal data fusion. Then, this paper comprehensively analyzes the current applications of this technology. Drawing on a systematic compilation of publicly available multimodal meteorological datasets, this paper reviews the technology's application in key tasks, including precipitation nowcasting and predicting the paths and intensities of typhoons and tropical cyclones. It summarizes the research progress and effectiveness of different fusion methods in these scenarios. Furthermore, this paper analyzes the key challenges currently facing multimodal fusion in meteorology. Finally, based on the identified challenges, it outlines future research directions for the field.
    Abstract: 154 | PDF: 324
    Research and Progress of Deep Learning in Diagnosis of Upper Limb Fractures
    WEI Zongyue, QIU Dawei, LIU Jing, LI Zhenjiang, CHANG Shaohua
    Journal of Frontiers of Computer Science and Technology    2025, 19 (9): 2341-2362.   DOI: 10.3778/j.issn.1673-9418.2409083
    Upper limb fractures are common yet challenging traumatic injuries in clinical practice, where diagnostic accuracy is crucial for effective treatment and patient recovery. The traditional X-ray-based fracture diagnosis method is cumbersome and time-consuming, and struggles to meet the high demands of modern medical imaging in terms of efficiency and accuracy. Against this backdrop, deep learning-assisted diagnosis of upper limb fractures primarily leverages deep learning models for classification, detection, and segmentation of medical images, enabling the identification of abnormalities within the images. This approach enhances the speed and accuracy of diagnostic models while also providing clinicians with valuable auxiliary insights. To gain a comprehensive understanding of the current research status and advancements in deep learning techniques for upper limb fracture diagnosis, this paper first provides a detailed overview of several common types of upper limb fractures and summarizes widely used public datasets in this field. Simultaneously, it systematically reviews commonly employed evaluation metrics, facilitating a more nuanced understanding of model performance across various tasks. Secondly, an in-depth analysis is conducted on the application progress of deep learning in the three major computer vision tasks: image classification, object detection, and image segmentation. A detailed comparative analysis is performed on the primary optimization strategies of different algorithms, the issues they address, and their advantages and limitations. Furthermore, a comprehensive summary of the interpretability of deep learning models is provided. Finally, this paper provides a comprehensive comparison in terms of dataset size, methodologies, algorithm advantages and disadvantages, and experimental results. It systematically summarizes the key challenges currently faced in upper limb fracture diagnosis and offers prospects for future research directions.
    Abstract: 223 | PDF: 323
    Review of Embedding Methods and Domain Alignment Techniques in Dual-Target Cross-Domain Recommendation
    HU Siyu, MEI Hongyan, YANG Haiyan, CHENG Nai, ZHANG Xiaoyu
    Journal of Frontiers of Computer Science and Technology    2026, 20 (3): 711-729.   DOI: 10.3778/j.issn.1673-9418.2504044
    As a key branch of cross-domain recommendation technology, dual-target cross-domain recommendation improves recommendation performance in the source domain and the target domain simultaneously through a two-way collaborative optimization mechanism, and is widely used in e-commerce, video distribution, news and information services and other fields. This paper first introduces the domain hierarchy and overlap scenario characteristics in cross-domain recommendation. From the perspective of knowledge embedding methods, it elaborates on the core principles of collaborative filtering embedding, graph embedding, and self-supervised learning embedding, and compares their technical characteristics and applicable scenarios. From the perspective of domain alignment technology, it compares four mainstream domain alignment schemes based on feature mapping, decoupled representation learning, meta-learning and federated learning, and summarizes their technical differences and practical value. Then, the main datasets and evaluation metrics in dual-target cross-domain recommendation are systematically reviewed, and the suitability criteria of each dataset and metric are defined in light of the characteristics of different cross-domain scenarios. Finally, based on the current research status and technical challenges, future development directions of dual-target cross-domain recommendation are discussed.
    Abstract: 91 | PDF: 316
    Review of Decentralized Federated Learning
    CHEN Lifang, XU Kailong, ZHAO Renzhe, HAN Yang, DAI Qi
    Journal of Frontiers of Computer Science and Technology    2025, 19 (10): 2648-2666.   DOI: 10.3778/j.issn.1673-9418.2501059
    With the rapid growth of large-scale heterogeneous data, centralized federated learning faces challenges in data processing and privacy protection. Decentralized federated learning addresses these issues by eliminating reliance on central servers, enhancing system fault tolerance and adaptability, while distributing communication loads and significantly improving privacy protection. This paper systematically elaborates on the fundamental principles of centralized and decentralized federated learning, highlighting their differences through multi-dimensional comparative analysis. Building on this, it delves into the technical advantages and innovative methods of decentralized federated learning in communication optimization, privacy protection mechanisms, and model aggregation strategies. Additionally, it comprehensively analyzes the application prospects and development trends of decentralized federated learning in healthcare, smart manufacturing, and smart cities. Finally, through comparing the performance of prevalent decentralized federated learning frameworks on commonly used datasets, this paper highlights their respective advantages, provides a summary of current mainstream open-source frameworks, and offers perspectives on potential technical challenges and development opportunities for future research.
    Abstract: 184 | PDF: 311
    Survey of NLP Data Augmentation Methods Based on Large Language Models
    XU Delong, LIN Min, WANG Yurong, ZHANG Shujun
    Journal of Frontiers of Computer Science and Technology    2025, 19 (6): 1395-1413.   DOI: 10.3778/j.issn.1673-9418.2410054
    Currently, large language models show great potential in the field of natural language processing (NLP), but their training process relies on a large number of high-quality samples. In low-resource scenarios, the number of existing data samples can hardly support the convergence of model training as the model size keeps increasing, and this problem has inspired researchers in related fields to investigate data augmentation methods. However, traditional data augmentation methods have limited application scope and suffer from data distortion problems in the context of large models in NLP. In contrast, data augmentation methods based on large language models can address this challenge more effectively. This paper offers a comprehensive survey of data augmentation methods based on large language models in the current NLP field. Firstly, the development history of traditional data augmentation methods and large language models in the NLP domain is reviewed. Then, a variety of current large language model data augmentation methods in the NLP domain are summarized, and the scope of application, advantages and limitations of each method are discussed in depth. Subsequently, evaluation methods for data augmentation in the field of NLP are introduced. Finally, future research directions of data augmentation methods based on large language models in the NLP domain are discussed through comparative experiments and result analyses of current methods, and forward-looking suggestions are made.
    Abstract: 373 | PDF: 308
    Review of Light Field Image Quality Assessment
    LIU Yi, DONG Wu, LU Likun, MA Qian, ZHOU Ziyi, ZHANG Erqing
    Journal of Frontiers of Computer Science and Technology    2025, 19 (12): 3153-3178.   DOI: 10.3778/j.issn.1673-9418.2503013
    Distortions often occur during the acquisition, compression, transmission, reconstruction, and display of light field images, degrading the visual experience of users. To address different types of distortions, researchers have proposed various quality assessment methods to accurately evaluate light field image quality. Existing reviews of light field image quality assessment methods typically classify them based on reference information or mapping techniques. However, these classification criteria fail to fully capture the unique spatial and angular characteristics of light field images. This paper provides a review of recent advancements in light field image quality assessment. Firstly, based on the representation forms of light field images, the quality assessment methods are categorized into six types: sub-aperture image-based methods, epipolar plane image-based methods, micro-lens image-based methods, pseudo video sequence-based methods, refocused image-based methods, and mixed extraction methods. Representative methods from recent years are introduced, and the strengths and limitations of each category are summarized. Then, seven commonly used light field image datasets and three performance evaluation metrics are listed, and the performance of different methods is compared and analyzed from the perspectives of representation characteristics, datasets, and model structures. Finally, future research directions are discussed, including multimodal fusion, model lightweighting, large-scale models, human visual system modeling, and high-dimensional extensions.
    Abstract: 120 | PDF: 306
    Progress in Fetal Brain Magnetic Resonance Image Processing Technologies
    LIU Mengyu, LUO Qin, YAO Xiong, WANG Jianhua, CHEN Jian
    Journal of Frontiers of Computer Science and Technology    2025, 19 (11): 2895-2912.   DOI: 10.3778/j.issn.1673-9418.2501023
    Fetal brain MRI, due to its non-invasiveness, absence of radiation, and high soft-tissue contrast, has become an important tool for assessing fetal brain development and diagnosing congenital brain abnormalities. High-quality fetal brain MR images play an important role in clinical diagnosis, treatment, and scientific research of fetal brain development. Image processing techniques can enhance the quality of fetal brain MR images to meet the requirements of diagnosis and research, so studies in this field are of significant importance. This paper provides a brief introduction to fetal brain structure and its MR image datasets, and elaborates on six techniques: image quality assessment, image registration, image denoising, image bias field correction, image artifact correction, and super-resolution reconstruction. Firstly, the importance of image processing technologies for fetal brain MR images is presented. Subsequently, the structure of the fetal brain and its MR image datasets are introduced. Then the six image processing techniques are introduced in turn: for each, the research status at home and abroad is systematically described, the performance of different methods is compared and analyzed, and the current achievements and challenges are summarized. Finally, the existing issues and future research directions in the field of fetal brain MR image processing are discussed from the perspectives of technology and clinical application.
    Abstract: 86 | PDF: 303
    Research Progress of Deep Learning in Classification and Diagnosis of Melanoma
    JIANG Runze, LIU Jing, MA Jingang, GUO Zhen, LI Ming
    Journal of Frontiers of Computer Science and Technology    2025, 19 (10): 2615-2634.   DOI: 10.3778/j.issn.1673-9418.2502059
    Melanoma is the most lethal type of skin cancer, and its early and accurate diagnosis is essential to improving patient survival rates. In recent years, deep learning technology has shown great potential in the field of melanoma classification and diagnosis, providing new technical support for clinical diagnosis. This paper systematically reviews the research progress of deep learning in melanoma classification, focusing on the technical evolution and clinical application of core methods such as convolutional neural networks, Transformers, generative adversarial networks and recurrent neural networks. Firstly, the characteristics of authoritative datasets such as HAM10000, ISIC, and PH2 and their value in algorithm development are summarized, and the preprocessing methods and augmentation strategies for different datasets, which provide a high-quality data basis for model training, are analyzed in detail. Secondly, the improvement strategies of different deep learning models are analyzed in depth, including network architecture optimization, multimodal feature fusion, and data imbalance handling. In addition, the role of learning strategies such as transfer learning and ensemble learning in improving model performance is discussed. Finally, the limitations of current technology are summarized, and future research directions are discussed, including the application prospects of multimodal large models, federated learning and lightweight technology.
    Abstract: 122 | PDF: 302
    Review of False Information Detection Frameworks Based on Large Language Models
    ZHANG Xin, SUN Jingchao
    Journal of Frontiers of Computer Science and Technology    2025, 19 (6): 1414-1436.   DOI: 10.3778/j.issn.1673-9418.2411001
    Globally, the spread of false information on the Internet, especially on social media, has become an urgent issue to be addressed. With the rise of artificial intelligence technology, the application research of large language models in false information detection has become a hot topic. However, in China, related research in this field is relatively scarce and has not yet formed a complete system. To systematically review the current research status and development trends, this paper provides a comprehensive summary of the application of large language models in false information detection. This paper focuses on the false information detection framework based on large language models and deeply explores the innovative applications of large language models in data generation, data augmentation, information extraction, integration with external knowledge and tools, model improvement, final fusion decision-making, explanation and feedback generation during the false information detection process. It outlines the definition of false information and the background of its spread, elaborates on the core detection process in the framework, sorts out the innovation points in each link of the false information detection framework, summarizes the “internal” and “external” detection processes, and expounds on the model improvements such as retrieval enhancement, prompt engineering, fine-tuning, and final decision-making involved in the detection process. Finally, it analyzes the challenges faced by false information detection based on large language models at present and looks forward to future research directions, with the aim of providing references and inspirations for the development of false information detection based on large language models.
    Abstract: 371 | PDF: 302