Most Read articles


    In last 3 years
    Review of Deep Learning Applied to Time Series Prediction
    LIANG Hongtao, LIU Shuo, DU Junwei, HU Qiang, YU Xu
    Journal of Frontiers of Computer Science and Technology    2023, 17 (6): 1285-1300.   DOI: 10.3778/j.issn.1673-9418.2211108
    A time series is generally a set of random variables observed and collected at a certain frequency over the course of a process. The task of time series forecasting is to extract the core patterns from a large amount of data and to make accurate estimates of future data based on known factors. With the deployment of large numbers of IoT data collection devices, the explosive growth of multidimensional data and increasingly demanding accuracy requirements, it is difficult for classical parametric models and traditional machine learning algorithms to meet the efficiency and accuracy requirements of prediction tasks. In recent years, deep learning algorithms represented by convolutional neural networks, recurrent neural networks and Transformer models have achieved fruitful results in time series forecasting tasks. To further promote the development of time series prediction technology, common characteristics of time series data, datasets and model evaluation indexes are reviewed, and the characteristics, advantages and limitations of each prediction algorithm are experimentally compared and analyzed, with time and algorithm architecture as the main research line. Several Transformer-based time series prediction methods are highlighted and compared. Finally, in view of the problems and challenges of deep learning applied to time series prediction tasks, this paper provides an outlook on future research trends in this direction.
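
    For orientation only (not taken from the paper): a minimal one-step-ahead forecaster built on an LSTM in PyTorch; the window length, hidden size and variable names are illustrative assumptions.

        import torch
        import torch.nn as nn

        class LSTMForecaster(nn.Module):
            # Maps a window of past observations to a prediction of the next value.
            def __init__(self, n_features=1, hidden=64):
                super().__init__()
                self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
                self.head = nn.Linear(hidden, n_features)

            def forward(self, x):              # x: (batch, window, n_features)
                out, _ = self.lstm(x)
                return self.head(out[:, -1])   # predict from the last hidden state

        model = LSTMForecaster()
        window = torch.randn(32, 24, 1)        # 32 windows of 24 past steps each
        next_value = model(window)             # -> (32, 1)
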
    Reference | Related Articles | Metrics
    Abstract: 3400
    PDF: 3803
    Review on Named Entity Recognition
    LI Dongmei, LUO Sisi, ZHANG Xiaoping, XU Fu
    Journal of Frontiers of Computer Science and Technology    2022, 16 (9): 1954-1968.   DOI: 10.3778/j.issn.1673-9418.2112109

    In the field of natural language processing, named entity recognition is the first key step of information extraction. The named entity recognition task aims to recognize named entities from a large number of unstructured texts and classify them into predefined types. Named entity recognition provides basic support for many natural language processing tasks such as relationship extraction, text summarization and machine translation. This paper first introduces the definition of named entity recognition, the research difficulties and the particularity of Chinese named entity recognition, and summarizes the common Chinese and English public datasets and evaluation criteria for named entity recognition tasks. Then, following the development history of named entity recognition, the existing methods are surveyed: early methods based on rules and dictionaries, methods based on statistics and machine learning, and methods based on deep learning. This paper summarizes the key ideas, advantages and disadvantages and representative models of each kind of method, and summarizes the Chinese named entity recognition methods at each stage. In particular, the latest Transformer-based and prompt-learning-based named entity recognition methods, which represent the state of the art among deep learning-based approaches, are reviewed. Finally, the challenges and future research trends of named entity recognition are discussed.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 2618
    PDF: 1539
    HTML: 682
    Research on Question Answering System on Joint of Knowledge Graph and Large Language Models
    ZHANG Heyi, WANG Xin, HAN Lifan, LI Zhao, CHEN Zirui, CHEN Zhe
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2377-2388.   DOI: 10.3778/j.issn.1673-9418.2308070
    The large language model (LLM), exemplified by ChatGPT, has shown outstanding performance in understanding and responding to human instructions, and has a profound impact on natural language question answering (Q&A). However, due to the lack of training in vertical fields, the performance of LLM in vertical domains is not ideal. In addition, due to its high hardware requirements, training and deploying LLM remains difficult. In order to address these challenges, this paper takes the application of traditional Chinese medicine formulas as an example, collects and preprocesses domain-related data, and designs a vertical domain Q&A system based on LLM and knowledge graph. The system has the following capabilities: (1) Information filtering: filter for questions related to the vertical domain and pass them to the LLM to answer. (2) Professional Q&A: generate answers with more professional knowledge based on the LLM and a self-built knowledge base; compared with fine-tuning on professional data, this technique allows vertical-domain large models to be deployed without retraining. (3) Extraction and conversion: by strengthening the information extraction ability of the LLM and utilizing its generated natural language responses, structured knowledge is extracted and matched against a professional knowledge graph for verification; at the same time, structured knowledge can be transformed into readable natural language, achieving a deep integration of large models and knowledge graphs. Finally, the effect of the system is demonstrated, and its performance is verified from both subjective and objective perspectives through two experiments: subjective evaluation by experts and objective evaluation on multiple-choice questions.
    Reference | Related Articles | Metrics
    Abstract: 2557
    PDF: 2479
    Survey on Sequence Data Augmentation
    GE Yizhou, XU Xiang, YANG Suorong, ZHOU Qing, SHEN Furao
    Journal of Frontiers of Computer Science and Technology    2021, 15 (7): 1207-1219.   DOI: 10.3778/j.issn.1673-9418.2012062

    To pursue higher accuracy, the structure of deep learning model is getting more and more complex, with deeper and deeper network. The increase in the number of parameters means that more data are needed to train the model. However, manually labeling data is costly, and it is not easy to collect data in some specific fields limited by objective reasons. As a result, data insufficiency is a very common problem. Data augmentation is here to alleviate the problem by artificially generating new data. The success of data augmentation in the field of computer vision leads people to consider using similar methods on sequence data. In this paper, not only the time-domain methods such as flipping and cropping but also some augmentation methods in frequency domain are described. In addition to experience-based or knowledge-based methods, detailed descriptions on machine learning models used for automatic data generation such as GAN are also included. Methods that have been widely applied to various sequence data such as text, audio and time series are mentioned with their satisfactory performance in issues like medical diagnosis and emotion classification. Despite the difference in data type, these methods are designed with similar ideas. Using these ideas as a clue, various data augmentation methods applied to different types of sequence data are introduced, and some discussions and prospects are made.
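
    A minimal NumPy sketch of the time-domain ideas mentioned above (flipping, cropping, jittering); the function names and parameters are illustrative, not taken from the survey.

        import numpy as np

        def flip(x):
            # Reverse the sequence along the time axis.
            return x[::-1].copy()

        def random_crop(x, crop_len, rng=np.random):
            # Keep a random contiguous window of length crop_len.
            start = rng.randint(0, len(x) - crop_len + 1)
            return x[start:start + crop_len]

        def jitter(x, sigma=0.03, rng=np.random):
            # Add small Gaussian noise to every time step.
            return x + rng.normal(0.0, sigma, size=x.shape)

        # Example: augment a univariate series of length 128.
        series = np.sin(np.linspace(0, 8 * np.pi, 128))
        augmented = [flip(series), random_crop(series, 100), jitter(series)]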

    Reference | Related Articles | Metrics
    Abstract: 2364
    PDF: 2300
    Survey of Camouflaged Object Detection Based on Deep Learning
    SHI Caijuan, REN Bijuan, WANG Ziwen, YAN Jinwei, SHI Ze
    Journal of Frontiers of Computer Science and Technology    2022, 16 (12): 2734-2751.   DOI: 10.3778/j.issn.1673-9418.2206078

    Camouflaged object detection (COD) based on deep learning is an emerging visual detection task, which aims to detect the camouflaged objects “perfectly” embedded in the surrounding environment. However, most existing work primarily focuses on building different COD models, and little work has been devoted to summarizing the existing methods. Therefore, this paper summarizes the existing COD methods based on deep learning and discusses the future development of COD. Firstly, 23 existing COD models based on deep learning are introduced and analyzed according to five detection mechanisms: coarse-to-fine strategy, multi-task learning strategy, confidence-aware learning strategy, multi-source information fusion strategy and transformer-based strategy. The advantages and disadvantages of each strategy are analyzed in depth. Then, 4 widely used datasets and 4 evaluation metrics for COD are introduced. In addition, the performance of the existing COD models based on deep learning is compared on the four datasets, including quantitative comparison, visual comparison, efficiency analysis, and the detection effects on camouflaged objects of different types. Furthermore, the practical applications of COD in medicine, industry, agriculture, military, art, etc. are mentioned. Finally, the deficiencies and challenges of existing methods in complex scenes, multi-scale objects, real-time performance, practical application requirements, and COD in other modalities are pointed out, and potential directions of COD are discussed.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 2252
    PDF: 1391
    HTML: 351
    Summary of Multi-modal Sentiment Analysis Technology
    LIU Jiming, ZHANG Peixiang, LIU Ying, ZHANG Weidong, FANG Jie
    Journal of Frontiers of Computer Science and Technology    2021, 15 (7): 1165-1182.   DOI: 10.3778/j.issn.1673-9418.2012075

    Sentiment analysis refers to the use of computers to automatically analyze and determine the emotions that people want to express. It can play a significant role in human-computer interaction as well as in criminal investigation and case solving. The advancement of deep learning and traditional feature extraction algorithms provides the conditions for using multiple modalities for sentiment analysis. Combining multiple modalities for sentiment analysis can make up for the instability and limitations of single-modal sentiment analysis and can effectively improve accuracy. In recent years, researchers have used the three modalities of facial expression information, text information and voice information to perform sentiment analysis. This paper mainly summarizes multi-modal sentiment analysis technology from these three modalities. Firstly, it briefly introduces the basic concepts and research status of multi-modal sentiment analysis. Secondly, it summarizes the commonly used multi-modal sentiment analysis datasets and gives a brief description of existing single-modal sentiment analysis technology based on facial expression information, text information and voice information. Next, modal fusion technology is introduced in detail, and the existing results of multi-modal sentiment analysis technology are described according to the different modal fusion methods. Finally, it discusses the problems of multi-modal sentiment analysis and future development directions.

    Reference | Related Articles | Metrics
    Abstract: 2072
    PDF: 2207
    Survey of Chinese Named Entity Recognition
    ZHAO Shan, LUO Rui, CAI Zhiping
    Journal of Frontiers of Computer Science and Technology    2022, 16 (2): 296-304.   DOI: 10.3778/j.issn.1673-9418.2107031

    The Chinese named entity recognition (NER) task is a sub-task in the information extraction domain, whose goal is to locate, identify and classify relevant entities, such as names of people, places and organizations, in a given piece of unstructured text. Chinese named entity recognition is a fundamental task in the field of natural language processing (NLP) and plays an important role in many downstream NLP tasks, including information retrieval, relationship extraction and question answering systems. This paper provides a comprehensive review of existing neural network-based word-character lattice structures for Chinese NER models. Firstly, this paper explains why Chinese NER is more difficult than English NER, including challenges such as the difficulty of determining the boundaries of entities in Chinese text and the complexity of Chinese grammatical structure. Secondly, this paper investigates the most representative lattice-structured Chinese NER models under different neural network architectures (RNN (recurrent neural network), CNN (convolutional neural network), GNN (graph neural network) and Transformer). Since word sequence information can capture more boundary information for character-based sequence learning, in order to explicitly exploit the lexical information associated with each character, some prior work has proposed integrating word information into character sequences via word-character lattice structures. These neural network-based word-character lattice structures perform significantly better than word-based or character-based approaches on the Chinese NER task. Finally, this paper introduces the datasets and evaluation criteria of Chinese NER.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1999
    PDF: 2019
    HTML: 1214
    Review of Deep Learning Applied to Occluded Object Detection
    SUN Fangwei, LI Chengyang, XIE Yongqiang, LI Zhongbo, YANG Caidong, QI Jin
    Journal of Frontiers of Computer Science and Technology    2022, 16 (6): 1243-1259.   DOI: 10.3778/j.issn.1673-9418.2112035

    Occluded object detection has long been a difficult and hot topic in the field of computer vision. Based on convolutional neural networks, deep learning treats object detection as a classification and regression task and has obtained remarkable achievements. When an object is occluded, the occluding region confuses the object's features, so deep convolutional neural networks cannot handle it well, and detectors fall short of the performance they achieve in ideal scenes. Considering how common occlusion is in reality, the effective detection of occluded objects has important research value. In order to further promote the development of occluded object detection, this paper makes a comprehensive summary of occluded object detection algorithms, with a reasonable classification and analysis. First of all, based on a brief overview of object detection, this paper introduces the relevant theoretical background, research difficulties and datasets of occluded object detection. Then, this paper focuses on the algorithms that improve the performance of occluded object detection from the aspects of object structure, loss function, non-maximum suppression and part-level semantics. This paper compares the performance of different detection algorithms after summarizing the relationships among and the development of the various algorithms. Finally, this paper points out the difficulties of occluded object detection and looks forward to its future development directions.
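
    For illustration, a minimal NumPy sketch of the standard greedy non-maximum suppression (NMS) step that several occlusion-aware methods modify (for example, Soft-NMS replaces the hard removal with score decay); the threshold and names are assumptions, not from the paper.

        import numpy as np

        def iou(box, boxes):
            # box: [x1, y1, x2, y2]; boxes: (N, 4). IoU of box with each row of boxes.
            x1 = np.maximum(box[0], boxes[:, 0])
            y1 = np.maximum(box[1], boxes[:, 1])
            x2 = np.minimum(box[2], boxes[:, 2])
            y2 = np.minimum(box[3], boxes[:, 3])
            inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
            area_a = (box[2] - box[0]) * (box[3] - box[1])
            area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
            return inter / (area_a + area_b - inter + 1e-9)

        def nms(boxes, scores, iou_thr=0.5):
            # Greedy NMS: keep the highest-scoring box, drop heavily overlapping ones.
            order = scores.argsort()[::-1]
            keep = []
            while order.size > 0:
                i = order[0]
                keep.append(i)
                overlaps = iou(boxes[i], boxes[order[1:]])
                order = order[1:][overlaps <= iou_thr]
            return keep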

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1965
    PDF: 917
    HTML: 279
    Survey of Research on Deep Learning Image-Text Cross-Modal Retrieval
    LIU Ying, GUO Yingying, FANG Jie, FAN Jiulun, HAO Yu, LIU Jiming
    Journal of Frontiers of Computer Science and Technology    2022, 16 (3): 489-511.   DOI: 10.3778/j.issn.1673-9418.2107076

    With the rapid development of deep neural networks, multimodal learning techniques have attracted wide attention. Cross-modal retrieval is an important branch of multimodal learning. Its fundamental purpose is to reveal the relations between samples of different modalities by retrieving samples with identical semantics across modalities. In recent years, cross-modal retrieval has gradually become a frontier and hot spot of academic research, and it is an important direction in the future development of information retrieval. This paper focuses on the latest development of cross-modal retrieval based on deep learning and systematically reviews the development trends of real-value representation-based and binary representation-based learning methods. Among them, real-value representation-based methods are adopted to improve semantic relevance and retrieval accuracy, while binary representation-based methods are used to improve the efficiency of image-text cross-modal retrieval and reduce storage space. In addition, the common open datasets in the field of image-text cross-modal retrieval are summarized, and the performance of various algorithms on different datasets is compared. In particular, this paper summarizes and analyzes the specific implementations of cross-modal retrieval techniques in the fields of public security, media and medicine. Finally, combined with the state-of-the-art technologies, development trends and future research directions are discussed.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1903
    PDF: 2172
    HTML: 1357
    Survey on Cross-Chain Protocols of Blockchain
    MENG Bo, WANG Yibing, ZHAO Can, WANG Dejun, MA Binhao
    Journal of Frontiers of Computer Science and Technology    2022, 16 (10): 2177-2192.   DOI: 10.3778/j.issn.1673-9418.2203032

    With the development of blockchain technology, the differing system architectures and application scenarios of blockchain platforms make it difficult to realize the interconnection and intercommunication of data and assets across different blockchains, which affects the promotion and application of blockchain. Blockchain cross-chain technology is an important technical solution for realizing the interconnection of blockchains and improving the interoperability and extensibility of blockchain. A blockchain cross-chain protocol is the specific design specification for realizing cross-chain interoperability between different blockchains through cross-chain technology, so it is of great significance to the realization of blockchain interoperability and the construction of cross-chain applications. This paper systematically organizes and analyzes the latest research on the design and implementation of blockchain cross-chain protocols in four parts. Firstly, the current research status of blockchain cross-chain interoperability is explained from three aspects: the Internet of blockchains, cross-chain technology and blockchain interoperability. Secondly, cross-chain protocols are divided into cross-chain communication protocols, cross-chain asset transaction protocols and cross-chain smart contract call protocols, and the latest research is analyzed. Thirdly, the key design principles of cross-chain protocols are summarized, and solutions for the security, privacy and scalability problems of cross-chain protocols are provided. Finally, combined with the actual needs of blockchain cross-chain applications, future research directions for blockchain cross-chain protocols are given.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1699
    PDF: 1409
    HTML: 603
    Survey of Research on Image Inpainting Methods
    LUO Haiyin, ZHENG Yuhui
    Journal of Frontiers of Computer Science and Technology    2022, 16 (10): 2193-2218.   DOI: 10.3778/j.issn.1673-9418.2204101

    Image inpainting refers to restoring the pixels in damaged areas of an image to make them as consistent as possible with the original image. Image inpainting is not only crucial in computer vision tasks, but also serves as an important cornerstone of other image processing tasks. However, there have been few reviews dedicated to image inpainting. In order to better learn from and promote research on image inpainting tasks, the classic image inpainting algorithms and representative deep learning image inpainting methods of the past ten years are reviewed and analyzed. Firstly, the classical traditional image inpainting methods are briefly summarized and divided into partial differential equation-based and sample-based image inpainting methods, and the limitations of traditional methods are further analyzed. Deep learning image inpainting methods are divided into single image inpainting and pluralistic image inpainting according to the number of images the model outputs, and the different methods are analyzed and summarized with respect to application images, loss functions, types, advantages and limitations. After that, the commonly used datasets and quantitative evaluation indicators of image inpainting methods are described in detail, and quantitative results of image inpainting methods for damaged regions of different sizes on different image datasets are given. Based on the quantitative data, the performance of image inpainting methods based on deep learning is compared and analyzed. Finally, the limitations of existing image inpainting methods are summarized and analyzed, and new ideas and prospects for future key research directions are proposed.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1691
    PDF: 1086
    HTML: 520
    Improved Sparrow Algorithm Combining Cauchy Mutation and Opposition-Based Learning
    MAO Qinghua, ZHANG Qiang
    Journal of Frontiers of Computer Science and Technology    2021, 15 (6): 1155-1164.   DOI: 10.3778/j.issn.1673-9418.2010032

    Aiming at the problems that the population diversity of the basic sparrow search algorithm decreases in late iterations and that it easily falls into local extrema, an improved sparrow search algorithm combining Cauchy mutation and opposition-based learning (ISSA) is proposed. Firstly, a Sin chaotic map with an unlimited number of mapping folds is used to initialize the population, laying the foundation for global optimization. Secondly, the global optimal solution of the previous generation is introduced into the discoverer position-update rule to enhance the sufficiency of global search. At the same time, an adaptive weight is added to balance local exploitation and global exploration and to accelerate convergence. Then, the Cauchy mutation operator and the opposition-based learning strategy are combined to perform a disturbance mutation at the position of the optimal solution and generate new solutions, enhancing the algorithm's ability to jump out of local spaces. Finally, the algorithm is compared with 3 basic algorithms and 2 improved sparrow algorithms. Simulations and Wilcoxon rank-sum tests are performed on 8 benchmark test functions, the optimization performance of ISSA is assessed, and a time complexity analysis of ISSA is carried out. The results show that ISSA has a faster convergence rate and higher precision than the other 5 algorithms, and its overall optimization capability is improved.
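
    An illustrative reading of the disturbance step described above, sketched in NumPy; the exact mutation formulas and bound handling in the paper may differ.

        import numpy as np

        def cauchy_mutation(best, rng=np.random):
            # Perturb the current best solution with standard Cauchy noise, whose
            # heavy tails help the search jump out of local optima.
            return best + best * rng.standard_cauchy(size=best.shape)

        def opposition_based(best, lower, upper):
            # Opposition-based learning: reflect the solution inside the bounds.
            return lower + upper - best

        def disturb_best(best, lower, upper, fitness, rng=np.random):
            # Generate both candidates and keep whichever has the best fitness.
            candidates = [cauchy_mutation(best, rng), opposition_based(best, lower, upper)]
            candidates = [np.clip(c, lower, upper) for c in candidates]
            return min([best] + candidates, key=fitness)

        # Example on the 10-dimensional sphere function.
        lower, upper = -100.0, 100.0
        best = np.random.uniform(lower, upper, size=10)
        best = disturb_best(best, lower, upper, fitness=lambda x: float(np.sum(x ** 2)))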

    Reference | Related Articles | Metrics
    Abstract: 1671
    PDF: 966
    Survey on Pseudo-Labeling Methods in Deep Semi-supervised Learning
    LIU Yafen, ZHENG Yifeng, JIANG Lingyi, LI Guohe, ZHANG Wenjie
    Journal of Frontiers of Computer Science and Technology    2022, 16 (6): 1279-1290.   DOI: 10.3778/j.issn.1673-9418.2111144

    With the development of intelligent technology, deep learning has become a hot topic in machine learning and is playing an increasingly important role in various fields. Deep learning requires a lot of labeled data to improve model performance. Therefore, researchers effectively combine semi-supervised learning with deep learning to solve the labeled data problem: a small amount of labeled data and a large amount of unlabeled data are used simultaneously to build the model, which helps to expand the sample space. In view of its theoretical significance and practical application value, this paper takes pseudo-labeling methods as its starting point. Firstly, deep semi-supervised learning is introduced and the advantages of pseudo-labeling methods are pointed out. Secondly, pseudo-labeling methods are described from the perspectives of self-training and multi-view training, and the existing models are comprehensively analyzed. Then, graph-based label propagation combined with pseudo-labeling is introduced. Furthermore, the existing pseudo-labeling methods are analyzed and compared. Finally, the problems and future research directions of pseudo-labeling methods are summarized in terms of the utility of unlabeled data, noisy data, rationality, and the combination of pseudo-labeling methods.
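
    A minimal sketch of the self-training loop behind pseudo-labeling, assuming a classifier with a scikit-learn-style fit/predict_proba interface; the confidence threshold and number of rounds are illustrative.

        import numpy as np

        def self_training(model, x_labeled, y_labeled, x_unlabeled,
                          threshold=0.95, rounds=5):
            # Iteratively train on labeled data, then promote confident predictions
            # on unlabeled data to pseudo-labels and add them to the training set.
            x_train, y_train = x_labeled.copy(), y_labeled.copy()
            for _ in range(rounds):
                model.fit(x_train, y_train)
                if len(x_unlabeled) == 0:
                    break
                proba = model.predict_proba(x_unlabeled)
                conf = proba.max(axis=1)
                mask = conf >= threshold          # keep only confident predictions
                if not mask.any():
                    break
                pseudo_y = proba.argmax(axis=1)[mask]
                x_train = np.concatenate([x_train, x_unlabeled[mask]])
                y_train = np.concatenate([y_train, pseudo_y])
                x_unlabeled = x_unlabeled[~mask]  # remaining unlabeled pool
            return model

    Any scikit-learn classifier, such as LogisticRegression, exposes this fit/predict_proba interface unchanged.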

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1647
    PDF: 1428
    HTML: 213
    Survey on Deep Learning Based News Recommendation Algorithm
    TIAN Xuan, DING Qi, LIAO Zihui, SUN Guodong
    Journal of Frontiers of Computer Science and Technology    2021, 15 (6): 971-998.   DOI: 10.3778/j.issn.1673-9418.2007021

    News recommendation (NR) can effectively alleviate the overload of news information and is an important way for users to obtain news information. Deep learning (DL) has become the mainstream technology promoting the development of NR in recent years, and the effect of news recommendation has been significantly improved, attracting wide attention from researchers. In this paper, the methods of deep learning-based news recommendation (DNR) are classified, analyzed and summarized. In NR research, modeling users and modeling news are the two key tasks. According to the different strategies for modeling users or news, news recommendation methods based on deep learning are divided into three types: “two-stage” methods, “fusion” methods and “collaboration” methods. Each type is further subdivided according to its sub-tasks or the data organization structure it relies on. The representative models of each method are introduced and analyzed, and their advantages and limitations are evaluated. The characteristics, advantages and disadvantages of each type of method are also summarized in detail. Furthermore, the commonly used datasets, baselines and performance evaluation indicators are introduced. Finally, possible future research directions and development trends in this field are analyzed and predicted.

    Reference | Related Articles | Metrics
    Abstract: 1612
    PDF: 1638
    Survey of Few-Shot Object Detection
    LIU Chunlei, CHEN Tian'en, WANG Cong, JIANG Shuwen, CHEN Dong
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 53-73.   DOI: 10.3778/j.issn.1673-9418.2206020
    Object detection, a hot field in computer vision, usually requires a large number of labeled images for model training, which costs a lot of manpower and material resources. At the same time, due to the inherent long-tailed distribution of data in the real world, the number of samples of most objects is relatively small (for example, many uncommon diseases), and it is difficult to obtain a large number of labeled images. In this regard, few-shot object detection only needs a small amount of annotation information to detect objects of interest. This paper makes a detailed review of few-shot object detection methods. Firstly, the development of general object detection and its existing problems are reviewed, the concept of few-shot object detection is introduced, and other tasks related to few-shot object detection are differentiated and explained. Then, the two classical paradigms of existing few-shot object detection, based on transfer learning and meta-learning, are introduced. According to the improvement strategies of different methods, few-shot object detection methods are divided into four types: attention mechanisms, graph convolutional neural networks, metric learning and data augmentation. The public datasets and evaluation metrics used in these methods are explained. The advantages, disadvantages and applicable scenarios of different methods, and their performance on different datasets, are compared and analyzed. Finally, the practical application fields and future research trends of few-shot object detection are discussed.
    Reference | Related Articles | Metrics
    Abstract: 1604
    PDF: 1539
    Survey of One-Stage Small Object Detection Methods in Deep Learning
    LI Kecen, WANG Xiaoqiang, LIN Hao, LI Leixiao, YANG Yanyan, MENG Chuang, GAO Jing
    Journal of Frontiers of Computer Science and Technology    2022, 16 (1): 41-58.   DOI: 10.3778/j.issn.1673-9418.2110003

    With the development of deep learning, object detection technology has gradually shifted from traditional manual detection methods to deep neural network detection methods. Among the many object detection algorithms based on deep learning, the one-stage object detection method is widely used because of its simple network structure, fast running speed and high detection efficiency. However, existing one-stage object detection methods based on deep learning do not achieve ideal results on small objects, because small objects provide little feature information, have low resolution, complicated background information and unobvious details, and demand higher localization accuracy, all of which reduce the detection accuracy of the model. Aiming at these problems, a large number of one-stage small object detection technologies based on deep learning are studied. Firstly, the optimization methods for small object detection are systematically summarized from the aspects of anchor boxes, network structure, IoU (intersection over union) and loss function in one-stage object detection methods. Secondly, the commonly used small object detection datasets and their application fields are listed, and detection results on each small object detection dataset are given. Finally, future research directions of one-stage small object detection methods based on deep learning are discussed.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1572
    PDF: 1727
    HTML: 360
    Survey of Graph Neural Network in Recommendation System
    WU Jing, XIE Hui, JIANG Huowen
    Journal of Frontiers of Computer Science and Technology    2022, 16 (10): 2249-2263.   DOI: 10.3778/j.issn.1673-9418.2203004

    Recommendation systems (RS) were introduced to cope with information overload. Due to the diversity, complexity and sparseness of data, traditional recommendation systems cannot solve current problems well. The graph neural network (GNN) can extract and represent features from edge and node data in graphs and has inherent advantages in processing graph-structured data, so it flourishes in recommendation systems. This paper sorts out the main references on graph neural networks in recommendation systems in recent years, focuses on the two perspectives of methods and problems, and systematically reviews graph neural networks in recommendation systems. Firstly, at the method level, five kinds of graph neural networks used in recommendation systems are elaborated: the graph convolutional network, graph attention network, graph autoencoder, graph generation network and graph spatial-temporal network in recommendation systems. Secondly, from the perspective of problem similarity, six major problem types are summarized: sequence recommendation, social recommendation, cross-domain recommendation, multi-behavior recommendation, bundle recommendation, and session-based recommendation. Finally, based on the analysis and summary of existing methods, this paper points out the main difficulties in current research on graph neural networks in recommendation systems, proposes corresponding issues that can be investigated, and looks forward to future research directions on this topic.
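
    A minimal NumPy sketch of the symmetrically normalized neighborhood aggregation shared by the graph convolutional models above (LightGCN-style, without feature transformation); the toy user-item graph and sizes are illustrative.

        import numpy as np

        def gcn_propagate(adj, emb):
            # One propagation step: E' = D^{-1/2} A D^{-1/2} E, i.e. each node's new
            # embedding is a degree-normalized average of its neighbors' embeddings.
            deg = adj.sum(axis=1)
            d_inv_sqrt = deg ** -0.5            # assumes no isolated nodes
            norm_adj = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
            return norm_adj @ emb

        # Tiny example: 3 users and 2 items in one bipartite interaction graph.
        adj = np.zeros((5, 5))
        for u, i in [(0, 3), (1, 3), (1, 4), (2, 4)]:   # user u interacted with item i
            adj[u, i] = adj[i, u] = 1.0
        emb = np.random.randn(5, 8)                      # 8-dimensional embeddings
        emb_next = gcn_propagate(adj, emb)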

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1555
    PDF: 1160
    HTML: 514
    Incremental Construction of Time-Series Knowledge Graph
    ZHANG Zichen, YUE Kun, QI Zhiwei, DUAN Liang
    Journal of Frontiers of Computer Science and Technology    2022, 16 (3): 598-607.   DOI: 10.3778/j.issn.1673-9418.2009068

    A knowledge graph (KG) with time-series features is referred to as a time-series KG, which depicts the incremental concepts and corresponding relations in a knowledge base. Since knowledge changes dramatically, adding new knowledge to the time-series KG allows the evolution and update of knowledge to be reflected in time. Thus, this paper gives the definition of the time-series KG and proposes an incremental construction method based on TransH. In order to add new and relevant triples to the time-series KG, this paper proposes a model for calculating the degree of coincidence between a triple and the current KG, and a technique for extracting the optimal triples based on the idea of the greedy algorithm. Then, the optimal set of triples is added to the time-series KG and the incremental update is fulfilled. Experimental results show that optimal triples can be extracted efficiently and added into the time-series KG by the proposed method, verifying its effectiveness and efficiency.
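
    For reference, the TransH scoring function that the incremental construction builds on projects head and tail embeddings onto a relation-specific hyperplane before translation; a minimal NumPy sketch with illustrative dimensions (not the paper's code):

        import numpy as np

        def transh_score(h, t, w_r, d_r):
            # Project entities onto the hyperplane with unit normal w_r, then
            # measure || h_perp + d_r - t_perp ||; lower means more plausible.
            w_r = w_r / np.linalg.norm(w_r)
            h_perp = h - np.dot(w_r, h) * w_r
            t_perp = t - np.dot(w_r, t) * w_r
            return np.linalg.norm(h_perp + d_r - t_perp)

        # Example with random 50-dimensional embeddings.
        dim = 50
        h, t, w_r, d_r = (np.random.randn(dim) for _ in range(4))
        print(transh_score(h, t, w_r, d_r))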

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1546
    PDF: 939
    HTML: 2901
    Survey of Video Object Detection Based on Deep Learning
    WANG Dicong, BAI Chenshuai, WU Kaijun
    Journal of Frontiers of Computer Science and Technology    2021, 15 (9): 1563-1577.   DOI: 10.3778/j.issn.1673-9418.2103107

    Video object detection aims to solve the problem of object localization and recognition in every video frame. Compared with still images, video features high redundancy and contains a lot of local spatio-temporal information. With the rapid adoption of deep convolutional neural networks in static image object detection, they have shown a great performance advantage over traditional methods and also play an important role in video-based object detection. However, current video object detection algorithms still face many challenges, such as improving and optimizing the performance of mainstream object detection algorithms, maintaining the spatio-temporal consistency of video sequences, and making detection models lightweight. In view of the above problems and challenges, and on the basis of an extensive literature survey, this paper systematically summarizes video object detection algorithms based on deep learning. These algorithms are classified according to basic methods such as optical flow and detection, and are further examined from the angles of backbone network, algorithm structure, datasets, etc. Combined with experimental results on the ImageNet VID dataset, this paper analyzes the performance advantages and disadvantages of typical algorithms in this field, and the relations between them. The problems to be solved in video object detection as well as future research directions are expounded and prospected. Video object detection has become a hot spot pursued by many computer vision scholars, and more efficient and accurate algorithms are expected to emerge.

    Reference | Related Articles | Metrics
    Abstract: 1499
    PDF: 2020
    Survey of Affective-Based Dialogue System
    ZHUANG Yin, LIU Zhen, LIU Tingting, WANG Yuanyi, LIU Cuijuan, CHAI Yanjie
    Journal of Frontiers of Computer Science and Technology    2021, 15 (5): 825-837.   DOI: 10.3778/j.issn.1673-9418.2012012

    As an important way of human-computer interaction, the dialogue system has broad application prospects. Existing dialogue systems focus on solving problems such as semantic consistency and content richness, while paying little attention to improving human-computer interaction and human-computer resonance. How to make the generated sentences communicate with users more naturally on the basis of semantic relevance is one of the main problems in current dialogue systems. This paper first summarizes the overall situation of dialogue systems. Then it introduces the two major tasks of the emotional dialogue system, dialogue emotion perception and emotional dialogue generation, and further investigates and summarizes the related methods for each. Dialogue emotion perception tasks are roughly divided into context-based and user-based methods. Emotional dialogue generation methods include rule matching algorithms, specified emotional response generation models, and non-specified emotional response generation models. The models are compared and analyzed in terms of emotional data categories and model methods. Next, to support subsequent research, it summarizes the characteristics of and links to the datasets for the two major tasks. Further, the different evaluation methods used in current emotional dialogue systems are summarized. Finally, work on emotional dialogue systems is summarized and future directions are prospected.

    Reference | Related Articles | Metrics
    Abstract: 1496
    PDF: 2024
    Review of Image Super-resolution Reconstruction Algorithms Based on Deep Learning
    YANG Caidong, LI Chengyang, LI Zhongbo, XIE Yongqiang, SUN Fangwei, QI Jin
    Journal of Frontiers of Computer Science and Technology    2022, 16 (9): 1990-2010.   DOI: 10.3778/j.issn.1673-9418.2202063

    The essence of image super-resolution reconstruction technology is to break through the limitation of hardware conditions and reconstruct a high-resolution image from a low-resolution image which contains less information through image super-resolution reconstruction algorithms. With the development of deep learning technology, deep learning has been introduced into the image super-resolution reconstruction field. This paper summarizes the image super-resolution reconstruction algorithms based on deep learning, and classifies, analyzes and compares the typical algorithms. Firstly, the model framework, upsampling method, nonlinear mapping learning module and loss function of single image super-resolution reconstruction methods are introduced in detail. Secondly, the reference-based super-resolution reconstruction method is analyzed from two aspects: pixel alignment and patch matching. Then, the benchmark datasets and image quality evaluation indices used for image super-resolution reconstruction algorithms are summarized, and the characteristics and performance of typical super-resolution reconstruction algorithms are compared and analyzed. Finally, future research trends of image super-resolution reconstruction algorithms based on deep learning are prospected.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1475
    PDF: 794
    HTML: 372
    Review of Human Behavior Recognition Research
    PEI Lishen, LIU Shaobo, ZHAO Xuezhuan
    Journal of Frontiers of Computer Science and Technology    2022, 16 (2): 305-322.   DOI: 10.3778/j.issn.1673-9418.2106055

    Behavior recognition is a hot topic in the field of computer vision. It has gone through a development process from hand-designed feature representation to deep learning feature expression. This paper classifies the mainstream algorithms in the development of behavior recognition from the two aspects of traditional behavior recognition models and deep learning models. Traditional behavior recognition models mainly include feature description methods based on silhouettes, space-time interest points, human joint points and trajectories. Among them, the improved dense trajectory method has good robustness and reliability. Deep learning network architectures mainly include two-stream networks, 3D convolution networks and hybrid networks. Firstly, this paper focuses on the main research ideas and innovations of each behavior recognition algorithm, and introduces the model architecture, algorithm features and application scenarios of each kind of algorithm. Then, the widely used public behavior databases are classified, and the HMDB51 and UCF101 datasets are introduced in detail. The recognition results of traditional methods and deep learning algorithms on each dataset are compared and analyzed. The comparative analysis shows that traditional methods are not suitable for high-precision behavior recognition and do not easily generalize across databases or scenes. Among deep architectures, two-stream networks and 3D convolution networks have achieved good behavior recognition results and are widely used. Finally, the future development of behavior recognition is prospected, and some feasible research directions are pointed out.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1473
    PDF: 966
    HTML: 304
    Review of Graph Neural Networks Applied to Knowledge Graph Reasoning
    SUN Shuifa, LI Xiaolong, LI Weisheng, LEI Dajiang, LI Sihui, YANG Liu, WU Yirong
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 27-52.   DOI: 10.3778/j.issn.1673-9418.2207060
    As an important element of knowledge graph construction, knowledge reasoning (KR) has always been a hot topic of research. With the deepening of knowledge graph application research and the expanding of its scope, graph neural network (GNN) based KR methods have received extensive attention due to their capability of obtaining semantic information such as entities and relationships in knowledge graph, high interpretability, and strong reasoning ability. In this paper, firstly, basic knowledge and research status of knowledge graph and KR are summarized. The advantages and disadvantages of KR approaches based on logic rules, representation learning, neural network and graph neural network are briefly introduced. Secondly, the latest progress in KR based on GNN is comprehensively summarized. GNN-based KR methods are categorized into knowledge reasoning based on recurrent graph neural networks (RecGNN), convolutional graph neural networks (ConvGNN), graph auto-encoders (GAE) and spatial-temporal graph neural networks (STGNN). Various typical network models are introduced and compared. Thirdly, this paper introduces the application of KR based on graph neural network in health care, intelligent manufacturing, military, transportation, etc. Finally, the future research directions of GNN-based KR are proposed, and related research in various directions in this rapidly growing field is discussed.
    Reference | Related Articles | Metrics
    Abstract: 1385
    PDF: 1472
    Review of Super-Resolution Image Reconstruction Algorithms
    ZHONG Mengyuan, JIANG Lin
    Journal of Frontiers of Computer Science and Technology    2022, 16 (5): 972-990.   DOI: 10.3778/j.issn.1673-9418.2111126

    In the human visual perception system, the high-resolution (HR) image is an important medium for clearly expressing spatial structure, detailed features, edge texture and other information, and it has a very wide range of practical value in medicine, criminal investigation, satellite imagery and other fields. Super-resolution image reconstruction (SRIR) is a key research task in the field of computer vision and image processing, which aims to reconstruct a high-resolution image with clear details from a given low-resolution (LR) image. In this paper, the concept and mathematical model of super-resolution image reconstruction are first described, and the image reconstruction methods are systematically classified into three kinds: interpolation-based, reconstruction-based, and learning-based (before and after the advent of deep learning) methods. Secondly, the typical, commonly used and latest algorithms of the three kinds and the related research are comprehensively reviewed and summarized, and the listed image reconstruction algorithms are examined from the aspects of network structure, learning mechanism, application scenarios, advantages and limitations. Then, the datasets and image quality evaluation indices used for super-resolution image reconstruction algorithms are summarized, and the characteristics and performance of various super-resolution image reconstruction algorithms based on deep learning are compared. Finally, future research directions for super-resolution image reconstruction are prospected from four aspects.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1384
    PDF: 904
    HTML: 304
    Research Status and Prospect of Transformer in Speech Recognition
    ZHANG Xiaoxu, MA Zhiqiang, LIU Zhiqiang, ZHU Fangyuan, WANG Chunyu
    Journal of Frontiers of Computer Science and Technology    2021, 15 (9): 1578-1594.   DOI: 10.3778/j.issn.1673-9418.2103020

    As a new deep learning algorithm framework, Transformer has attracted more and more researchers' attention and has become a current research hotspot. Inspired by the way humans focus only on important things, the self-attention mechanism in the Transformer model mainly learns the important information in the input sequence. For speech recognition tasks, the focus is to transcribe the information of the input speech sequence into the corresponding language text. The past practice was to combine acoustic models, pronunciation dictionaries and language models into a speech recognition system, while Transformer can integrate them into a single neural network to form an end-to-end speech recognition system, which solves issues such as forced alignment and multi-module training in traditional speech recognition systems. Therefore, it is very necessary to discuss the problems of Transformer in speech recognition tasks. In this paper, the structure of the Transformer model is first introduced. Besides, the problems confronted by speech recognition are analyzed with respect to the input speech sequence, deep model architecture, and model inference. Then the methods to solve the obstacles in the aforementioned three aspects are outlined and summarized. Finally, future applications and directions of Transformer in speech recognition are concluded and prospected.
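
    The self-attention referred to above is scaled dot-product attention; a minimal single-head NumPy sketch with illustrative shapes (not from the paper):

        import numpy as np

        def scaled_dot_product_attention(Q, K, V):
            # Q, K, V: (seq_len, d_k). Each output position is a weighted average
            # of V, with weights given by softmax(Q K^T / sqrt(d_k)).
            d_k = Q.shape[-1]
            scores = Q @ K.T / np.sqrt(d_k)
            weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
            weights = weights / weights.sum(axis=-1, keepdims=True)
            return weights @ V

        # Example: a 100-frame acoustic sequence with 64-dimensional features.
        x = np.random.randn(100, 64)
        out = scaled_dot_product_attention(x, x, x)   # self-attention: Q = K = V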

    Reference | Related Articles | Metrics
    Abstract: 1370
    PDF: 1188
    Survey of Deep Learning Based Multimodal Emotion Recognition
    ZHAO Xiaoming, YANG Yijiao, ZHANG Shiqing
    Journal of Frontiers of Computer Science and Technology    2022, 16 (7): 1479-1503.   DOI: 10.3778/j.issn.1673-9418.2112081

    Multimodal emotion recognition aims to recognize human emotional states through different modalities related to human emotion expression, such as audio, vision and text. This topic is of great importance in the fields of human-computer interaction, artificial intelligence, affective computing, etc., and has attracted much attention. In view of the great success of deep learning methods in various tasks in recent years, a variety of deep neural networks have been used to learn high-level emotional feature representations for multimodal emotion recognition. In order to systematically summarize the research advances of deep learning methods in the field of multimodal emotion recognition, this paper presents a comprehensive analysis and summary of the recent deep learning-based multimodal emotion recognition literature. First, the general framework of multimodal emotion recognition is given, and the commonly used multimodal emotional datasets are introduced. Then, the principles of representative deep learning techniques and their recent advances are briefly reviewed. Subsequently, this paper focuses on the advances in two key steps of multimodal emotion recognition: emotional feature extraction methods for audio, vision, text, etc., including hand-crafted and deep feature extraction; and multimodal information fusion strategies for integrating different modalities. Finally, the challenges and opportunities in this field are analyzed, and future development directions are pointed out.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1361
    PDF: 1102
    HTML: 978
    Review of Medical Image Segmentation Based on UNet
    XU Guangxian, FENG Chun, MA Fei
    Journal of Frontiers of Computer Science and Technology    2023, 17 (8): 1776-1792.   DOI: 10.3778/j.issn.1673-9418.2301044
    As one of the most important semantic segmentation frameworks in convolutional neural networks (CNN), UNet is widely used in medical image processing tasks such as classification, segmentation and object detection. In this paper, the structural principles of UNet are described, and a comprehensive review of UNet-based networks and variant models is presented. The model algorithms are fully investigated from several perspectives, and an attempt is made to establish an evolutionary pattern among the models. Firstly, the UNet variant models are categorized according to the seven medical imaging systems they are applied to, and algorithms with similar core composition are compared and described. Secondly, the principles, strengths and weaknesses, and applicable scenarios of each model are analyzed. Thirdly, the main UNet variant networks are summarized in terms of structural principles, core composition, datasets and evaluation metrics. Finally, the inherent shortcomings of the UNet network structure and possible solutions are objectively described in light of the latest advances in deep learning, providing directions for continued improvement. At the same time, other technological developments and application scenarios that can be combined with UNet are detailed, and the future development trend of UNet-based variant networks is further envisaged.
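
    As a structural reminder of the encoder-decoder-with-skip-connection design reviewed above, a deliberately tiny UNet-style network in PyTorch (one downsampling stage, illustrative channel counts; not any specific variant from the survey):

        import torch
        import torch.nn as nn

        def double_conv(in_ch, out_ch):
            # Two 3x3 convolutions with ReLU, the basic UNet building block.
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

        class TinyUNet(nn.Module):
            def __init__(self, in_ch=1, num_classes=2):
                super().__init__()
                self.enc = double_conv(in_ch, 32)
                self.down = nn.MaxPool2d(2)
                self.bottleneck = double_conv(32, 64)
                self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
                self.dec = double_conv(64, 32)        # 64 = 32 (skip) + 32 (upsampled)
                self.head = nn.Conv2d(32, num_classes, 1)

            def forward(self, x):
                e = self.enc(x)
                b = self.bottleneck(self.down(e))
                d = self.dec(torch.cat([self.up(b), e], dim=1))   # skip connection
                return self.head(d)

        logits = TinyUNet()(torch.randn(1, 1, 128, 128))   # -> (1, 2, 128, 128)
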
    Reference | Related Articles | Metrics
    Abstract: 1344
    PDF: 1468
    Survey of Open-Domain Knowledge Graph Question Answering
    CHEN Zirui, WANG Xin, WANG Lin, XU Dawei, JIA Yongzhe
    Journal of Frontiers of Computer Science and Technology    2021, 15 (10): 1843-1869.   DOI: 10.3778/j.issn.1673-9418.2106095

    Knowledge graph question answering (KGQA) is the procedure of processing natural language questions posed by users to obtain relevant answers from a knowledge graph (KG). Due to the limitations of knowledge scale, computing power and natural language processing capability, early knowledge base question answering systems were limited to closed-domain questions. In recent years, with the development of KGs and the construction of open-domain question answering (QA) datasets, KGs have been used for open-domain QA research and practice. In this paper, following the development of the technology, open-domain KGQA is summarized. Firstly, five rule and template based KGQA methods are reviewed, including traditional semantic parsing, traditional information retrieval, triplet matching, utterance templates, and query templates. This type of method mainly relies on manually defined rules and templates to complete the QA task. Secondly, five deep learning based KGQA methods are introduced, which use neural network models to complete the subtasks of the QA process, including knowledge graph embedding, memory networks, neural network-based semantic parsing, neural network-based query graphs, and neural network-based information retrieval methods. Thirdly, four general-domain KGs and eleven open-domain QA datasets commonly used in KGQA are described. Fourthly, three classic KGQA datasets are selected according to the difficulty of questions to compare and analyze the performance metrics of each KGQA system and the effectiveness of the above methods. Finally, this paper looks forward to future research directions on this topic.

    Reference | Related Articles | Metrics
    Abstract: 1335
    PDF: 1942
    Review of Pre-training Techniques for Natural Language Processing
    CHEN Deguang, MA Jinlin, MA Ziping, ZHOU Jie
    Journal of Frontiers of Computer Science and Technology    2021, 15 (8): 1359-1389.   DOI: 10.3778/j.issn.1673-9418.2012109

    In published reviews of natural language pre-training technology, most of the literature elaborates only on neural network pre-training technologies or gives only a brief introduction to traditional pre-training technologies, which may artificially dissect the development of natural language pre-training from natural language processing. Therefore, to avoid this, this paper covers the development of natural language pre-training in the following four respects. Firstly, traditional natural language pre-training technologies and neural network pre-training technologies are introduced along the evolution route of pre-training technology. After analyzing and comparing the characteristics of related technologies, this paper sums up the development context and trends of natural language processing technology. Secondly, taking improvements to BERT (bidirectional encoder representations from transformers) as a basis, this paper introduces the latest natural language processing models from two aspects and summarizes these models in terms of pre-training mechanism, advantages and disadvantages, performance and so on. The main application fields of natural language processing are also presented. Furthermore, this paper explores the challenges faced by natural language processing models and the corresponding solutions. Finally, this paper summarizes its work and prospects future development directions, which can help researchers understand the development of natural language pre-training technologies more comprehensively and provide ideas for designing new models and new pre-training methods.

    Reference | Related Articles | Metrics
    Abstract: 1314
    PDF: 2085
    Survey of Question Answering Based on Knowledge Graph Reasoning
    SA Rina, LI Yanling, LIN Min
    Journal of Frontiers of Computer Science and Technology    2022, 16 (8): 1727-1741.   DOI: 10.3778/j.issn.1673-9418.2111033

    Knowledge graph question answering (KGQA) obtains answers by analyzing and understanding natural language questions over a knowledge graph (KG). However, due to the complexity of natural language questions and the incompleteness of KGs, the accuracy of answers cannot be improved effectively. Knowledge graph reasoning technology can infer missing entities in the KG and implied relations between entities, so its application in KGQA can further improve the accuracy of answer prediction. In recent years, with the development of KGQA datasets and the flexible application of knowledge graph reasoning technology, the development of KGQA has been greatly promoted. In this paper, question answering based on knowledge graph reasoning is summarized from three aspects. Firstly, this paper gives a brief overview of question answering based on knowledge graph reasoning, and introduces its challenges and related datasets. Secondly, this paper introduces the application of knowledge graph reasoning in open-domain question answering, commonsense question answering and temporal knowledge question answering, and analyzes the advantages and disadvantages of each method. The open-domain question answering methods are further summarized as graph embedding methods, deep learning methods and logic methods. Finally, this paper summarizes the work and prospects future research in view of the current problems of question answering based on knowledge graph reasoning.

    Table and Figures | Reference | Related Articles | Metrics
    Abstract: 1241
    PDF: 747
    HTML: 387
    Survey of Research on Instance Segmentation Methods
    HUANG Tao, LI Hua, ZHOU Gui, LI Shaobo, WANG Yang
    Journal of Frontiers of Computer Science and Technology    2023, 17 (4): 810-825.   DOI: 10.3778/j.issn.1673-9418.2209051
    In recent years, with the continuous improvement of computing power, research on instance segmentation methods based on deep learning has made great breakthroughs. Image instance segmentation distinguishes different instances of the same class in images; it is an important research direction in the field of computer vision with broad research prospects, and has great practical application value in scene understanding, medical image analysis, machine vision, augmented reality, image compression, video surveillance, etc. Recently, instance segmentation methods have been updated more and more frequently, but there is little literature that comprehensively and systematically analyzes the research background of instance segmentation. This paper provides a comprehensive and systematic analysis and summary of image instance segmentation methods based on deep learning. Firstly, this paper introduces the commonly used public datasets and evaluation indexes in instance segmentation, and analyzes the challenges of current datasets. Secondly, this paper sorts out and summarizes instance segmentation algorithms according to the characteristics of two-stage and single-stage segmentation methods, elaborates their central ideas and design considerations, and summarizes the advantages and shortcomings of the two types of methods. Thirdly, this paper evaluates the segmentation accuracy and speed of the models on a public dataset. Finally, this paper summarizes the current difficulties and challenges of instance segmentation, presents ideas for addressing these challenges, and offers a prospect of future research directions.
    Reference | Related Articles | Metrics
    Abstract1236
    PDF1389
    Review of Knowledge Distillation in Convolutional Neural Network Compression
    MENG Xianfa, LIU Fang, LI Guang, HUANG Mengmeng
    Journal of Frontiers of Computer Science and Technology    2021, 15 (10): 1812-1829.   DOI: 10.3778/j.issn.1673-9418.2104022

    In recent years, the convolutional neural network (CNN) has made remarkable achievements in many image analysis applications thanks to its powerful ability of feature extraction and expression. However, the continuous improvement of CNN performance comes almost entirely from deeper and larger network models, so deploying a complete CNN often requires huge memory overhead and support from high-performance computing units (such as GPUs). This limits the wide application of CNNs on embedded devices with limited computing resources and on mobile terminals with strict real-time requirements, so CNNs urgently need to be made lightweight. At present, the main ways to address this problem are knowledge distillation, network pruning, parameter quantization, low-rank decomposition and lightweight network design. This paper first introduces the basic structure and development of convolutional neural networks, and briefly describes and compares five typical network compression methods. Then, knowledge distillation methods are reviewed and summarized in detail, and different methods are compared experimentally on the CIFAR dataset. Furthermore, the current evaluation system for knowledge distillation methods is introduced, and a comparative analysis and evaluation of many types of methods is given. Finally, preliminary thoughts on the future development of this technology are presented.
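    As a concrete example of the response-based distillation covered by such surveys, the following PyTorch sketch implements the classic soft-target loss (temperature-scaled KL divergence mixed with hard-label cross-entropy); the logits, labels and hyperparameters are toy values, and this is not claimed to be any specific surveyed method.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Classic soft-target distillation (Hinton et al.): KL divergence between
    temperature-softened teacher and student distributions, mixed with the
    usual cross-entropy on hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random logits for a batch of 8 samples and 10 classes.
student = torch.randn(8, 10)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student, teacher, labels).item())
```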

    Reference | Related Articles | Metrics
    Abstract1194
    PDF1799
    Research Progress of Lightweight Neural Network Convolution Design
    MA Jinlin, ZHANG Yu, MA Ziping, MAO Kaiji
    Journal of Frontiers of Computer Science and Technology    2022, 16 (3): 512-528.   DOI: 10.3778/j.issn.1673-9418.2107056

    Traditional neural networks rely heavily on hardware resources and place high demands on the performance of application equipment, so they cannot be deployed on edge devices and mobile terminals with limited computing power, which limits the application and development of artificial intelligence technology to a certain extent. Driven by user requirements, however, artificial intelligence urgently needs to run tasks such as computer vision applications on portable devices. For this reason, this paper takes the convolutions of popular lightweight neural networks in recent years as its research object. Firstly, by introducing the concept of lightweight neural networks, it presents their development status and the problems faced by convolution in these networks. Secondly, convolution is examined from three aspects: lightweight convolution structures, lightweight convolution modules and lightweight convolution operations; by studying the convolution designs in various lightweight neural network models, the lightweight effects of different convolutions are demonstrated and the advantages and disadvantages of the optimization methods are explained. Finally, the main ideas and usage of all the lightweight convolution designs covered in this paper are summarized and analyzed, and their possible future development is discussed.
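    A quick way to see why lightweight convolution designs help is to count parameters: the sketch below compares a standard 3×3 convolution with a depthwise separable one, a common building block in lightweight networks; the channel sizes are arbitrary examples.

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution (one filter per input channel) followed by
    a 1 x 1 pointwise convolution, the factorization popularized by MobileNet."""
    return k * k * c_in + c_in * c_out

c_in, c_out, k = 128, 256, 3
standard = conv_params(c_in, c_out, k)                   # 294,912
separable = depthwise_separable_params(c_in, c_out, k)   # 33,920
print(standard, separable, separable / standard)         # ratio ≈ 0.115, i.e. ~8.7x fewer
```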

    Table and Figures | Reference | Related Articles | Metrics
    Abstract1193
    PDF1280
    HTML380
    Survey of Face Synthesis
    FEI Jianwei, XIA Zhihua, YU Peipeng, DAI Yunshu
    Journal of Frontiers of Computer Science and Technology    2021, 15 (11): 2025-2047.   DOI: 10.3778/j.issn.1673-9418.2105059

    Face synthesis is one of the hot topics in computer vision because of its application and technical value, and breakthroughs in deep learning have attracted much attention to this field in recent years. This paper divides research in this field into four subcategories: face identity synthesis, face movement synthesis, face attribute synthesis and face generation, and systematically summarizes the development process, current status and existing problems of each. First, for face identity synthesis, three approaches are summarized, including computer graphics, digital image processing and deep learning; their routine pipelines are outlined and the technical principles of milestone work are analyzed in detail. Secondly, face movement synthesis is further divided into label-driven expression editing and real-face-driven face reenactment, and the shortcomings and problems in each area are pointed out. Then, the development of face attribute synthesis based on generative models, especially generative adversarial networks, is introduced. Finally, this paper briefly describes research on face generation. In addition, it introduces the practical applications and related problems of the face synthesis field and suggests possible research directions.

    Reference | Related Articles | Metrics
    Abstract1180
    PDF984
    Survey of Research on EEG Signal Emotion Recognition
    WANG Zhongmin, ZHAO Yupeng, ZHENG Ronglin, HE Yan, ZHANG Jiawen, LIU Yang
    Journal of Frontiers of Computer Science and Technology    2022, 16 (4): 760-774.   DOI: 10.3778/j.issn.1673-9418.2107006

    Emotion recognition refers to recognizing a person's emotional state from information such as facial expressions, behavioral actions or physiological signals, and its results have great application value in medical assistance, education, traffic safety and other areas. Due to the objective and reliable nature of EEG signals, emotion recognition using EEG signals has received much attention from researchers worldwide. A large amount of literature related to EEG emotion recognition is reviewed, analyzed and summarized here. Firstly, the theoretical knowledge of emotions, the definition of emotion recognition, emotion classification models, and the acquisition and preprocessing of EEG signals are explained and analyzed in detail, and a general framework of EEG emotion recognition is given. Secondly, extraction methods for the EEG features used in emotion recognition are reviewed from four aspects: time-domain features, frequency-domain features, time-frequency features and nonlinear features; the construction of brain functional networks and the extraction of brain network attributes are introduced, and the advantages and disadvantages of each type of feature and method are analyzed. Then, the characteristics, advantages, disadvantages and applicable scenarios of the classification algorithms commonly used in EEG emotion recognition are analyzed. Finally, the current difficulties and future directions of the field are summarized, which can help researchers systematically understand the current status of research on EEG-based emotion recognition and provide ideas for subsequent related research.
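    As an illustration of the frequency-domain features mentioned above, the sketch below computes average band power per canonical EEG band from a Welch power spectral density estimate; the band edges, sampling rate and toy signal are assumptions for demonstration only.

```python
import numpy as np
from scipy.signal import welch

# Canonical EEG bands (Hz); exact edges vary across studies.
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def band_power_features(eeg, fs=128):
    """Frequency-domain features for one EEG channel: average power spectral
    density within each band, estimated with Welch's method."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)
    feats = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        feats[name] = float(np.mean(psd[mask]))
    return feats

# Toy signal: 10 s of noise plus a 10 Hz (alpha-band) component.
fs = 128
t = np.arange(0, 10, 1 / fs)
signal = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
print(band_power_features(signal, fs))
```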

    Table and Figures | Reference | Related Articles | Metrics
    Abstract1176
    PDF923
    HTML293
    Overview of Facial Deepfake Video Detection Methods
    ZHANG Lu, LU Tianliang, DU Yanhui
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 1-26.   DOI: 10.3778/j.issn.1673-9418.2205035
    The illegal use of deepfake technology has a serious impact on social stability, personal reputation and even national security, so it is imperative to develop facial deepfake video detection technology, which has also become a research hotspot in computer vision in recent years. Current research builds on traditional face recognition and image classification techniques, constructing deep neural networks to determine whether a facial video is real or fake, but problems remain, such as low-quality datasets, the difficulty of combining multimodal features and poor model generalization. To further promote the development of deepfake video detection technology, this paper comprehensively summarizes current algorithms and classifies, analyzes and compares them. Firstly, it introduces the facial deepfake video detection datasets. Secondly, taking feature selection as the starting point, it summarizes the main detection methods of the past three years, classifies detection technologies from the perspectives of spatial features, spatial-temporal fusion features and biological features, and introduces new detection methods based on watermarking and blockchain. Then, it discusses new trends in facial deepfake video detection from the aspects of feature selection, transfer learning, model architecture and training ideas. Finally, the paper is summarized and future technical development is prospected.
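    As a minimal sketch of how spatial-feature methods typically reach a video-level verdict, the snippet below averages hypothetical per-frame "fake" probabilities into one decision; real detectors would obtain these scores from a trained frame classifier, and spatio-temporal methods would instead model the frame sequence.

```python
import numpy as np

def video_decision(frame_probs, threshold=0.5):
    """Aggregate per-frame 'fake' probabilities from a spatial-feature
    classifier into a single video-level score and decision by averaging."""
    frame_probs = np.asarray(frame_probs, dtype=float)
    video_score = frame_probs.mean()
    return video_score, video_score >= threshold

# Hypothetical scores for 8 sampled frames of one video.
scores = [0.81, 0.77, 0.65, 0.90, 0.72, 0.58, 0.83, 0.69]
print(video_decision(scores))  # (0.74..., True)
```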
    Reference | Related Articles | Metrics
    Abstract1175
    PDF1452
    Survey on 3D Reconstruction Methods Based on Visual Deep Learning
    LI Mingyang, CHEN Wei, WANG Shanshan, LI Jie, TIAN Zijian, ZHANG Fan
    Journal of Frontiers of Computer Science and Technology    2023, 17 (2): 279-302.   DOI: 10.3778/j.issn.1673-9418.2205054
    In recent years, 3D reconstruction, as one of the important tasks of computer vision, has received extensive attention. This paper focuses on recent progress in using deep learning to reconstruct the 3D shape of general objects. Following the steps of deep learning-based 3D reconstruction, methods are first grouped by the data representation used in the reconstruction process: voxels, point clouds, surface meshes and implicit surfaces. They are then divided by the number of input 2D images into single-view and multi-view 3D reconstruction, and further subdivided according to the network architecture and training mechanism they use. While discussing the research progress of each category, the paper analyzes the prospects, advantages and disadvantages of each training method. It also examines new hotspots in specific 3D reconstruction areas in recent years, such as 3D reconstruction of dynamic human bodies and 3D completion of incomplete geometric data, compares key papers and summarizes the open problems in these areas. The paper then introduces the key application scenarios and parameters of current 3D datasets. Finally, the development prospects of 3D reconstruction in specific application fields are illustrated and analyzed, and future research directions are discussed.
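    To make the voxel representation concrete, the sketch below converts a toy point cloud into a binary occupancy grid; the resolution and the assumption that coordinates lie in the unit cube are illustrative choices, not details from any surveyed method.

```python
import numpy as np

def voxelize(points, resolution=32):
    """Convert an (N, 3) point cloud into a binary occupancy grid, the voxel
    representation discussed above (coordinates assumed to lie in [0, 1])."""
    grid = np.zeros((resolution,) * 3, dtype=bool)
    idx = np.clip((points * resolution).astype(int), 0, resolution - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Toy point cloud: 1000 random points inside the unit cube.
pts = np.random.rand(1000, 3)
occupancy = voxelize(pts, resolution=16)
print(occupancy.shape, int(occupancy.sum()))  # (16, 16, 16), number of occupied voxels
```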
    Reference | Related Articles | Metrics
    Abstract1163
    PDF1206
    Improved YOLOv5 Traffic Light Real-Time Detection Robust Algorithm
    QIAN Wu, WANG Guozhong, LI Guoping
    Journal of Frontiers of Computer Science and Technology    2022, 16 (1): 231-241.   DOI: 10.3778/j.issn.1673-9418.2105033

    Traffic light detection, a critical step toward autonomous driving, is directly related to the driving safety of intelligent vehicles. However, the small size of traffic lights and complicated environments make the task difficult. This paper proposes a traffic light detection algorithm based on an improved YOLOv5. Firstly, a visible label ratio is used to determine the model input. Secondly, the ACBlock structure is introduced to increase the feature extraction ability of the backbone network, SoftPool is used to reduce sampling loss in the backbone network, and the DSConv convolution kernel is used to reduce model parameters. Finally, a memory feature fusion network is designed to efficiently exploit high-level semantic information and low-level features. The improvements to the model input and backbone network directly enhance feature extraction in complex environments, while the improved feature fusion network enables the model to make full use of feature information and increases the accuracy of target localization and boundary regression. Experimental results show that the proposed algorithm achieves 74.3% AP and a detection speed of 111 frames/s on BDD100K, 11.0 percentage points higher in AP than YOLOv5, and 84.4% AP with 126 frames/s on the Bosch dataset, 9.3 percentage points higher in AP than YOLOv5. Robustness tests show that the proposed algorithm significantly improves detection of targets in a variety of complex environments and achieves robust, high-precision real-time detection.
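    For readers unfamiliar with SoftPool, the sketch below shows the general idea behind it (softmax-weighted pooling) expressed with standard PyTorch pooling ops; it is an illustrative approximation of the technique the authors adopt, not their implementation.

```python
import torch
import torch.nn.functional as F

def soft_pool2d(x, kernel_size=2, stride=2):
    """SoftPool sketch: each activation in the pooling window is weighted by
    its softmax weight, so strong responses dominate but weaker ones still
    contribute (unlike max pooling, which discards them). Dividing the two
    average-pooled terms cancels the window-size factor."""
    e_x = torch.exp(x)
    weighted = F.avg_pool2d(x * e_x, kernel_size, stride)
    norm = F.avg_pool2d(e_x, kernel_size, stride)
    return weighted / norm

# Toy feature map: batch of 1, 3 channels, 8 x 8 spatial size.
feat = torch.randn(1, 3, 8, 8)
print(soft_pool2d(feat).shape)  # torch.Size([1, 3, 4, 4])
```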

    Table and Figures | Reference | Related Articles | Metrics
    Abstract1130
    PDF729
    HTML272
    Review of Chinese Named Entity Recognition Research
    WANG Yingjie, ZHANG Chengye, BAI Fengbo, WANG Zumin, JI Changqing
    Journal of Frontiers of Computer Science and Technology    2023, 17 (2): 324-341.   DOI: 10.3778/j.issn.1673-9418.2208028
    With the rapid development of natural language processing, named entity recognition, as an upstream task, is of great significance for subsequent text processing, and improving its accuracy matters for many downstream applications. However, because of the differences between Chinese and English, research results on English named entity recognition cannot be transferred to Chinese effectively. Therefore, the key issues in current research on Chinese named entity recognition are analyzed from four aspects. Firstly, taking the development of named entity recognition as the main thread, the advantages and disadvantages, common methods and research results of each stage are comprehensively discussed. Secondly, Chinese text preprocessing methods are summarized from the perspectives of sequence annotation, evaluation indexes, Chinese word segmentation methods and datasets. Then, for Chinese character and word feature fusion, current research is summarized from the perspectives of character-level and word-level fusion, and optimization directions for current Chinese named entity recognition models are discussed. Finally, practical applications of Chinese named entity recognition in various fields are analyzed. This paper aims to help researchers understand the research directions and significance of this task more comprehensively, so as to provide a reference for proposing new methods and improvements.
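    As a minimal sketch of the character-word fusion idea discussed above, the snippet below matches each character of a sentence against a toy lexicon, producing the kind of word evidence that lattice- and soft-lexicon-style models later fuse with character representations; the lexicon and sentence are invented examples.

```python
# Toy lexicon; a real system would use a large pretrained word vocabulary.
LEXICON = {"南京", "南京市", "长江", "长江大桥", "大桥", "市长"}

def match_words(sentence, lexicon, max_len=4):
    """For each character, collect the lexicon words that cover it."""
    matches = [[] for _ in sentence]
    for i in range(len(sentence)):
        for j in range(i + 1, min(i + max_len, len(sentence)) + 1):
            word = sentence[i:j]
            if word in lexicon:
                for k in range(i, j):  # every character covered by the word
                    matches[k].append(word)
    return matches

# Classic ambiguous example: "南京市长江大桥".
sentence = "南京市长江大桥"
for ch, words in zip(sentence, match_words(sentence, LEXICON)):
    print(ch, words)
```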
    Reference | Related Articles | Metrics
    Abstract1123
    PDF1184
    XR-MSF-Unet: Automatic Segmentation Model for COVID-19 Lung CT Images
    XIE Juanying, ZHANG Kaiyun
    Journal of Frontiers of Computer Science and Technology    2022, 16 (8): 1850-1864.   DOI: 10.3778/j.issn.1673-9418.2203023

    The COVID-19 epidemic poses a serious threat to humanity. Automatic and accurate segmentation of the infected areas in COVID-19 CT images can help doctors make correct diagnoses and timely treatment decisions. However, accurate segmentation is very challenging because the infection diffuses through the patient's lungs, the infected areas have irregular shapes, and they closely resemble other lung tissues. To tackle these challenges, this paper proposes the XR-MSF-Unet model for segmenting COVID-19 lung CT images. The XR (X ResNet) convolution module is proposed to replace the two-layer convolution operations of U-Net, so that its multiple branches extract more informative features and achieve better segmentation results. The plug-and-play attention module MSF (multi-scale features fusion module) is proposed in XR-MSF-Unet to fuse multi-scale features from receptive fields of different sizes, together with global, local and spatial features of the CT images, strengthening the model's segmentation of details. Extensive experiments on public COVID-19 CT images demonstrate that the proposed XR module strengthens the model's ability to extract effective features, and that the MSF module combined with the XR module effectively improves segmentation of the infected areas of COVID-19 lung CT images. The proposed XR-MSF-Unet achieves good segmentation results: it surpasses the original U-Net by 3.21, 5.96, 1.22 and 4.83 percentage points in Dice, IoU, F1-score and sensitivity respectively, and outperforms other models of the same type, realizing automatic segmentation of COVID-19 lung CT images.
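    Since the reported gains are measured in Dice and IoU, the sketch below shows how these two overlap metrics are typically computed for binary segmentation masks; the toy masks are invented and the code is independent of the authors' implementation.

```python
import numpy as np

def dice_and_iou(pred, target, eps=1e-7):
    """Dice coefficient and IoU for binary segmentation masks, the two main
    metrics reported above (pred and target are boolean arrays of equal shape)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, target).sum() + eps)
    return float(dice), float(iou)

# Toy 1D example: 6 predicted-positive pixels, 5 ground-truth pixels, 4 overlapping.
pred = np.array([1, 1, 1, 1, 1, 1, 0, 0], dtype=bool)
gt   = np.array([0, 0, 1, 1, 1, 1, 1, 0], dtype=bool)
print(dice_and_iou(pred, gt))  # ≈ (0.727, 0.571)
```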

    Table and Figures | Reference | Related Articles | Metrics
    Abstract1097
    PDF372
    HTML123