Most Downloaded Articles


    Published in last 1 year
    Review of Deep Learning Applied to Time Series Prediction
    LIANG Hongtao, LIU Shuo, DU Junwei, HU Qiang, YU Xu
    Journal of Frontiers of Computer Science and Technology    2023, 17 (6): 1285-1300.   DOI: 10.3778/j.issn.1673-9418.2211108
    A time series is generally a set of random variables observed and collected at a certain frequency over the course of something's development. The task of time series forecasting is to extract the core patterns from a large amount of data and to make accurate estimates of future data based on known factors. With the deployment of large numbers of IoT data-collection devices, the explosive growth of multidimensional data, and increasingly demanding accuracy requirements, it is difficult for classical parametric models and traditional machine learning algorithms to meet the efficiency and accuracy requirements of prediction tasks. In recent years, deep learning algorithms represented by convolutional neural networks, recurrent neural networks and Transformer models have achieved fruitful results in time series forecasting tasks. To further promote the development of time series prediction technology, the common characteristics of time series data and the evaluation indexes of datasets and models are reviewed, and the characteristics, advantages and limitations of each prediction algorithm are experimentally compared and analyzed, with time and algorithm architecture as the main research line. Several time series prediction methods based on the Transformer model are highlighted and compared. Finally, according to the problems and challenges of deep learning applied to time series prediction tasks, this paper provides an outlook on future research trends in this direction.
    Reference | Related Articles | Metrics
    Review on Named Entity Recognition
    LI Dongmei, LUO Sisi, ZHANG Xiaoping, XU Fu
    Journal of Frontiers of Computer Science and Technology    2022, 16 (9): 1954-1968.   DOI: 10.3778/j.issn.1673-9418.2112109

    In the field of natural language processing, named entity recognition is the first key step of information extraction. The named entity recognition task aims to recognize named entities in large amounts of unstructured text and classify them into predefined types. Named entity recognition provides basic support for many natural language processing tasks such as relationship extraction, text summarization and machine translation. This paper first introduces the definition of named entity recognition, the research difficulties, and the particularity of Chinese named entity recognition, and summarizes the common Chinese and English public datasets and evaluation criteria for named entity recognition tasks. Then, following the development history of named entity recognition, the existing methods are surveyed: early methods based on rules and dictionaries, methods based on statistics and machine learning, and methods based on deep learning. This paper summarizes the key ideas, advantages and disadvantages, and representative models of each kind of method, and summarizes the Chinese named entity recognition methods at each stage. In particular, the latest named entity recognition methods based on Transformer and on prompt learning, which represent the state of the art among deep learning-based methods, are reviewed. Finally, the challenges and future research trends of named entity recognition are discussed.

    Table and Figures | Reference | Related Articles | Metrics
    Survey of Few-Shot Object Detection
    LIU Chunlei, CHEN Tian'en, WANG Cong, JIANG Shuwen, CHEN Dong
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 53-73.   DOI: 10.3778/j.issn.1673-9418.2206020
    Object detection, a hot field in computer vision, usually requires a large number of labeled images for model training, which costs a lot of manpower and material resources. At the same time, due to the inherent long-tailed distribution of real-world data, the number of samples of most objects is relatively small, as with many uncommon diseases, and it is difficult to obtain a large number of labeled images. Few-shot object detection addresses this by requiring only a small amount of annotation information to detect objects of interest. This paper makes a detailed review of few-shot object detection methods. Firstly, the development of generic object detection and its existing problems are reviewed, the concept of few-shot object detection is introduced, and other tasks related to few-shot object detection are distinguished and explained. Then, the two classical paradigms for few-shot object detection, based on transfer learning and on meta-learning, are introduced. According to their improvement strategies, few-shot object detection methods are divided into four types: attention mechanism, graph convolutional neural network, metric learning and data augmentation. The public datasets and evaluation metrics used by these methods are explained. The advantages, disadvantages and applicable scenarios of different methods, and their performance on different datasets, are compared and analyzed. Finally, the practical application fields and future research trends of few-shot object detection are discussed.
    Reference | Related Articles | Metrics
    Survey on Cross-Chain Protocols of Blockchain
    MENG Bo, WANG Yibing, ZHAO Can, WANG Dejun, MA Binhao
    Journal of Frontiers of Computer Science and Technology    2022, 16 (10): 2177-2192.   DOI: 10.3778/j.issn.1673-9418.2203032

    With the development of blockchain technology, the differing system architectures and application scenarios of blockchain platforms make it difficult to realize the interconnection and intercommunication of data and assets across blockchains, which hinders the promotion and application of blockchain. Cross-chain technology is an important technical means of interconnecting blockchains and improving their interoperability and extensibility. A blockchain cross-chain protocol is the concrete design specification that realizes cross-chain interoperability between different blockchains through cross-chain technology, so it is of great significance for realizing blockchain interoperability and building cross-chain applications. This paper systematically organizes and analyzes the latest research on the design and implementation of blockchain cross-chain protocols at four levels. Firstly, the current research status of blockchain cross-chain interoperability is explained from three aspects: the Internet of blockchains, cross-chain technology and blockchain interoperability. Secondly, cross-chain protocols are divided into cross-chain communication protocols, cross-chain asset transaction protocols and cross-chain smart contract call protocols, and the latest research on each is analyzed. Thirdly, the key design principles of cross-chain protocols are summarized, and solutions to the security, privacy and scalability problems of cross-chain protocols are provided. Finally, combined with the actual needs of blockchain cross-chain applications, future research directions for blockchain cross-chain protocols are given.

    Table and Figures | Reference | Related Articles | Metrics
    Survey of Camouflaged Object Detection Based on Deep Learning
    SHI Caijuan, REN Bijuan, WANG Ziwen, YAN Jinwei, SHI Ze
    Journal of Frontiers of Computer Science and Technology    2022, 16 (12): 2734-2751.   DOI: 10.3778/j.issn.1673-9418.2206078

    Camouflaged object detection (COD) based on deep learning is an emerging visual detection task, which aims to detect camouflaged objects “perfectly” embedded in their surrounding environment. However, most existing work focuses on building different COD models, with little work summarizing the existing methods. Therefore, this paper summarizes the existing deep learning-based COD methods and discusses the future development of COD. Firstly, 23 existing deep learning-based COD models are introduced and analyzed according to five detection mechanisms: the coarse-to-fine strategy, the multi-task learning strategy, the confidence-aware learning strategy, the multi-source information fusion strategy and the Transformer-based strategy, and the advantages and disadvantages of each strategy are analyzed in depth. Then, 4 widely used datasets and 4 evaluation metrics for COD are introduced. In addition, the performance of existing deep learning-based COD models is compared on the four datasets, including quantitative comparison, visual comparison, efficiency analysis, and detection effects on camouflaged objects of different types. Furthermore, practical applications of COD in medicine, industry, agriculture, military, art, etc. are mentioned. Finally, the deficiencies and challenges of existing methods regarding complex scenes, multi-scale objects, real-time performance, practical application requirements, and COD in other modalities are pointed out, and potential directions of COD are discussed.

    Table and Figures | Reference | Related Articles | Metrics
    Review of Knowledge-Enhanced Pre-trained Language Models
    HAN Yi, QIAO Linbo, LI Dongsheng, LIAO Xiangke
    Journal of Frontiers of Computer Science and Technology    2022, 16 (7): 1439-1461.   DOI: 10.3778/j.issn.1673-9418.2108105

    Knowledge-enhanced pre-trained language models attempt to use the structured knowledge stored in knowledge graphs to strengthen pre-trained language models, so that they learn not only general semantic knowledge from free text but also the factual entity knowledge behind the text. In this way, the enhanced models can effectively solve downstream knowledge-driven tasks. Although this is a promising research direction, current work is still at an exploratory stage, and there is no comprehensive summary or systematic arrangement. This paper aims to address the lack of comprehensive reviews of this direction. To this end, on the basis of summarizing and sorting out a large number of relevant works, this paper first explains the background from three aspects, namely the reasons for, advantages of, and difficulties in introducing knowledge, and summarizes the basic concepts involved in knowledge-enhanced pre-trained language models. Then, it discusses three types of knowledge enhancement methods: using knowledge to expand input features, using knowledge to modify the model architecture, and using knowledge to constrain training tasks. Finally, it compiles the scores of various knowledge-enhanced pre-trained language models on several evaluation tasks, and analyzes their performance, current challenges, and possible future directions.

    Table and Figures | Reference | Related Articles | Metrics
    Survey of Deep Online Multi-object Tracking Algorithms
    LIU Wenqiang, QIU Hangping, LI Hang, YANG Li, LI Yang, MIAO Zhuang, LI Yi, ZHAO Xinxin
    Journal of Frontiers of Computer Science and Technology    2022, 16 (12): 2718-2733.   DOI: 10.3778/j.issn.1673-9418.2204041

    Video multi-object tracking is a key task in the field of computer vision and has wide application prospects in industrial, commercial and military fields. At present, the rapid development of deep learning provides many solutions to the multi-object tracking problem. However, challenging problems such as abrupt changes in target appearance, serious occlusion of target regions, and the disappearance and reappearance of targets have not been completely solved. This paper focuses on deep learning-based online multi-object tracking algorithms and summarizes the latest progress in this field. According to the three important modules of feature prediction, appearance feature extraction and data association, as well as the two frameworks of detection-based tracking (DBT) and joint detection and tracking (JDT), this paper divides deep online multi-object tracking algorithms into six sub-classes and discusses the principles, advantages and disadvantages of each type. Among them, the multi-stage design of DBT algorithms has a clear structure and is easy to optimize, but multi-stage training may lead to sub-optimal solutions; JDT algorithms, which integrate detection and tracking sub-modules, achieve faster inference, but face the problem of jointly training each module. Currently, multi-object tracking research is beginning to focus on long-term feature extraction of targets, occluded target processing, association strategy improvement, and end-to-end framework design. Finally, combined with existing algorithms, this paper summarizes the urgent problems to be solved in deep online multi-object tracking and looks forward to possible future research directions.

    Table and Figures | Reference | Related Articles | Metrics
    Survey of Research on Image Inpainting Methods
    LUO Haiyin, ZHENG Yuhui
    Journal of Frontiers of Computer Science and Technology    2022, 16 (10): 2193-2218.   DOI: 10.3778/j.issn.1673-9418.2204101

    Image inpainting refers to restoring the pixels in damaged areas of an image to make them as consistent as possible with the original image. It is not only crucial in computer vision tasks but also serves as an important cornerstone of other image processing tasks. However, there are few surveys of image inpainting research. In order to better study and promote research on image inpainting, the classic image inpainting algorithms and representative deep learning image inpainting methods of the past ten years are reviewed and analyzed. Firstly, the classical traditional image inpainting methods are briefly summarized and divided into partial differential equation-based and exemplar-based image inpainting methods, and the limitations of traditional methods are further analyzed. Deep learning image inpainting methods are divided into single-output image inpainting and pluralistic image inpainting according to the number of images the model outputs, and the different methods are analyzed and summarized with respect to their application images, loss functions, types, advantages and limitations. After that, the commonly used datasets and quantitative evaluation indicators for image inpainting are described in detail, and quantitative results of image inpainting methods on damaged regions of different sizes across different image datasets are given. Based on these quantitative data, the performance of deep learning-based image inpainting methods is compared and analyzed. Finally, the limitations of existing image inpainting methods are summarized and analyzed, and new ideas and prospects for future key research directions are proposed.

    Table and Figures | Reference | Related Articles | Metrics
    Review of Graph Neural Networks Applied to Knowledge Graph Reasoning
    SUN Shuifa, LI Xiaolong, LI Weisheng, LEI Dajiang, LI Sihui, YANG Liu, WU Yirong
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 27-52.   DOI: 10.3778/j.issn.1673-9418.2207060
    As an important part of knowledge graph construction, knowledge reasoning (KR) has always been a hot research topic. As research on knowledge graph applications deepens and its scope expands, graph neural network (GNN) based KR methods have received extensive attention due to their capability of capturing semantic information such as entities and relationships in knowledge graphs, their high interpretability, and their strong reasoning ability. In this paper, firstly, the basic knowledge and research status of knowledge graphs and KR are summarized, and the advantages and disadvantages of KR approaches based on logic rules, representation learning, neural networks and graph neural networks are briefly introduced. Secondly, the latest progress in GNN-based KR is comprehensively summarized. GNN-based KR methods are categorized into knowledge reasoning based on recurrent graph neural networks (RecGNN), convolutional graph neural networks (ConvGNN), graph auto-encoders (GAE) and spatial-temporal graph neural networks (STGNN), and various typical network models are introduced and compared. Thirdly, this paper introduces the applications of GNN-based KR in health care, intelligent manufacturing, military, transportation, etc. Finally, future research directions of GNN-based KR are proposed, and related research in this rapidly growing field is discussed.
    Reference | Related Articles | Metrics
    Survey on Applications of Knowledge Graph Embedding in Recommendation Tasks
    TIAN Xuan, CHEN Hangxue
    Journal of Frontiers of Computer Science and Technology    2022, 16 (8): 1681-1705.   DOI: 10.3778/j.issn.1673-9418.2112070

    Recommendation systems are designed to recommend personalized content to improve user experience. At present, recommendation systems still face challenges such as poor interpretability, the cold start problem and sequential recommendation modeling. Recently, knowledge graphs (KG), which contain a large amount of semantic and structural information, have been widely used in a variety of recommendation tasks to alleviate these problems. This paper systematically reviews the innovative applications of knowledge graph embedding (KGE) in different recommendation tasks. It first summarizes three common recommendation tasks and four application goals of knowledge graph embedding. Then, it distinguishes four types of knowledge graph embedding methods according to the specific technologies used: the traditional embedding method, the embedding propagation method, the heterogeneous graph embedding method and the graph neural network based method. It further elaborates on the application characteristics and strategies of these four methods in different recommendation tasks, and evaluates the advantages and limitations of each method. It also conducts qualitative and quantitative analysis of the associations and differences among the four methods from multiple aspects. Finally, it puts forward views on the development trend of applying knowledge graph embedding to recommendation systems, and proposes several noteworthy future directions from multiple perspectives.

    Table and Figures | Reference | Related Articles | Metrics
    Survey of Deep Learning Based Multimodal Emotion Recognition
    ZHAO Xiaoming, YANG Yijiao, ZHANG Shiqing
    Journal of Frontiers of Computer Science and Technology    2022, 16 (7): 1479-1503.   DOI: 10.3778/j.issn.1673-9418.2112081

    Multimodal emotion recognition aims to recognize human emotional states through different modalities related to human emotion expression, such as audio, vision and text. This topic is of great importance in the fields of human-computer interaction, artificial intelligence and affective computing, and has attracted much attention. In view of the great success of deep learning methods in various tasks in recent years, a variety of deep neural networks have been used to learn high-level emotional feature representations for multimodal emotion recognition. In order to systematically summarize the research progress of deep learning methods in the field of multimodal emotion recognition, this paper presents a comprehensive analysis and summary of recent deep learning-based multimodal emotion recognition literature. First, the general framework of multimodal emotion recognition is given, and the commonly used multimodal emotion datasets are introduced. Then, the principles of representative deep learning techniques and their recent advances are briefly reviewed. Subsequently, this paper focuses on the advances in two key steps of multimodal emotion recognition: emotional feature extraction methods for audio, vision, text, etc., including hand-crafted feature extraction and deep feature extraction; and multimodal information fusion strategies integrating different modalities. Finally, the challenges and opportunities in this field are analyzed, and future development directions are pointed out.

    Table and Figures | Reference | Related Articles | Metrics
    Survey on 3D Reconstruction Methods Based on Visual Deep Learning
    LI Mingyang, CHEN Wei, WANG Shanshan, LI Jie, TIAN Zijian, ZHANG Fan
    Journal of Frontiers of Computer Science and Technology    2023, 17 (2): 279-302.   DOI: 10.3778/j.issn.1673-9418.2205054
    In recent years, 3D reconstruction, as one of the important tasks of computer vision, has received extensive attention. This paper focuses on recent research progress in using deep learning to reconstruct the 3D shape of general objects. Following the steps of deep learning-based 3D reconstruction, methods are first divided according to the data representation used in the reconstruction process into voxel, point cloud, surface mesh and implicit surface methods. Then, according to the number of input 2D images, they are divided into single-view and multi-view 3D reconstruction, which are further subdivided according to the network architecture and training mechanism they use. While discussing the research progress of each category, the development prospects, advantages and disadvantages of each training method are analyzed. This paper examines new hotspots in specific 3D reconstruction fields in recent years, such as 3D reconstruction of dynamic human bodies and 3D completion of incomplete geometric data, compares some key papers and summarizes the open problems in these fields. It then introduces the key application scenarios and parameters of current 3D datasets. Finally, the development prospects of 3D reconstruction in specific application fields are illustrated and analyzed, and future research directions of 3D reconstruction are discussed.
    Reference | Related Articles | Metrics
    Overview of Facial Deepfake Video Detection Methods
    ZHANG Lu, LU Tianliang, DU Yanhui
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 1-26.   DOI: 10.3778/j.issn.1673-9418.2205035
    The illegal use of deepfake technology will have a serious impact on social stability, personal reputation and even national security. Therefore, it is imperative to develop research on facial deepfake video detection technology, which has also been a research hotspot in the field of computer vision in recent years. Current research builds on traditional face recognition and image classification technology, constructing deep neural networks to determine whether a facial video is real or not, but there are still problems such as the low quality of datasets, the difficulty of combining multimodal features and the poor generalization of models. In order to further promote the development of deepfake video detection technology, this paper carries out a comprehensive summary of current algorithms, and classifies, analyzes and compares them. Firstly, this paper introduces the facial deepfake video detection datasets. Secondly, taking feature selection as the starting point, it summarizes the main deepfake video detection methods of the past three years, classifies the various detection technologies from the perspectives of spatial features, spatial-temporal fusion features and biological features, and introduces some new detection methods based on watermarking and blockchain. Then, it introduces new trends in facial deepfake video detection from the aspects of feature selection, transfer learning, model architecture and training ideas. Finally, the paper is summarized and future technology development is prospected.
    Reference | Related Articles | Metrics
    Survey of Few-Shot Image Classification Research
    AN Shengbiao, GUO Yuqi, BAI Yu, WANG Tengbo
    Journal of Frontiers of Computer Science and Technology    2023, 17 (3): 511-532.   DOI: 10.3778/j.issn.1673-9418.2210035
    In recent years, artificial intelligence algorithms represented by deep learning have achieved success in many fields by relying on large-scale datasets and huge computing resources. Among them, image classification technology in the field of computer vision has developed vigorously, and many mature visual classification models have emerged. All these models need a large number of annotated samples for training. However, in actual scenarios, due to many restrictions, data are scarce and it is often difficult to obtain high-quality annotated samples of the required scale. Therefore, how to learn from a small number of samples has gradually become a research hotspot. Focusing on the classification task, this paper reviews current work related to few-shot image classification. Few-shot learning mainly adopts deep learning methods such as meta-learning, metric learning and data augmentation. This paper summarizes the research progress and typical technical models of few-shot image classification at the supervised, semi-supervised and unsupervised levels, as well as the performance of these models on several public datasets, and makes a comparative analysis in terms of mechanism, advantages, limitations, etc. Finally, the technical difficulties and future trends of few-shot image classification are discussed.
    Reference | Related Articles | Metrics
    Target Tracking System Constructed by ELM-AE and Transfer Representation Learning
    YANG Zheng, DENG Zhaohong, LUO Xiaoqing, GU Xin, WANG Shitong
    Journal of Frontiers of Computer Science and Technology    2022, 16 (7): 1633-1648.   DOI: 10.3778/j.issn.1673-9418.2012028

    In target tracking algorithms, the feature model's ability to quickly learn image features and to adapt to changes in target features during tracking has always been one of the main research directions. Especially for discriminative target trackers based on image-block learning, these two abilities have become decisive factors affecting the efficiency and robustness of the tracker. However, most existing algorithms of this kind cannot achieve satisfactory results on both abilities. To solve this problem, an efficient and robust feature model is proposed. The feature model first uses an extreme learning machine autoencoder (ELM-AE) to quickly perform random feature mapping on the complex image features of target and background image blocks, and then uses the transfer learning ability of transfer representation learning (TRL) to improve the adaptability of the random feature space. The feature model is named transfer representation learning with ELM-AE (TRL-ELM-AE). Compared with the original complex image features, this model provides the classifier with more compact and expressive shared features, so that the classifier can learn and classify more quickly and efficiently. In addition, during tracking, the target and background usually change continuously over time. Although the feature transfer capability of TRL can already adapt to this, a strategy of dynamically updating the training samples is adopted to further improve the robustness of the tracker. A large number of experiments and analyses on the 11 target tracking challenge scenarios proposed by OTB show that the proposed tracker has significant advantages over existing trackers.

    Table and Figures | Reference | Related Articles | Metrics
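    The ELM-AE random feature mapping described above can be sketched in a few lines: the hidden-layer weights are drawn at random and never trained, and only the output weights are solved in closed form by reconstructing the input. The following is a minimal illustrative sketch, not the paper's implementation; the function name, dimensions and regularization value are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_ae_embed(X, n_hidden=16, reg=1e-3):
    """Sketch of an ELM autoencoder (ELM-AE) feature mapping.

    Hidden weights W and bias b are random and fixed; only the output
    weights beta are obtained in closed form (regularized least squares
    reconstructing X from the hidden layer). beta then serves as a
    compact feature mapping: X -> X @ beta.T.
    """
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))  # random, never trained
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                           # random feature map
    # closed-form ridge solution: beta = (H^T H + reg I)^-1 H^T X
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return X @ beta.T                                # compact embedding

X = rng.standard_normal((100, 32))  # e.g. features of 100 image blocks
Z = elm_ae_embed(X, n_hidden=16)
print(Z.shape)  # (100, 16)
```

    Because the only "training" is one linear solve, the mapping can be recomputed cheaply whenever the tracker's training samples are updated, which is what makes this family of feature models attractive for online tracking.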
    Survey of Graph Neural Network in Recommendation System
    WU Jing, XIE Hui, JIANG Huowen
    Journal of Frontiers of Computer Science and Technology    2022, 16 (10): 2249-2263.   DOI: 10.3778/j.issn.1673-9418.2203004

    Recommendation systems (RS) were introduced to cope with the massive amount of available information. Due to the diversity, complexity and sparseness of data, traditional recommendation systems cannot solve current problems well. Graph neural networks (GNN) can extract and represent features from the edge and node data of graphs and have inherent advantages in processing graph-structured data, so they have flourished in recommendation systems. This paper sorts out the main references on graph neural networks in recommendation systems in recent years, focuses on the two perspectives of methods and problems, and systematically reviews graph neural networks in recommendation systems. Firstly, at the method level, five kinds of graph neural networks used in recommendation systems are elaborated: graph convolutional networks, graph attention networks, graph autoencoders, graph generative networks and graph spatial-temporal networks. Secondly, from the perspective of problem similarity, six major problem types are summarized: sequential recommendation, social recommendation, cross-domain recommendation, multi-behavior recommendation, bundle recommendation and session-based recommendation. Finally, based on the analysis and summary of existing methods, this paper points out the main difficulties in current research on graph neural networks in recommendation systems, proposes corresponding issues worth investigating, and looks forward to future research directions on this topic.

    Table and Figures | Reference | Related Articles | Metrics
    Knowledge Graph Link Prediction Based on Subgraph Reasoning
    YU Huilin, CHEN Wei, WANG Qi, GAO Jianwei, WAN Huaiyu
    Journal of Frontiers of Computer Science and Technology    2022, 16 (8): 1800-1808.   DOI: 10.3778/j.issn.1673-9418.2104084

    Relationship prediction in knowledge graphs aims to identify and infer new relationships from existing data, providing knowledge services for many downstream tasks. At present, much research solves the link prediction problem between entities by mapping entities and relations into a vector space or by searching paths between entities. These methods only consider the influence of a single path or first-order information and ignore the more complex relational information between entities. Therefore, this paper proposes a novel link prediction method based on subgraph reasoning in knowledge graphs, which uses the subgraph structure to obtain the neighborhood structure information of entity pairs and combines the advantages of representation learning and path reasoning to realize relationship prediction between entities. This paper first extends the paths between entities to subgraphs, constructing a node subgraph and a relationship subgraph at the entity level and the relationship level respectively, then combines graph embedding representations with a graph neural network to compute subgraph features, obtaining richer entity and relationship characteristics. Finally, the neighborhood structure information of entity pairs is computed from the subgraph structure to perform link prediction between entities. Experimental results demonstrate that the proposed approach outperforms other reasoning-based link prediction methods on two benchmark datasets.

    Table and Figures | Reference | Related Articles | Metrics
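    The step of extending paths between an entity pair to a subgraph can be illustrated with a toy extraction routine: the enclosing subgraph is the set of nodes reachable within k hops of both endpoints, and its structure (rather than any single path) is what a subgraph-based method scores. This plain-BFS sketch over an adjacency dict, with made-up node names, only illustrates that extraction step and is not the paper's node/relationship subgraph construction.

```python
from collections import deque

def k_hop(adj, start, k):
    """Nodes within k hops of `start` in a graph given as an adjacency dict."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == k:
            continue  # do not expand beyond k hops
        for nb in adj.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, d + 1))
    return seen

def enclosing_subgraph(adj, head, tail, k=2):
    """k-hop enclosing subgraph of an entity pair: nodes within k hops
    of BOTH endpoints."""
    return k_hop(adj, head, k) & k_hop(adj, tail, k)

# toy knowledge graph (hypothetical entities)
adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
       "D": ["B", "C", "E"], "E": ["D"]}
print(sorted(enclosing_subgraph(adj, "A", "D", k=2)))  # ['A', 'B', 'C', 'D']
```

    Note that node E is excluded: it is near D but not near A, so it carries no information about the pair (A, D). A GNN would then be run on the extracted subgraph to produce the features used for link scoring.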
    Multi-scale Selection Pyramid Networks for Small-Sample Target Detection Algorithms
    PENG Hao, LI Xiaoming
    Journal of Frontiers of Computer Science and Technology    2022, 16 (7): 1649-1660.   DOI: 10.3778/j.issn.1673-9418.2109081

    Target detection is the task of detecting specified targets in an image. The technology has been widely used in autonomous driving, face recognition and other fields, and has become a major research hotspot in computer vision at home and abroad. Traditional target detection often requires large annotated datasets, so detecting targets with only a small number of annotated samples is a challenge. To address this problem, this paper proposes a multi-scale selection pyramid network algorithm for small-sample target detection, so that detection no longer relies on large-scale labeled datasets. Firstly, this paper designs a multi-scale selection pyramid network for small-sample target detection, which consists of three components: a context-layer attention module, a feature scale enhancement module, and a feature scale selection module. Secondly, this paper fuses the RoI features generated by the RPN using maximum pooling and average pooling to improve the correlation between features, and uses feature subtraction to highlight category information in the features, which improves sensitivity to new-class parameters while maintaining the stability of the model with respect to the sample parameters. Finally, an orthogonal mapping loss function is used to constrain the features before the classification layer, which measures the similarity between features well even with a small number of samples.
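The abstract only names the pooling fusion step; as a minimal sketch of combining maximum and average pooling per channel (the summation as the fusion operator is my assumption, and the plain nested-list layout is purely illustrative):

```python
def fuse_pooling(roi):
    """Fuse max pooling and average pooling per channel by summation.
    `roi` is a list of 2-D channel maps (plain nested lists)."""
    fused = []
    for ch in roi:
        flat = [v for row in ch for v in row]
        fused.append(max(flat) + sum(flat) / len(flat))
    return fused
```

Max pooling keeps the strongest response while average pooling keeps context, so their combination preserves both kinds of evidence for the classifier.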

    Survey of Sign Language Recognition and Translation
    YAN Siyi, XUE Wanli, YUAN Tiantian
    Journal of Frontiers of Computer Science and Technology    2022, 16 (11): 2415-2429.   DOI: 10.3778/j.issn.1673-9418.2205003

    Different from spoken languages, sign language is mainly composed of continuous gestures. Sign language recognition and translation are important means of facilitating barrier-free communication between the hearing-impaired and the hearing. Sign language recognition and translation is a typical interdisciplinary research task that processes and analyzes sign language videos and displays the recognition results in text form. In recent years, sign language recognition and translation research based on deep learning has made great progress. To help researchers systematically and comprehensively understand these research tasks, this review is carried out from the perspectives of sign language recognition and sign language translation. Firstly, the recognition and translation research work is classified and summarized and its characteristics are analyzed. Secondly, the common sign language recognition and translation datasets of different countries are summarized and classified from the perspectives of isolated sign language words and continuous sign language sentences, and, based on the differences among research tasks, the corresponding evaluation index systems are introduced. Finally, the major challenges of current research on sign language recognition and translation are summarized with respect to the effective extraction of sign language visual features, multi-cue weight assignment, the relationship between sign language and natural language grammar, and sign language dataset resources.

    Survey on 3D Human Pose Estimation of Deep Learning
    WANG Shichen, HUANG Kai, CHEN Zhigang, ZHANG Wendong
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 74-87.   DOI: 10.3778/j.issn.1673-9418.2205070
    The purpose of 3D human pose estimation is to predict information such as the 3D coordinates and angles of human joints, and to construct human representations (such as human skeletons) for further analysis of human posture. With the continuous advancement of deep learning, more and more high-performance 3D human pose estimation methods based on deep learning have been proposed. However, owing to occlusion of the human body in images and the large amount of training data required, 3D human pose estimation remains challenging. This paper reviews a number of research papers from recent years, analyzes and compares the reasoning processes and core elements of these methods, and comprehensively describes recent deep-learning-based 3D human pose estimation methods. In addition, this paper introduces the relevant datasets and evaluation indicators, compares the experimental data of some models on the Human3.6M, Campus and Shelf datasets, and analyzes and compares the results. Finally, according to the results of this survey, the difficulties and challenges faced by current 3D human pose estimation are discussed, and its future development is prospected.
    Parallel Architecture Design for OpenVX Kernel Image Processing Functions
    PAN Fengrui, LI Tao, XING Lidong, ZHANG Haocong, WU Guanzhong
    Journal of Frontiers of Computer Science and Technology    2022, 16 (7): 1570-1582.   DOI: 10.3778/j.issn.1673-9418.2012085

    Although traditional programmable processors are highly flexible, their processing speed and performance are inferior to those of application-specific integrated circuits (ASIC). Image processing is often a diverse, intensive and repetitive workload, so a processor must balance speed, performance and flexibility. OpenVX is an open standard for preprocessing or auxiliary processing in image processing, graph computing and deep learning applications. Targeting the kernel vision function library of the OpenVX 1.3 standard, this paper designs and implements a programmable and extensible OpenVX parallel processor. The architecture adopts an application-specific instruction-set processor (ASIP). After analyzing and comparing the topological characteristics of various interconnection networks, the hierarchically cross-connected Mesh+ (HCCM+), with its outstanding performance, is chosen as the backbone of the ASIP, and processing elements (PE) are placed at the network nodes. The PE array supports dynamic configuration, and a parallel processor is designed to realize programmable image processing based on efficient routing and communication. The proposed architecture is suitable for data-parallel computing and emerging graph computing; the two computing modes can be configured separately or mixed. The kernel vision functions and a graph computing model are mapped to the parallel processor to verify the two modes and to compare image processing speed under different PE counts. The results show that the OpenVX parallel processor can complete the mapping, with linear speedup, of kernel functions and a high-complexity graph computing model. The average speedup of scheduling 16 PEs across the various functions is approximately 15.0375. When implemented on an FPGA board with a 20 nm XCVU440 device, the prototype can run at a frequency of 125 MHz.

    Survey of Question Answering Based on Knowledge Graph Reasoning
    SA Rina, LI Yanling, LIN Min
    Journal of Frontiers of Computer Science and Technology    2022, 16 (8): 1727-1741.   DOI: 10.3778/j.issn.1673-9418.2111033

    Knowledge graph question answering (KGQA) obtains answers by analyzing and understanding questions over a knowledge graph (KG). However, due to the complexity of natural language questions and the incompleteness of KGs, the accuracy of answers cannot be improved effectively. Knowledge graph reasoning technology can infer missing entities in a KG and the implied relations between entities; therefore, its application in KGQA can further improve the accuracy of answer prediction. In recent years, with the development of KGQA datasets and the flexible application of knowledge graph reasoning technology, KGQA has been greatly promoted. In this paper, question answering based on knowledge graph reasoning is summarized from three aspects. Firstly, this paper gives a brief overview of question answering based on knowledge graph reasoning and introduces its challenges and related datasets. Secondly, it introduces the application of knowledge graph reasoning in open-domain question answering, commonsense question answering and temporal knowledge question answering, and analyzes the advantages and disadvantages of each method; the open-domain question answering methods are further categorized into graph embedding methods, deep learning methods and logic methods. Finally, this paper summarizes the work and, in view of the current problems of question answering based on knowledge graph reasoning, looks forward to future research.

    Review of Image Super-resolution Reconstruction Algorithms Based on Deep Learning
    YANG Caidong, LI Chengyang, LI Zhongbo, XIE Yongqiang, SUN Fangwei, QI Jin
    Journal of Frontiers of Computer Science and Technology    2022, 16 (9): 1990-2010.   DOI: 10.3778/j.issn.1673-9418.2202063

    The essence of image super-resolution reconstruction technology is to break through the limitations of hardware and reconstruct a high-resolution image from a low-resolution image that contains less information by means of image super-resolution reconstruction algorithms. With the development of deep learning, it has been introduced into the field of image super-resolution reconstruction. This paper surveys deep-learning-based image super-resolution reconstruction algorithms and classifies, analyzes and compares the typical ones. Firstly, the model framework, upsampling method, nonlinear mapping learning module and loss function of single-image super-resolution reconstruction methods are introduced in detail. Secondly, reference-based super-resolution reconstruction methods are analyzed from two aspects: pixel alignment and patch matching. Then, the benchmark datasets and image quality evaluation indices used for image super-resolution reconstruction algorithms are summarized, and the characteristics and performance of typical super-resolution reconstruction algorithms are compared and analyzed. Finally, future research trends in deep-learning-based image super-resolution reconstruction algorithms are prospected.

    Survey on Video Object Tracking Algorithms
    LIU Yi, LI Mengmeng, ZHENG Qibin, QIN Wei, REN Xiaoguang
    Journal of Frontiers of Computer Science and Technology    2022, 16 (7): 1504-1515.   DOI: 10.3778/j.issn.1673-9418.2111105

    Video object tracking is an important research topic in computer vision, mainly studying the tracking of objects of interest in video streams or image sequences. It has been widely used in camera surveillance, autonomous driving, precision guidance and other fields, so a comprehensive review of video object tracking algorithms is of great significance. Firstly, according to their sources, the challenges faced by video object tracking are classified into two aspects, object factors and background factors, and summarized respectively. Secondly, typical video object tracking algorithms of recent years are classified into correlation filtering algorithms and deep learning algorithms. The correlation filtering algorithms are further classified into three categories: kernelized correlation filtering algorithms, scale-adaptive correlation filtering algorithms and multi-feature fusion correlation filtering algorithms. The deep learning algorithms are classified into two categories: video object tracking algorithms based on Siamese networks and those based on convolutional neural networks. This paper analyzes the various algorithms in terms of research motivation, algorithmic ideas, advantages and disadvantages. Then, the widely used datasets and evaluation indicators are introduced. Finally, this paper sums up the research and looks forward to the future development trends of video object tracking.

    Small Object Detection Algorithm Based on Weighted Network
    CHEN Haoran, PENG Li, LI Wentao, DAI Feifei
    Journal of Frontiers of Computer Science and Technology    2022, 16 (9): 2143-2150.   DOI: 10.3778/j.issn.1673-9418.2101040

    When observing a picture, people instinctively pay more attention to its eye-catching objects. Such objects usually occupy a larger proportion of the picture, which leads to small targets being ignored. The area where a small target is located is often a weak detection area: few features can be extracted by the detector, and they are easily lost during feature information transmission, so the effect of small target detection is poor. Therefore, on the basis of a single-stage detector, this paper adds a cross-channel interaction mechanism to ensure the integrity of information between layers, adopts target enhancement of training samples and designs a general loss function. In addition, this paper improves sample weighting on the basis of the loss function to predict the weight of samples. The mAP of the proposed framework UWN (unified weighted network) on the VOC public dataset is 81.2%, and the mAP on a self-made small-target aerial photography dataset is 82.3%. Compared with the FSSD algorithm, some speed is sacrificed, but the accuracy is greatly improved.
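The paper's exact weighting scheme is not given in this abstract; as a generic sketch of loss re-weighting (the binary cross-entropy base loss and weighted averaging are my assumptions), scaling each sample's loss by a per-sample weight looks like:

```python
import math

def weighted_bce(preds, labels, weights):
    """Per-sample weighted binary cross-entropy: each sample's loss is
    scaled by its weight, then normalized by the weight sum."""
    eps = 1e-7
    total = 0.0
    for p, y, w in zip(preds, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)
```

Giving larger weights to hard (e.g. small-target) samples steers gradient updates toward the weak detection regions the abstract describes.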

    Review of Chinese Named Entity Recognition Research
    WANG Yingjie, ZHANG Chengye, BAI Fengbo, WANG Zumin, JI Changqing
    Journal of Frontiers of Computer Science and Technology    2023, 17 (2): 324-341.   DOI: 10.3778/j.issn.1673-9418.2208028
    With the rapid development of related technologies in natural language processing, and with named entity recognition serving as an upstream task, improving its accuracy is of great significance for subsequent text processing tasks. However, due to the differences between Chinese and English, it is difficult to transfer the research results of English named entity recognition to Chinese effectively. Therefore, the key issues in current research on Chinese named entity recognition are analyzed from the following four aspects. Firstly, taking the development of named entity recognition as the main thread, the advantages and disadvantages, common methods and research results of each stage are comprehensively discussed. Secondly, Chinese text preprocessing methods are summarized from the perspectives of sequence annotation, evaluation indices, Chinese word segmentation methods and datasets. Then, with regard to the fusion of Chinese character and word features, current research is summarized from the perspectives of character fusion and word fusion, and the optimization directions of current Chinese named entity recognition models are discussed. Finally, the practical applications of Chinese named entity recognition in various fields are analyzed. This paper discusses current research on Chinese named entity recognition, aiming to help researchers understand the research directions and significance of this task more comprehensively, so as to provide a reference for proposing new methods and improvements.
    Application Research of Improved U-shaped Network in Detection of Retinopathy
    YANG Zhiqiao, ZHANG Ying, WANG Xinjie, ZHANG Dongbo, WANG Yu
    Journal of Frontiers of Computer Science and Technology    2022, 16 (8): 1877-1884.   DOI: 10.3778/j.issn.1673-9418.2012011

    Fundus retinal blood vessel analysis and the detection of exudates and bleeding points are important methods for judging the degree of diabetic retinopathy. Aiming at problems such as the poor segmentation of microvessel bifurcations and end points, unclear exudate boundaries, and the difficulty of segmenting small and scattered bleeding points, an improved U-shaped network is proposed: the context extraction coding module is improved to extract richer high-level features, and a hybrid attention mechanism (HAM) is added in the feature encoding stage to highlight the features of microvessels and lesions and reduce the impact of background and noise. Experimental results show that the segmentation accuracy, sensitivity, specificity and AUC of the proposed algorithm on the fundus retinal blood vessel segmentation dataset DRIVE are better than those of U-Net, CE-Net and other existing methods; the sensitivity is increased by 0.0146 compared with the CE-Net network. On the diabetic retinopathy lesion segmentation dataset DIARETDB1, the segmentation of exudates and bleeding points is better than that of U-Net, CE-Net and other existing methods, which can effectively assist doctors in diagnosis.
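The internals of the paper's hybrid attention mechanism are not specified in this abstract; as a hedged sketch of the channel-attention half of such a module (squeeze-and-excitation style, but omitting the learned excitation layers a real module would have), channel rescaling can be illustrated as:

```python
import math

def channel_attention(feature_maps):
    """Channel-attention sketch: global-average-pool each channel,
    gate the average with a sigmoid, and rescale the channel by it."""
    weights = []
    for ch in feature_maps:
        flat = [v for row in ch for v in row]
        s = sum(flat) / len(flat)                    # squeeze
        weights.append(1.0 / (1.0 + math.exp(-s)))   # sigmoid gate
    return [[[w * v for v in row] for row in ch]
            for ch, w in zip(feature_maps, weights)]
```

A spatial-attention branch would analogously reweight pixel positions; combining both is what "hybrid" typically refers to.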

    COVID-19 Detection Algorithm Combining Grad-CAM and Convolutional Neural Network
    ZHU Bingyu, LIU Zhen, ZHANG Jingxiang
    Journal of Frontiers of Computer Science and Technology    2022, 16 (9): 2108-2120.   DOI: 10.3778/j.issn.1673-9418.2105117

    In the detection of COVID-19, chest X-ray (CXR) images and CT scans are the two main imaging modalities, and they provide an important basis for doctors' diagnoses. Currently, convolutional neural networks (CNN) for detecting COVID-19 in medical radiological images suffer from low accuracy, algorithmic complexity, and an inability to mark feature regions. To solve these problems, this paper proposes an algorithm combining Grad-CAM color visualization and a convolutional neural network (GCCV-CNN). The algorithm can quickly classify lung X-ray and CT images of COVID-19-positive patients, COVID-19-negative patients, general pneumonia patients and healthy people, while quickly locating the critical regions in the images. Finally, the algorithm obtains more accurate detection results through the synthesis of deep learning algorithms. To verify the effectiveness of GCCV-CNN, experiments are performed on three COVID-19-positive patient datasets and it is compared with existing algorithms. The results show that its classification performance is better than that of the COVID-Net and DeTraC-Net algorithms: GCCV-CNN achieves a high accuracy of 98.06% and is faster and more robust.
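Independently of the paper's pipeline, the standard Grad-CAM computation it builds on can be sketched as follows (operating on plain nested lists for illustration; a real implementation would use a framework's autograd to obtain the gradients):

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap: each channel's weight is the spatial average
    of its gradient; the map is the ReLU of the weighted sum of the
    activation channels."""
    h, w = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for act, grad in zip(activations, gradients):
        alpha = sum(sum(row) for row in grad) / (h * w)  # channel weight
        for i in range(h):
            for j in range(w):
                cam[i][j] += alpha * act[i][j]
    # ReLU keeps only regions that positively support the class
    return [[max(0.0, v) for v in row] for row in cam]
```

Upsampling the resulting map to the input resolution and overlaying it as color is what produces the marked critical regions described above.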

    Multi-head Self-attention Neural Network for Detecting EEG Epilepsy
    TONG Hang, YANG Yan, JIANG Yongquan
    Journal of Frontiers of Computer Science and Technology    2023, 17 (2): 442-452.   DOI: 10.3778/j.issn.1673-9418.2104089
    Epilepsy is a life-threatening and challenging nervous system disease, and its detection based on electroencephalography (EEG) still faces many challenges: the EEG signal is non-stationary, and different patients show different seizure patterns. In addition, manual EEG review is time-consuming and laborious, which not only places a heavy burden on medical staff but also easily leads to false detection. Therefore, it is necessary to study an efficient automatic epilepsy detection technique that works across multiple patients. In this paper, an epileptic EEG detection method based on a multi-head self-attention neural network (convolutional attention bidirectional long short-term memory network, CABLNet) is proposed. Firstly, a convolutional layer captures the short-term temporal patterns of the EEG time series and local dependencies among channels. Secondly, multi-head self-attention captures the long-distance dependencies and temporal dynamic correlations of the short-term temporal feature vectors. Thirdly, the contextual representation is fed into a bidirectional long short-term memory network (BiLSTM) to extract information in the forward and backward directions. Finally, a log-softmax function is used for training and classification. On the CHB-MIT scalp EEG database, the sensitivity, specificity, accuracy and F1-score are 96.18%, 97.04%, 96.61% and 96.59% respectively. The results show that the proposed method is superior to existing methods and significantly improves epilepsy detection performance, which is of great significance for the auxiliary diagnosis of epilepsy.
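CABLNet itself is not reproduced here; the scaled dot-product attention at the core of its multi-head mechanism (shown single-headed, without the learned projection matrices, on small Python matrices) can be sketched as:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention: each query attends to all keys,
    and the output is the attention-weighted sum of the values."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                          for k in K])
        out.append([sum(a * v[j] for a, v in zip(scores, V))
                    for j in range(len(V[0]))])
    return out
```

Multi-head attention runs several such maps in parallel over learned projections and concatenates the results, which is what lets the model attend to different temporal patterns at once.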
    Anonymous Communication and Darknet Space Comprehensive Governance
    LAN Haoliang, LI Fujuan, WANG Qun, YIN Jie, XU Jie, HONG Lei, XUE Yishi, XIA Minghui
    Journal of Frontiers of Computer Science and Technology    2022, 16 (11): 2430-2455.   DOI: 10.3778/j.issn.1673-9418.2204004

    The characteristics of anonymous communication, such as the difficulty of node discovery, service positioning, communication relationship confirmation and user monitoring, make the darknet built on it rife with illegal and criminal activities that abuse anonymity. To this end, the academic community has carried out a series of targeted studies on anonymous communication and the darknet. Accordingly, after systematically introducing the development history, anonymity mechanisms and typical systems of anonymous communication, this paper combs, summarizes and synthesizes related research in this field, covering the key technologies of anonymous communication, anonymity measurement, anonymity attacks, anonymity enhancement, the evaluation and improvement of anonymous communication performance, and the comprehensive governance of darknet space. Meanwhile, this paper analyzes the future development trends of anonymous communication research and the challenges and countermeasures faced by comprehensive darknet space governance.
