Content of Artificial Intelligence·Pattern Recognition in our journal

    Temporal Multimodal Sentiment Analysis with Composite Cross Modal Interaction Network
    YANG Li, ZHONG Junhong, ZHANG Yun, SONG Xinyu
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1318-1327.   DOI: 10.3778/j.issn.1673-9418.2311004
    To address the insufficient modal fusion and weak interactivity caused by semantic feature differences between modalities in multimodal sentiment analysis, a temporal multimodal sentiment analysis model based on a composite cross-modal interaction network (CCIN-SA) is constructed by studying and analyzing the potential correlations between different modalities. The model first uses a bidirectional gated recurrent unit and a multi-head attention mechanism to extract temporal features of the text, visual, and speech modalities with contextual semantic information. Then, a cross-modal attention interaction layer is designed to continuously strengthen the target modality using low-order signals from auxiliary modalities, enabling the target modality to learn information from the auxiliary modalities and capture potential adaptability between modalities. The enhanced features are then input into the composite feature fusion layer, which further captures the similarity between different modalities through condition vectors, enhances the correlation of important features, and mines deeper interactivity between modalities. Finally, using a multi-head attention mechanism, the composite cross-modal enhanced features are concatenated and fused with the low-order signals to increase the weight of important features within each modality and preserve the unique feature information of the initial modality, and the final sentiment classification task is performed on the resulting multimodal fused features. The model is evaluated on the CMU-MOSI and CMU-MOSEI datasets, and the results show that it improves accuracy and F1 compared with existing models. The CCIN-SA model can thus effectively explore the correlation between different modalities and make more accurate sentiment judgments.
    Sentiment Analysis Combining Dynamic Gradient and Multi-view Co-attention
    WANG Xiang, MAO Li, CHEN Qidong, SUN Jun
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1328-1338.   DOI: 10.3778/j.issn.1673-9418.2301042
    To address unbalanced inter-modal optimization and inadequate fusion of multimodal features in multimodal sentiment analysis, a multimodal sentiment analysis model combining a dynamic gradient mechanism and a multi-view co-attention mechanism (DG-MCM) is proposed, which can effectively mine single-modal representations and fully integrate multimodal information. Firstly, the model uses the pre-trained model BERT (bidirectional encoder representation from transformers) and stacked long short-term memory (SLSTM) to learn the features of text, audio and video, and proposes a dynamic gradient mechanism that assists the feature learning of each modality by monitoring the contribution difference and learning speed of each modality. Secondly, the features obtained for the different modalities are fused using the multi-view co-attention mechanism: every pair of modalities is projected into multiple spaces for interaction, yielding more adequate fusion features. Finally, the fusion features and single-modal features are concatenated for sentiment prediction. Experimental results on the CMU-MOSI and CMU-MOSEI datasets show that this model can fully learn both intra-modal and inter-modal information, and effectively improves the accuracy of multimodal sentiment analysis.
    Emotional Intensity Response Generation Model
    MA Zhiqiang, ZHOU Yutong, JIA Wenchao, XU Biqi, WANG Chunyu
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1339-1347.   DOI: 10.3778/j.issn.1673-9418.2301045
    Emotional dialogue generation models do not consider the emotional intensity factor in response generation, which leads to the inappropriateness of the emotional expression in generated response, and reduces the user interaction experience. Inspired by the work of emotional intensity in emotional psychology, this paper proposes an emotional intensity response generation model (EIRGM), which includes an emotional intensity prediction unit, a context encoding module and an emotional intensity response generation unit. An emotional intensity prediction unit provides emotion categories and emotional intensity for reply sentences; a context encoding module provides content basis for response; an emotional intensity response generation unit is used to express the emotion and intensity in response. Based on the NLPCC2018 open-domain dialogue dataset, experiments are carried out in terms of emotional appropriateness, emotional intensity appropriateness, content relevance, and dialogue persistence. Experimental results show that EIRGM is not much different from the optimal model in terms of emotional appropriateness, and EIRGM is improved by 4.1 percentage points and 0.8 percentage points compared with the optimal model in terms of emotional intensity appropriateness and dialogue persistence, respectively. It shows that the model improves the emotional intensity appropriateness of emotional expression, and improves the user’s willingness to interact.
    Knowledge Graph Completion Algorithm with Multi-view Contrastive Learning
    QIAO Zifeng, QIN Hongchao, HU Jingjing, LI Ronghua, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 1001-1009.   DOI: 10.3778/j.issn.1673-9418.2301038
    Knowledge graph completion is the process of inferring new triples from the existing entities and relations in a knowledge graph. Existing methods usually use an encoder-decoder framework. The encoder uses a graph convolutional neural network to obtain the embeddings of entities and relations. The decoder calculates the score of each tail entity from these embeddings, and the tail entity with the highest score is the inference result. The decoder infers triples independently, without considering graph information. Therefore, this paper proposes a graph completion algorithm based on contrastive learning. A multi-view contrastive learning framework is added to the model to constrain the embedded information at the graph level. The comparison of multiple views in the model constructs different distribution spaces for relations. The different distributions of relations fit each other, which is more suitable for completion tasks. Contrastive learning constrains the embedding vectors of entities and subgraphs and enhances performance on the task. Experiments are carried out on two datasets. The results show that MRR is improved by 12.6% over A2N and 0.8% over InteractE on the FB15k-237 dataset, and by 7.3% over A2N and 4.3% over InteractE on the WN18RR dataset. The experimental results demonstrate that this model outperforms other completion methods.
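    As a rough illustration of the decoding and evaluation steps mentioned in this abstract, the sketch below picks the highest-scoring tail entity and computes mean reciprocal rank (MRR). The scores, entity names and function names are hypothetical placeholders, not taken from the paper.

```python
def rank_of_true_tail(scores, true_tail):
    """1-based rank of the true tail when candidates are sorted by descending score."""
    true_score = scores[true_tail]
    # Every candidate that scores strictly higher pushes the true tail down one place.
    return 1 + sum(1 for s in scores.values() if s > true_score)


def mean_reciprocal_rank(queries):
    """queries: list of (scores dict, true tail entity) pairs."""
    return sum(1.0 / rank_of_true_tail(s, t) for s, t in queries) / len(queries)


def predict_tail(scores):
    """The decoder's inference result: the candidate tail with the highest score."""
    return max(scores, key=scores.get)


# Hypothetical decoder scores for two (head, relation, ?) queries.
queries = [
    ({"Paris": 0.9, "Berlin": 0.4, "Rome": 0.1}, "Paris"),  # true tail ranked 1st
    ({"Paris": 0.2, "Berlin": 0.7, "Rome": 0.5}, "Rome"),   # true tail ranked 2nd
]
mrr = mean_reciprocal_rank(queries)
```

    With these toy scores the reciprocal ranks are 1 and 1/2, so the MRR is their average.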
    Research on Sentiment Analysis of Short Video Network Public Opinion by Integrating BERT Multi-level Features
    HAN Kun, PAN Hongpeng, LIU Zhongyi
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 1010-1020.   DOI: 10.3778/j.issn.1673-9418.2311023
    The era of self-media and the widespread popularity of online social software have made short video platforms easy “incubators” for the origin and fermentation of public opinion events. Analyzing the public opinion comments on these platforms is crucial for the early warning, handling, and guidance of such incidents. In view of this, this paper proposes a text classification model combining BERT and TextCNN, named BERT-MLFF-TextCNN, which integrates multi-level features from BERT for sentiment analysis of relevant comment data on the Douyin short video platform. Firstly, the BERT pre-trained model is used to encode the input text. Secondly, semantic feature vectors from each encoding layer are extracted and fused. Subsequently, a self-attention mechanism is integrated to highlight key features, thereby utilizing them effectively. Finally, the resulting feature sequence is input into the TextCNN model for classification. The results demonstrate that the BERT-MLFF-TextCNN model outperforms the BERT-TextCNN, GloVe-TextCNN, and Word2vec-TextCNN models, achieving an F1 score of 0.977. This model effectively identifies the emotional tendencies in public opinion on short video platforms. On this basis, topic mining with the TextRank algorithm allows the thematic words related to the sentiment polarity of public opinion comments to be visualized, providing a decision-making reference for relevant departments in public opinion management.
    Self-supervised Hybrid Graph Neural Network for Session-Based Recommendation
    ZHANG Yusong, XIA Hongbin, LIU Yuan
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 1021-1031.   DOI: 10.3778/j.issn.1673-9418.2212043
    Session-based recommendation aims to predict user actions based on anonymous sessions. Most existing session recommendation algorithms based on graph neural networks (GNN) only extract user preferences from the current session and ignore the high-order multivariate relationships from other sessions, which affects recommendation accuracy. Moreover, session-based recommendation suffers more from data sparsity because short-term interactions are very limited. To solve these problems, this paper proposes a self-supervised hybrid graph neural network (SHGN) model for session-based recommendation. Firstly, the model describes the relationship between sessions and items by constructing three views from the original data. Next, a graph attention network is used to capture the low-order transition information of items within a session, and a residual graph convolutional network is proposed to mine the high-order transition information of items and sessions. Finally, self-supervised learning (SSL) is integrated as an auxiliary task: by maximizing the mutual information of session embeddings learnt from different views, data augmentation is performed to improve recommendation performance. To verify the effectiveness of the proposed method, comparative experiments with mainstream baseline models such as SR-GNN, GCE-GNN and DHCN are carried out on four benchmark datasets, Tmall, Diginetica, Nowplaying and Yoochoose, and the results show improvements in P@20, MRR@20 and other performance indices.
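    One common way to maximize mutual information between two views of the same session is an InfoNCE-style contrastive loss. The abstract does not state which estimator SHGN uses, so the sketch below is only an assumed stand-in; the embeddings, the temperature value and the function names are illustrative.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def info_nce(view_a, view_b, temperature=0.5):
    """Average InfoNCE loss; view_a[i] and view_b[i] embed the same session.

    Each session in view_a treats its counterpart in view_b as the positive
    and every other session in view_b as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [dot(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


# Toy session embeddings from two hypothetical views.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

    The loss is lower when the two views agree on which embeddings belong to the same session, which is the behaviour an auxiliary self-supervised objective of this kind relies on.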
    Policy Search Reinforcement Learning Method in Latent Space
    ZHAO Tingting, WANG Ying, SUN Wei, CHEN Yarui, WANG Yuan, YANG Jucheng
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 1032-1046.   DOI: 10.3778/j.issn.1673-9418.2211106
    Policy search is an efficient learning method in deep reinforcement learning (DRL) that can solve large-scale problems with continuous state and action spaces and is widely used in real-world problems. However, such methods usually require a large number of trajectory samples and extensive training time, and may generalize poorly, making it difficult to transfer the learned policy model to seemingly small changes in the environment. To solve these problems, this paper proposes a policy search DRL method based on latent space. Specifically, this paper extends the idea of state representation learning to action representation learning, i.e. learning a policy in the latent space of action representations and then mapping the action representations to the real action space. With the introduction of representation learning models, this paper abandons the traditional end-to-end training manner in DRL and divides the whole task into two stages: large-scale representation model learning and small-scale policy model learning, where unsupervised learning methods are employed to learn the representation models and policy search methods are used to learn the small-scale policy model. The large-scale representation models ensure the capacity for generalization and expressiveness, while the small-scale policy model reduces the burden of policy learning, thus alleviating to some extent the low sample utilization, low learning efficiency and weak generalization of action selection in DRL. Finally, the effectiveness of introducing the latent state and action representations is demonstrated on the intelligent control tasks CarRacing and Cheetah.
    Potential Relationship Based Joint Entity and Relation Extraction
    PENG Yanfei, ZHANG Ruisi, WANG Ruihua, GUO Jialong
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 1047-1056.   DOI: 10.3778/j.issn.1673-9418.2301061
    The role of joint entity and relation extraction is to identify entities and their corresponding relations from specific texts, and it is also the basis for constructing and updating knowledge graph. Currently, joint extraction methods ignore information redundancy in the extraction process while pursuing performance. To address this issue, a model based on latent relations for joint entity and relation extraction is proposed. This paper designs a new decoding method to reduce the redundant information of relationships, entities and triples in the prediction process, and it is divided into two steps: extracting potential entity pairs and decoding relationships to complete the extraction of triples. Firstly, the potential entity pair extractor is used to predict whether there is potential relationship between entities, and at the same time, the entity pairs with high confidence are selected as the final potential entity pairs. Secondly, the relational decoding is regarded as a multi-label binary classification task, and the confidence of all relationships between each potential entity pair is predicted by the relational decoder. Finally, the number and type of relationships are determined by confidence to complete the task of extracting triples. Experimental results on two general datasets show that the proposed model is better than the baseline models in terms of accuracy and F1 indicators, which verifies the effectiveness of the proposed model. The ablation experiment also proves the effectiveness of the internal parts of the model.
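    The two-step decoding described in this abstract can be pictured with a small sketch: first keep entity pairs whose "potential relation" confidence is high, then treat every relation type as an independent yes/no decision for each remaining pair. The thresholds, confidence values and names below are assumptions made for illustration only.

```python
def decode_triples(pair_scores, relation_scores, pair_threshold=0.5, rel_threshold=0.5):
    """Turn confidences into (head, relation, tail) triples.

    pair_scores:     {(head, tail): confidence that some relation holds}
    relation_scores: {(head, tail): {relation: confidence}}
    """
    triples = []
    for pair, pair_conf in pair_scores.items():
        if pair_conf < pair_threshold:
            continue  # Step 1: discard entity pairs unlikely to be related.
        for relation, rel_conf in relation_scores.get(pair, {}).items():
            # Step 2: multi-label binary classification, one decision per relation.
            if rel_conf >= rel_threshold:
                triples.append((pair[0], relation, pair[1]))
    return triples


# Hypothetical model outputs for one sentence.
pair_scores = {("Marie Curie", "Warsaw"): 0.92, ("Warsaw", "physics"): 0.08}
relation_scores = {
    ("Marie Curie", "Warsaw"): {"born_in": 0.88, "died_in": 0.12, "lived_in": 0.61},
    ("Warsaw", "physics"): {"born_in": 0.70},
}
triples = decode_triples(pair_scores, relation_scores)
```

    Because low-confidence pairs are dropped before relation decoding, their relation scores are never consulted, which is one way the redundancy mentioned in the abstract could be reduced.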
    Multi-feature Interaction for Aspect Sentiment Triplet Extraction
    CHEN Linying, LIU Jianhua, ZHENG Zhixiong, LIN Jie, XU Ge, SUN Shuihua
    Journal of Frontiers of Computer Science and Technology    2024, 18 (4): 1057-1067.   DOI: 10.3778/j.issn.1673-9418.2302077
    Aspect sentiment triplet extraction is one of the subtasks of aspect-level sentiment analysis, which aims to extract the aspect terms, corresponding opinion terms and sentiment polarity in a sentence. Previous studies focus on designing a new paradigm to complete the triplet extraction task in an end-to-end manner. They ignore the role of external knowledge in the model, so semantic information, part-of-speech information and local context information are not fully explored and utilized. To address these problems, multi-feature interaction for aspect sentiment triplet extraction (MFI-ASTE) is proposed. Firstly, the bidirectional encoder representation from transformers (BERT) model is used to learn the context semantic feature information, while the self-attention mechanism is used to strengthen the semantic feature. Secondly, the semantic feature interacts with the extracted part-of-speech feature, and both learn from each other to strengthen the combination of part-of-speech and semantic information. Thirdly, multiple convolutional neural networks are used to extract multiple local context features of each word, and a multi-point gate mechanism is then used to filter these features. Fourthly, the three features of external knowledge are fused by two linear layers. Finally, biaffine attention is used to predict the grid tagging, and specific decoding schemes are used to decode triplets. Experimental results show that the proposed model improves the F1 score by 6.83%, 5.60%, 0.54% and 1.22% respectively on four datasets compared with existing mainstream models.
    Low-Resource Machine Translation Based on Training Strategy with Changing Gradient Weight
    WANG Jiaqi, ZHU Junguo, YU Zhengtao
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 731-739.   DOI: 10.3778/j.issn.1673-9418.2211078
    In recent years, neural network models such as Transformer have achieved significant success in machine translation. However, training these models relies on rich labeled data, posing a challenge for low-resource machine translation due to the limited scale of parallel corpora. This limitation often leads to subpar performance and a susceptibility to overfitting on high-frequency vocabulary, thereby reducing the model’s generalization ability on the test set. To alleviate these issues, this paper proposes a strategy of gradient weight modification. Specifically, it suggests multiplying the gradients generated for each new batch by a coefficient on top of the Adam algorithm. This coefficient incrementally increases, aiming to weaken the model’s dependence on high-frequency features during early training while maintaining the rapid convergence advantage of the algorithm in the later stages. This paper also outlines the modified training process, including adjustments and decay of coefficients, to emphasize different aspects at different training stages. The goal of this strategy is to enhance attention to low-frequency vocabulary and prevent the model from overfitting to high-frequency terms. Experimental translation tasks are conducted on three low-resource bilingual datasets, and the proposed method demonstrates improvements of 0.72, 1.37, and 1.04 BLEU scores relative to the baseline model on the respective test set.
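    To make the idea of multiplying each batch gradient by a growing coefficient before the Adam update more concrete, here is a minimal scalar sketch. The linear ramp, its start value and length, and the toy objective are all assumptions; the abstract does not give the paper's actual schedule or decay rule.

```python
import math


def gradient_coefficient(step, ramp_steps=100, start=0.1):
    """Assumed schedule: ramp linearly from `start` to 1.0 over `ramp_steps` updates."""
    progress = min(step / ramp_steps, 1.0)
    return start + (1.0 - start) * progress


def adam_with_scaled_gradients(grad_fn, theta, steps=500, lr=0.05,
                               beta1=0.9, beta2=0.999, eps=1e-8):
    """Standard Adam on a scalar parameter, with each fresh gradient scaled first."""
    m = v = 0.0
    for t in range(1, steps + 1):
        # Early gradients are down-weighted, so they contribute less to the moments.
        g = gradient_coefficient(t) * grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Toy objective (theta - 3)^2 with gradient 2 * (theta - 3).
theta = adam_with_scaled_gradients(lambda x: 2.0 * (x - 3.0), theta=0.0)
```

    Once the coefficient reaches 1.0 the update is ordinary Adam, which matches the stated aim of keeping the algorithm's fast convergence in later training stages.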
    Time Series Anomaly Detection Model with Dual Attention Mechanism
    YANG Chaocheng, YAN Xuanhui, CHEN Rongjun, LI Hanzhang
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 740-754.   DOI: 10.3778/j.issn.1673-9418.2304005
    As an important part of time series research, time series anomaly detection has attracted extensive attention and research in academia and industry. In view of the deep local features and complex dependency in time series data, an anomaly detection model with dual attention mechanism is proposed. The model adopts autoencoder structure. The encoder is composed of a squeeze excitation attention block (SEAB) and a probsparse self-attention block (PSAB). SEAB mines local features containing important information by assigning greater weights to sequence segments with strong discriminability using dynamic weighted window partitioning. PSAB adopts sparse self-attention mechanism to retain dot products with higher weights, eliminate redundant timing features, and reduce time complexity, so as to capture the long-term dependence of time series. Experimental results show that the proposed model achieves the highest F1 score of 0.97 among 9 comparison models and outperforms all other comparison models in 8 of 14 tested datasets in terms of F1 score, which can effectively identify abnormal situation in time series data and achieve advanced anomaly detection performance.
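    The phrase "retain dot products with higher weights" can be illustrated with the query sparsity measure commonly associated with ProbSparse self-attention: for each query, the maximum scaled dot product minus the mean, with only the top-u queries kept for full attention. Whether PSAB uses exactly this measure is an assumption, and the sketch scores every key rather than a sampled subset.

```python
import math


def sparsity_scores(queries, keys):
    """Per-query measure: max scaled dot product minus mean scaled dot product."""
    d = len(queries[0])
    scores = []
    for q in queries:
        dots = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        scores.append(max(dots) - sum(dots) / len(dots))
    return scores


def top_u_queries(queries, keys, u):
    """Indices of the u queries whose attention is least uniform (most informative)."""
    scores = sparsity_scores(queries, keys)
    return sorted(range(len(queries)), key=lambda i: scores[i], reverse=True)[:u]


# Toy time steps: the all-zero query attends uniformly, so it is dropped first.
queries = [[0.0, 0.0], [4.0, 0.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
selected = top_u_queries(queries, keys, u=2)
```

    Computing attention only for the selected queries is what reduces the cost compared with full self-attention over every time step.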
    Geographically Insensitive Spatial-Temporal POI Recommendation Based on Heterogeneous Graph Embedding
    LI Manwen, ZHANG Yueqin, ZHANG Chenwei, ZHANG Zehua
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 755-767.   DOI: 10.3778/j.issn.1673-9418.2211098
    The growing scale of location-based social networks (LBSN) has promoted the rapid development of point-of-interest (POI) recommendation. The POI geospatial distance adopted directly by traditional methods struggles to model the highly random behavior paths of users, and it makes the recommendation process sensitive to the location distance measurement. Meanwhile, the sparsity of users' POI check-in data in social networks can also greatly affect recommendation accuracy. To solve these issues, a geographically insensitive spatio-temporal POI recommendation model based on heterogeneous graph embedding (GIPR) is proposed. Firstly, the user behavior sequence is introduced to construct the spatial and temporal topological graph of the behavior POIs. The weighted spatial path is used to represent the relative location distance, which not only conforms to the characteristics of user behavior but also reduces the sensitivity of the recommendation process to the distance between POIs, thus enhancing the ability to explain the recommendation results. For heterogeneous and highly sparse interaction data, the proposed method learns the complete LBSN heterogeneous graph from local and global perspectives and integrates richer user and POI features. Finally, the long-term and short-term preferences of users are extracted through the attention layer to achieve more personalized POI recommendation. Experiments on two large-scale real datasets, Foursquare and Gowalla, show that GIPR has higher recommendation accuracy and stronger interpretability.
    Session Recommendation Algorithm Combining Item Transition Relations and Time-Order Information
    WU Wenzheng, LU Xianling
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 768-779.   DOI: 10.3778/j.issn.1673-9418.2211090
    Existing graph neural network session recommendation algorithms ignore various kinds of auxiliary information and therefore cannot model the session sequence accurately. To address this, a session recommendation algorithm combining item transition relations and time-order information (RTSR) is proposed. Firstly, the shortest path sequence between any two nodes is obtained from the graph structure and encoded by a gated recurrent unit (GRU) as the item transition relation between the corresponding items, and the global dependency information of the session is then captured from the perspective of the graph by combining the self-attention mechanism. At the same time, a lossless graph coding scheme is designed to alleviate information loss in the process of session graph coding. The scheme quantifies the time-order information in the session sequence, takes it as the weight of the edges in the session graph, and then combines the gated graph sequence neural network to obtain the local dependency information of the session. Finally, the global and local dependency information are linearly combined and, together with reverse position information, used to generate the user's preference for items and produce the recommendation list. Performance comparison experiments with mainstream models such as SR-GNN, GC-SAN and GCE-GNN on the public benchmark datasets Gowalla and Diginetica show that RTSR improves mean reciprocal rank by at least 6.13% and 1.58% respectively, and the recommendation accuracy is improved accordingly.
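    The first step of this abstract, obtaining the shortest path sequence between two items in the session graph, can be sketched with a breadth-first search. The sketch assumes a directed, unweighted graph built from consecutive clicks; the GRU encoding of the path and the time-order edge weights described in the abstract are not shown, and the function names are illustrative.

```python
from collections import deque


def build_session_graph(session):
    """Directed graph with an edge from each clicked item to the next one."""
    graph = {item: [] for item in session}
    for src, dst in zip(session, session[1:]):
        if dst not in graph[src]:
            graph[src].append(dst)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the item sequence or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to the start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


session = ["a", "b", "c", "b", "d"]
graph = build_session_graph(session)
path = shortest_path(graph, "a", "d")
```

    In this toy session the repeated visit to "b" creates a shortcut, so the shortest path from "a" to "d" skips "c" even though the raw click sequence passes through it.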
    Named Entity Recognition Model Based on k-best Viterbi Decoupling Knowledge Distillation
    ZHAO Honglei, TANG Huanling, ZHANG Yu, SUN Xueyuan, LU Mingyu
    Journal of Frontiers of Computer Science and Technology    2024, 18 (3): 780-794.   DOI: 10.3778/j.issn.1673-9418.2211052
    Knowledge distillation is a general approach to improving the performance of named entity recognition (NER) models. However, the classical knowledge distillation loss functions are coupled, which leads to poor logit distillation. To decouple them and effectively improve the performance of logit distillation, this paper proposes k-best Viterbi decoupling knowledge distillation (kvDKD), which incorporates k-best Viterbi decoding to improve computational efficiency and effectively improves model performance. Additionally, deep-learning-based NER easily introduces noise during data augmentation. Therefore, a data augmentation method combining data filtering and an entity rebalancing algorithm is proposed, aiming to reduce the noise introduced by the original dataset and to alleviate the problem of mislabeled data, which improves data quality and reduces overfitting. Based on these methods, a novel named entity recognition model, NER-kvDKD (named entity recognition model based on k-best Viterbi decoupling knowledge distillation), is proposed. Comparative experimental results on the MSRA, Resume, Weibo, CLUENER and CoNLL-2003 datasets show that the proposed method improves the generalization ability of the model and also effectively improves the performance of the student model.
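    For readers unfamiliar with k-best Viterbi decoding, the sketch below returns the k highest-scoring tag sequences of a linear-chain model with additive (log-space) emission and transition scores. It covers only the decoding step; how kvDKD feeds the k best paths into the distillation loss is not described in the abstract and is not modelled here. The score tables are made-up values.

```python
import heapq


def k_best_viterbi(emissions, transitions, k):
    """Return the k best (score, tag path) pairs, best first.

    emissions[t][y]   : score of tag y at position t
    transitions[p][y] : score of moving from tag p to tag y
    """
    n_tags = len(emissions[0])
    # beams[y] holds up to k best partial paths that end in tag y.
    beams = [[(emissions[0][y], [y])] for y in range(n_tags)]
    for t in range(1, len(emissions)):
        new_beams = []
        for y in range(n_tags):
            candidates = [
                (score + transitions[prev][y] + emissions[t][y], path + [y])
                for prev in range(n_tags)
                for score, path in beams[prev]
            ]
            new_beams.append(heapq.nlargest(k, candidates, key=lambda c: c[0]))
        beams = new_beams
    finals = [candidate for beam in beams for candidate in beam]
    return heapq.nlargest(k, finals, key=lambda c: c[0])


# Two tags, three tokens; values are illustrative log-scores.
emissions = [[1.0, 0.5], [0.2, 0.9], [0.7, 0.1]]
transitions = [[0.2, -0.1], [0.0, 0.3]]
best = k_best_viterbi(emissions, transitions, k=3)
```

    Keeping k candidates per tag at every position is sufficient because any prefix of a globally top-k path must itself be among the k best prefixes ending in the same tag.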
    Class Incremental Learning Method Integrating Balance Weight and Self-supervision
    GONG Jiayi, XU Xinlei, XIAO Ting, WANG Zhe
    Journal of Frontiers of Computer Science and Technology    2024, 18 (2): 477-485.   DOI: 10.3778/j.issn.1673-9418.2212055
    In view of the catastrophic forgetting phenomenon of knowledge in class incremental learning in image classification, the existing class incremental learning methods focus on the correction of the unbalanced offset of the model classification layer, ignoring the offset of the model feature layer, and fail to solve the problem of the imbalance between the new and old samples faced by class incremental learning. Therefore, a new class incremental learning method is proposed, which is called balance weight and self-supervision (BWSS). BWSS designs an adaptive balance weight based on the low expectation of the old class in training, so as to expand the loss return proportion of the old class in the same data batch to correct the overall model offset. Then, BWSS introduces self-supervised learning to predict the rotation angle of the sample as an auxiliary task, so as to make the model have the expression ability of redundant features and common features to better support incremental tasks. Through the experimental comparison with the mainstream incremental class learning algorithms on the open datasets CIFAR-10 and CIFAR-100, it is proven that BWSS not only has better incremental performance on CIFAR-10 with fewer categories and more samples, but also has advantages on CIFAR-100 with more categories and fewer samples. Ablation experiments and feature visualization demonstrate that the proposed method is effective for the feature representation and incremental performance of the model. The final accuracy of BWSS’s 5-stage incremental task on CIFAR-10 reaches 76.9%, which is 5 percentage points higher than the baseline method.
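    The rotation-prediction auxiliary task mentioned in this abstract is usually set up by rotating each sample and asking the model to classify the angle. The sketch below builds such training pairs for a tiny image stored as nested lists; the choice of four angles (0, 90, 180, 270 degrees) is the common convention and an assumption here, since the abstract only says the rotation angle is predicted.

```python
def rotate90(image):
    """Rotate a 2-D list of pixel rows by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_task(image):
    """Return [(rotated image, label)], where label = number of clockwise quarter turns."""
    samples = []
    current = [list(row) for row in image]  # copy so the input is left untouched
    for label in range(4):
        samples.append((current, label))
        current = rotate90(current)
    return samples


image = [[1, 2],
         [3, 4]]
samples = rotation_task(image)
```

    The labels come for free from the transformation itself, which is what makes the task self-supervised: no class annotation of the old or new categories is needed to train on it.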
    Knowledge Graph Link Prediction Fusing Description and Structural Features
    CHEN Jiaxing, HU Zhiwei, LI Ru, HAN Xiaoqi, LU Jiang, YAN Zhichao
    Journal of Frontiers of Computer Science and Technology    2024, 18 (2): 486-495.   DOI: 10.3778/j.issn.1673-9418.2211011
    Knowledge graphs are generally incomplete, which makes link prediction an important research topic. Existing models only focus on the embedding representation of triples. On the one hand, in terms of model input, the embedding representations of entities and relations are only randomly initialized and the description information of entities and relations is not incorporated, so semantic information is lacking; on the other hand, in decoding, the influence of the structural features of the triple itself on the link prediction results is ignored. To address these problems, this paper proposes a knowledge graph link prediction model, BFGAT (graph attention network link prediction based on fusion of description information and structural features), that integrates description information and structural features. The BFGAT model uses the BERT pre-trained model to encode the description information of entities and relations and integrates it into their embedding representations to solve the problem of missing semantic information. In the encoding process, a graph attention mechanism is used to aggregate the information of adjacent nodes so that the target node can obtain more information. In the decoding process, the embedding representations of a triple are concatenated into a matrix, and a method based on CNN convolution and pooling is used to capture the structural features of the triple. Detailed experiments on the public datasets FB15k-237 and WN18RR show that the BFGAT model can effectively improve knowledge graph link prediction.
    Ensemble Feature Selection Method with Fast Transfer Model
    NING Baobin, WANG Shitong
    Journal of Frontiers of Computer Science and Technology    2024, 18 (2): 496-505.   DOI: 10.3778/j.issn.1673-9418.2211073
    Compared with traditional ensemble feature selection methods, the recently developed ensemble feature selection with block-regularized m×2 cross-validation (EFSBCV) not only has an estimator variance smaller than that of random m×2 cross-validation, but also increases the selection probability of important features and reduces the selection probability of noise features. However, the linear regression model adopted in EFSBCV has no bias term and may easily underfit. Moreover, EFSBCV does not consider the importance of each feature subset. To address these two problems, an ensemble feature selection method called EFSFT (ensemble feature selection method using fast transfer model) is proposed in this paper. The basic idea is that the base feature selector in EFSBCV adopts the fast transfer model proposed in this paper, thereby introducing the bias term. EFSFT transfers 2m feature subsets as the source knowledge and then recalculates the weight of each feature subset, and the linear model fits better with the bias term added. Results on real datasets show that, compared with EFSBCV, the average FP value of EFSFT is reduced by up to 58%, proving that EFSFT is better at removing noise features. Compared with the least-squares support vector machine (LSSVM), the average TP value of EFSFT increases by up to 5%, which clearly indicates the superiority of EFSFT over LSSVM in choosing important features.
    Named Entity Recognition Based on Multi-scale Attention
    TANG Ruixue, QIN Yongbin, CHEN Yanping
    Journal of Frontiers of Computer Science and Technology    2024, 18 (2): 506-515.   DOI: 10.3778/j.issn.1673-9418.2210078
    Accurate named entity recognition (NER) benefits many downstream tasks in natural language processing. Because text contains a large number of nested semantics, named entities are difficult to recognize, and recognizing nested semantics is itself a difficult problem in natural language processing. Previous studies extract features at a single scale and under-utilize boundary information. They ignore many details at different scales, which leads to incorrect or missed entities. To address these problems, a multi-scale attention method for named entity recognition (MSA-NER) is proposed. Firstly, the BERT model is used to obtain representation vectors containing context information, and then a BiLSTM network is used to strengthen the context representation of the text. Secondly, the representation vectors are enumerated and concatenated to form a span information matrix, and direction information is fused in to obtain richer interactive information. Thirdly, multi-head attention is used to construct multiple subspaces. Two-dimensional convolution is used to optionally aggregate text information at different scales in each subspace, so as to implement multi-scale feature fusion in each attention layer. Finally, the fused matrix is used for span classification to identify named entities. Experimental results show that the F1 score of the proposed method reaches 81.7% and 86.8% on the GENIA and ACE2005 English datasets, respectively. The proposed method demonstrates better recognition performance than existing mainstream models.
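The step of enumerating and concatenating representation vectors into a span matrix can be sketched as below. This is a minimal sketch under assumptions: token vectors are plain Python lists standing in for BERT/BiLSTM outputs, a span is represented by its start vector joined to its end vector, and `enumerate_spans` and `max_width` are hypothetical names. The direction fusion, attention and convolution stages are not reproduced.

```python
def enumerate_spans(token_vectors, max_width=None):
    """Map every span (start, end), start <= end, to [start vector ; end vector]."""
    n = len(token_vectors)
    spans = {}
    for start in range(n):
        # Optionally cap the span width to keep the matrix small.
        last = n if max_width is None else min(n, start + max_width)
        for end in range(start, last):
            # List concatenation stands in for vector concatenation.
            spans[(start, end)] = token_vectors[start] + token_vectors[end]
    return spans


# Three tokens with 2-dimensional toy vectors.
vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
span_matrix = enumerate_spans(vectors)
print(len(span_matrix))     # 6 spans for 3 tokens
print(span_matrix[(0, 2)])  # [0.1, 0.2, 0.5, 0.6]
```

Because every (start, end) pair gets its own cell, nested entities such as a short span inside a longer one can be classified independently.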
    Knowledge Graph-Based Video Classification Algorithm for Film and Television Drama
    JIANG Hongxun, ZHANG Lin, SUN Caihong
    Journal of Frontiers of Computer Science and Technology    2024, 18 (1): 161-174.   DOI: 10.3778/j.issn.1673-9418.2209078
    Based on the diversity of video perception modalities, a complete video tagging hierarchy classification algorithm combines visual and textual modalities to train a joint model to infer video content. However, most existing studies are only applicable to coarse-grained classification, whereas classification for film and television drama requires more fine-grained identification. This study proposes a knowledge graph-based video classification algorithm. Firstly, the algorithm extracts visual and textual features using a multimodal pre-training model, which is trained on large-scale generic data. A multi-task video label prediction model is further trained to obtain three levels of labels for the video: content labels, theme labels and entity labels. The difficulty of training the classification model is increased by introducing a similarity task into the multi-task network. The similarity task provides a tighter fit of similar samples, while the learned characteristics better express sample differences. Secondly, for entity labels, an entity correction model with local attention head is proposed. It can fuse, de-duplicate or extend the prediction results by introducing co-occurrence information from the knowledge graph, and produce a more accurate entity label prediction result. Based on semi-structured data retrieved from Douban, this paper constructs a film and television knowledge graph and conducts an empirical study of the video tag classification model for film and television. Experimental results show that, firstly, the cross-entropy loss function and the loss function of the similarity task impose a common constraint on training the classification model, which serves to optimize the feature representation. Top-1 accuracy is improved by 3.70%, 3.35% and 16.57% for content labels, theme labels and entity labels respectively. Secondly, the entity correction model with global/local attention heads improves the Top-1 accuracy of entity labels from 38.7% to 45.6% after the introduction of knowledge graph information. The proposed research is a new attempt at multimodal video classification using image-text pair data, providing a new research idea for short video classification when only a small number of data samples are available.
    Time-Aware Sequential Recommendation Model Based on Dual-Tower Self-Attention
    YU Wenting, WU Yun
    Journal of Frontiers of Computer Science and Technology    2024, 18 (1): 175-188.   DOI: 10.3778/j.issn.1673-9418.2211022
    Users’ preferences are migratory and aggregated. Although recommenders have been greatly improved by modeling the timestamps of interactions within a sequential modeling framework, they only consider the time interval of interactions when modeling, which limits their ability to capture the temporal dynamics of user preferences. For this reason, this paper proposes a novel time-aware positional embedding that fuses temporal information into the positional embedding to help the network learn item correlations at the temporal level. Then, based on the time-aware positional embedding, this paper proposes a time-aware sequential recommendation model based on dual-tower self-attention (TiDSA). TiDSA includes item-level and feature-level self-attention blocks, which analyze the process of user preference change over time from the perspective of items and features respectively, and achieve the unified modeling of time, items and features. In addition, in the feature-level self-attention block, this paper calculates the self-attention weights from three dimensions, namely, feature-feature, item-item and item-feature, to fully capture the correlation between different features. Finally, the model fuses the item-level and feature-level information to obtain the final user preference representation and provides reliable recommendation results for users. Experimental results on four real-world datasets show that TiDSA outperforms various state-of-the-art models.
    Knowledge Concept Recommendation Model for MOOCs with Local Subgraph Embedding
    JU Chengcheng, ZHU Yi
    Journal of Frontiers of Computer Science and Technology    2024, 18 (1): 189-204.   DOI: 10.3778/j.issn.1673-9418.2209056
    Massive open online courses (MOOCs) have been extensively researched with the aim of reducing user learning blindness and improving user experience, especially through personalized course resource recommendation based on graph neural networks. However, these efforts focus primarily on fixed or homogeneous graphs, which are vulnerable to data sparsity problems and difficult to scale. This paper uses graph convolution on local subgraphs combined with an extended matrix factorization (MF) model to overcome this limitation. Firstly, the proposed method decomposes the heterogeneous graph into multiple meta-path-based subgraphs and combines random walk sampling to capture complex semantic relationships between entities while sampling nodes’ influential neighborhoods. It then performs graph convolution on local neighborhoods to smooth the representation of each node and achieve high scalability. Next, the attention mechanism adaptively fuses the contextual information of different subgraphs for a more comprehensive construction of user preferences. Finally, the model parameters are optimized by the extended MF to obtain the recommendation list. To validate the performance of the proposed model, comparative experiments are conducted on publicly available MOOCs datasets, with a 2% performance improvement and a nearly 500% reduction in memory computation requirements compared with the optimal baseline, providing strong scalability while alleviating the data sparsity problem.
    Dual Features Local-Global Attention Model with BERT for Aspect Sentiment Analysis
    LI Jin, XIA Hongbin, LIU Yuan
    Journal of Frontiers of Computer Science and Technology    2024, 18 (1): 205-216.   DOI: 10.3778/j.issn.1673-9418.2210012
    Aspect-based sentiment analysis aims to predict the sentiment polarity of a specific aspect in a sentence or document. Most recent research uses the attention mechanism to model the context. However, when sentiment classification models use the BERT model to compute the dependencies between representations and extract features, the context information needs to be considered according to the specific context; otherwise the modelled features lack contextual knowledge. In addition, the importance of aspect words is not given enough attention, which affects the overall classification performance of the model. To address these problems, this paper proposes a dual features local-global attention model with BERT (DFLGA-BERT). Local and global feature extraction modules are designed respectively to fully capture the semantic association between aspect words and context. Moreover, an improved quasi-attention mechanism is used in DFLGA-BERT, which lets the model use minus attention in the fusion of attention to weaken the effect of noise in the text on classification. The feature fusion structure is designed to better integrate local and global features based on conditional layer normalization (CLN). Experiments are conducted on the SentiHood and SemEval 2014 Task 4 datasets. Experimental results show that the performance of the proposed model is significantly improved compared with the baselines after incorporating contextual features.
    Affection Enhanced Dual Graph Convolution Network for Aspect Based Sentiment Analysis
    ZHANG Wenxuan, YIN Yanjun, ZHI Min
    Journal of Frontiers of Computer Science and Technology    2024, 18 (1): 217-230.   DOI: 10.3778/j.issn.1673-9418.2209033
    Aspect-based sentiment analysis is a fine-grained sentiment classification task. In recent years, graph neural networks on dependency trees have been used to model the dependency relationship between aspect terms and their opinion terms. However, such methods usually have the disadvantage of being highly dependent on the quality of dependency parsing. Furthermore, most existing works focus on syntactic information, while ignoring the effect of affective knowledge in modeling the sentiment-related dependencies between specific aspects and context. To solve these problems, an affection enhanced dual graph convolution network is proposed for aspect-based sentiment analysis. The model establishes a dual channel structure based on the dependency tree and attention mechanism, which can more accurately and efficiently capture the syntactic and semantic dependencies between aspects and contexts, and reduce the dependence of the model on the dependency tree. In addition, affective knowledge is integrated to enhance the graph structure and help the model better extract the sentiment-related dependencies of specific aspects. The accuracy of the model on the three open benchmark datasets Rest14, Lap14 and Twitter reaches 84.32%, 78.20% and 76.12% respectively, approaching or exceeding the state-of-the-art performance. Experiments show that the proposed method can make rational use of semantic and syntactic information, and achieves advanced sentiment classification performance with fewer parameters.
    Integrating Behavioral Dependencies into Multi-task Learning for Personalized Recommendations
    GU Junhua, LI Ningning, WANG Xinxin, ZHANG Suqi
    Journal of Frontiers of Computer Science and Technology    2024, 18 (1): 231-243.   DOI: 10.3778/j.issn.1673-9418.2208098
    The introduction of multiple types of behavioral data alleviates the data sparsity and cold-start problems of collaborative filtering algorithms, and has been widely studied and applied in the field of recommendation. Although great progress has been made in current research on multi-behavior recommendation, the following problems still exist: existing methods fail to comprehensively capture the complex dependencies between behaviors, and they ignore the relevance of behavior features to users and items, which biases the recommendation results. As a result, the learned feature vectors fail to accurately represent the user's interest preferences. To solve these problems, a personalized recommendation model (BDMR) that integrates behavioral dependencies into multi-task learning is proposed, in which the complex dependencies between behaviors are divided into feature relevance and temporal relevance. Firstly, the user personalized behavior vector is set, multiple interaction graphs are processed with graph neural networks that combine user, item and behavior features to aggregate higher-order neighborhood information, and the attention mechanism is used to learn feature relevance among behaviors. Secondly, the interaction sequence composed of behavior features and item features is input into a long short-term memory network to capture the temporal relevance among behaviors. Finally, personalized behavior vectors are integrated into a multi-task learning framework to obtain more accurate user, behavior and item features. To verify the performance of this model, experiments are conducted on three real datasets. On the Yelp dataset, compared with the optimal baseline, HR and NDCG are improved by 1.5% and 2.9% respectively. On the ML20M dataset, HR and NDCG are increased by 2.0% and 0.5% respectively. On the Tmall dataset, HR and NDCG are improved by 25.6% and 30.2% respectively. Experimental results show that the model proposed in this paper is superior to the baselines.
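For readers unfamiliar with the two reported metrics, HR and NDCG can be sketched as below. This is a minimal sketch assuming a leave-one-out protocol with a single held-out target item per user; the abstract does not state the cutoff K or the evaluation protocol, and the function names are hypothetical.

```python
import math


def hit_ratio_at_k(ranked_items, target, k):
    """1.0 if the held-out target appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """NDCG with one relevant item: 1 / log2(rank + 2) for a 0-based rank."""
    for rank, item in enumerate(ranked_items[:k]):
        if item == target:
            return 1.0 / math.log2(rank + 2)
    return 0.0


ranking = ["item_a", "item_b", "item_c"]
print(hit_ratio_at_k(ranking, "item_b", k=2))  # target is in the top 2
print(ndcg_at_k(ranking, "item_b", k=2))       # discounted because it is ranked second
```

HR only records whether the target is retrieved, whereas NDCG also rewards placing it nearer the top, which is why the two metrics can improve by different amounts on the same dataset.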
    Noisy Knowledge Graph Representation Learning: a Rule-Enhanced Method
    SHAO Tianyang, XIAO Weidong, ZHAO Xiang
    Journal of Frontiers of Computer Science and Technology    2023, 17 (12): 2999-3009.   DOI: 10.3778/j.issn.1673-9418.2208105
    Knowledge graphs are used to store structured facts, which are presented in the form of triples, i.e., (head entity, relation, tail entity). Current large-scale knowledge graphs are usually constructed with (semi-)automated methods for knowledge extraction, and the process inevitably introduces noise, which may affect the effectiveness of the knowledge representation. However, most traditional representation learning methods assume that the triples in knowledge graphs are correct and represent knowledge in a distributed manner accordingly. Therefore, noise detection on knowledge graphs is a crucial task. In addition, the incompleteness of knowledge graphs has also attracted attention. These problems are studied and a knowledge representation learning framework combining logical rules and relation path information is proposed, which accomplishes knowledge representation learning and achieves a mutual enhancement effect while detecting possible noise. Specifically, the framework is divided into a triple embedding part and a triple trustworthiness estimation part. In the triple embedding part, relation path information and logical rule information are introduced to construct a better knowledge representation based on the triple structure information, the latter of which is used to enhance the ability of relation path reasoning and the interpretability of the representation learning. In the triple trustworthiness estimation part, three types of information are further utilized to detect possible noise. Experiments are conducted on three public evaluation datasets, and the results show that the model achieves significant performance improvement in tasks such as knowledge graph noise detection and knowledge graph completion compared with all baseline methods.
    Span-Level Dual-Encoder Model for Aspect Sentiment Triplet Extraction
    ZHANG Yunqi, LI Songda, LAN Yuquan, LI Dongxu, ZHAO Hui
    Journal of Frontiers of Computer Science and Technology    2023, 17 (12): 3010-3019.   DOI: 10.3778/j.issn.1673-9418.2208102
    Aspect sentiment triplet extraction (ASTE) is one of the subtasks of aspect-based sentiment analysis, which aims to identify all aspect terms, their corresponding opinion terms and sentiment polarities in sentences. Currently, pipeline or end-to-end models are adopted to accomplish the ASTE task. The former cannot solve the overlapping problem of aspect terms in triplets and ignores the dependency between opinion terms and sentiment polarities. The latter divides the ASTE task into two subtasks, aspect-opinion extraction and sentiment polarity classification, and applies multi-task learning through a shared encoder. However, this setting does not distinguish between the features of the two subtasks, leading to the feature confusion problem. SD-ASTE (span-level dual-encoder model for ASTE), a pipeline model with two modules, is proposed to address the above problems. The first module extracts aspect terms and opinion terms based on spans. The span feature representation incorporates span head, tail and length information to focus on the boundary information of aspect terms and opinion terms. The second module judges the sentiment polarities expressed by aspect-opinion span pairs. The span-pair feature representation is based on levitated markers to focus on the dependency among triplet elements. The model utilizes two independent encoders to extract different features for each module. Comparative experimental results on multiple datasets show that the model is superior to the state-of-the-art pipeline and end-to-end models. Validity experiments show the effectiveness of the span feature representation, span-pair feature representation and the two independent encoders.
    Robust Sentiment Analysis Model Based on Feature Representation in Uncertainty Domain
    CHEN Jie, LI Shuai, ZHAO Shu, ZHANG Yanping
    Journal of Frontiers of Computer Science and Technology    2023, 17 (12): 3020-3028.   DOI: 10.3778/j.issn.1673-9418.2305077
    In the sentiment classification of text data, there are often some fuzzy data that are difficult to classify. Due to their uncertainty, these fuzzy data tend to be overfitted during model training, which affects the robustness of the model. Three-way decision theory divides the initial samples into a deterministic domain and an uncertain domain. How to select appropriate features to represent the uncertain domain, where the fuzzy data are located, for downstream tasks is the challenge for three-way decision sentiment analysis models. To address this challenge, a robust sentiment analysis model (UFR-SA) based on feature representation of three-way decision uncertainty domains is proposed. Firstly, based on the three-way decision theory, the deterministic domain and the uncertain domain are divided. For fuzzy samples in the uncertain domain, heterogeneous sample point pairs are defined to construct hierarchical features. Secondly, a hierarchical feature fusion model is designed to incorporate the advantages of each granularity feature into a multi-layer perceptual network. Finally, a divide and conquer strategy is adopted for test samples in the deterministic domain and the uncertain domain. The deterministic domain data are represented by the original features, and the fuzzy data in the uncertain domain are represented by the fused robust features. Experimental results on the SST-2, SST-5, and CR datasets show that UFR-SA effectively reduces the interference of fuzzy data on the model and outperforms state-of-the-art models.
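The division into deterministic and uncertain domains can be sketched with the usual three-way decision thresholds. This is a minimal sketch assuming each sample already has a predicted positive-class probability; the thresholds `alpha` and `beta`, their default values, and the function name are hypothetical, and the hierarchical feature construction of UFR-SA is not reproduced.

```python
def three_way_split(probabilities, alpha=0.7, beta=0.3):
    """Split sample indices into positive, negative and boundary regions.

    Samples with p >= alpha are accepted, those with p <= beta are rejected,
    and everything in between falls into the boundary (uncertain) region.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    positive, negative, boundary = [], [], []
    for index, p in enumerate(probabilities):
        if p >= alpha:
            positive.append(index)
        elif p <= beta:
            negative.append(index)
        else:
            boundary.append(index)
    return positive, negative, boundary


positive, negative, boundary = three_way_split([0.9, 0.5, 0.1, 0.7, 0.3])
deterministic_domain = sorted(positive + negative)
uncertain_domain = boundary
print(deterministic_domain, uncertain_domain)
```

Under this reading, the deterministic domain is the union of the accepted and rejected samples, and only the boundary samples would receive the fused features.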
    Class-Balanced Modulation for Facial Expression Recognition
    LIU Chengguang, WANG Shanmin, LIU Qingshan
    Journal of Frontiers of Computer Science and Technology    2023, 17 (12): 3029-3038.   DOI: 10.3778/j.issn.1673-9418.2210079
    Facial expression recognition (FER) aims at determining the types of facial expressions for given facial images, and has broad application prospects in psychological diagnosis, human-computer interaction, etc. In practical tasks, various databases tend to have imbalanced data distributions among basic facial expressions. This issue causes imbalanced feature distribution and inconsistent classifier optimization for various facial expressions, seriously affecting the performance of expression recognition models. To solve this issue, this paper proposes a class-balanced modulation mechanism for facial expression recognition (CBM-Net), which attempts to address the imbalanced data distribution problem by modulating the FER model in the feature learning and classifier optimization stages. CBM-Net includes two modules, feature modulation and gradient modulation. The feature modulation module strives to balance feature distributions for all facial expressions by increasing the separability between classes and the tightness within classes in the feature direction. The gradient modulation module uses the statistical information of batch training samples to reversely adjust the optimization gradient of each classifier to ensure that the convergence speed of each classifier is consistent, so that the performance of each classifier can be optimal at the same time. Qualitative and quantitative experiments on four popular datasets show that CBM-Net is effective in class-balanced modulation and compares favorably with many advanced methods.
    Multiscale Global Adaptive Attention Graph Neural Network
    GOU Ruru, YANG Wenzhu, LUO Zifei, YUAN Yunfeng
    Journal of Frontiers of Computer Science and Technology    2023, 17 (12): 3039-3051.   DOI: 10.3778/j.issn.1673-9418.2208039
    Dynamic multiscale graph neural networks have high motion prediction errors due to the low correlation between the internal joints of body parts and the limited perceptual fields. A multiscale global adaptive attention graph neural network for human motion prediction is proposed to reduce motion prediction errors. Firstly, a multi-distance partitioning strategy for dividing skeleton joints is proposed to improve the degree of temporal and spatial correlation of body joint information. Secondly, a global adaptive attention spatial temporal graph convolutional network is designed to dynamically enhance the network's attention to the spatial temporal joints contributing to a motion in combination with global adaptive attention. Finally, this paper integrates the above two improvements into the graph convolutional neural network gated recurrent unit to enhance the state propagation performance of the decoding network and reduce prediction errors. Experimental results show that the prediction error of the proposed method is lower than that of state-of-the-art methods on the Human 3.6M, CMU Mocap and 3DPW datasets.
    Multi-teacher Contrastive Knowledge Inversion for Data-Free Distillation
    LIN Zhenyuan, LIN Shaohui, YAO Yiwu, HE Gaoqi, WANG Changbo, MA Lizhuang
    Journal of Frontiers of Computer Science and Technology    2023, 17 (11): 2721-2733.   DOI: 10.3778/j.issn.1673-9418.2204107
    Knowledge distillation is an effective method for model compression when training data are accessible. However, due to privacy, confidentiality, or transmission limitations, the training data are often unavailable. Existing data-free knowledge distillation methods only use biased feature statistics contained in one model and run into problems with low generalizability and diversity in synthetic images and unsatisfactory student model performance. To address these problems, this paper proposes a multi-teacher contrastive knowledge inversion (MTCKI) method that extracts and fuses model-specific knowledge from the available teacher models into a student model to eliminate model bias. Further, this paper improves the diversity of synthesized images using contrastive learning, which encourages the synthetic images to be distinguishable from the previously stored images. Meanwhile, this paper proposes a contrastive loss strategy based on multiple teachers and the student to improve the feature representation ability of the student network. Experiments demonstrate that MTCKI not only can generate visually satisfactory images but also outperforms existing state-of-the-art approaches. The resulting synthesized images are much closer to the distribution of the original dataset and can be generated only once to provide comprehensive guidance for various networks rather than a specific one.
    Link Prediction in Knowledge Hypergraph Combining Attention and Convolution Network
    PANG Jun, XU Hao, QIN Hongchao, LIN Xiaoli, LIU Xiaoqi, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2023, 17 (11): 2734-2742.   DOI: 10.3778/j.issn.1673-9418.2208071
    Knowledge hypergraphs (KHG) are knowledge graphs with a hypergraph structure. KHG link prediction aims to predict the missing relations through the known entities and relations. However, HypE, the existing optimal KHG link prediction method based on an embedding model, considers the location information when embedding entities, but ignores the differences in the contributions of different entities when embedding relations. In addition, the information of the entity convolution vector is insufficient. If relation embeddings take the entity contribution into account and the information of entity embedding is supplemented, the prediction ability of the model can be greatly improved. Therefore, link prediction based on attention and convolution network (LPACN) is proposed. An improved attention mechanism is applied to merge entity attention information into relation embeddings, and the number of neighboring entities in the same tuple is integrated into the convolution network, which further supplements the information of entity convolution embedding. To address the gradient vanishing problem of LPACN, an improved ResidualNet is integrated into LPACN, and a multilayer perceptron (MLP) is used to improve the nonlinear learning ability of the model, yielding the improved algorithm LPACN+. Extensive experiments on real datasets validate that LPACN is better than the baselines.
    Knowledge Graph Inference Method Combined with Decision Implication
    ZHAI Yanhui, HE Xu, LI Deyu, ZHANG Chao
    Journal of Frontiers of Computer Science and Technology    2023, 17 (11): 2743-2754.   DOI: 10.3778/j.issn.1673-9418.2207085
    Decision implication is a tool for decision knowledge representation and reasoning in formal concept analysis. This paper proposes a relationship completion method for knowledge graphs based on decision implication. Firstly, this paper constructs the corresponding decision context for a knowledge graph and proves that decision implications are able to equivalently represent the rules in knowledge graph inference. In order to extract decision implications efficiently, this paper repeatedly reduces the complicated decision contexts and proves that the reduced decision contexts also contain the rules in knowledge graph inference. This paper also designs an algorithm to extract decision implications from the reduced decision contexts and provides steps to perform relationship completion by applying decision implications. Finally, experiments verify the effectiveness of the proposed method. This paper provides a new idea for completing knowledge graph relationships, as well as a new choice for fusion inference.
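What it means for a decision implication to hold in a decision context can be sketched as follows. This is a minimal sketch of the standard definition (every object that has all premise attributes also has all conclusion attributes); the toy context, the attribute names and `implication_holds` are hypothetical, and the paper's construction of a decision context from a knowledge graph and its reduction steps are not reproduced.

```python
def implication_holds(context, premise, conclusion):
    """Check whether premise => conclusion holds in a decision context.

    context maps each object to a pair (condition attributes, decision attributes).
    """
    return all(
        conclusion <= decisions
        for conditions, decisions in context.values()
        if premise <= conditions
    )


# Toy decision context with condition attributes a, b and decision attributes x, y.
context = {
    "o1": ({"a", "b"}, {"x"}),
    "o2": ({"a"}, {"x", "y"}),
    "o3": ({"b"}, set()),
}

print(implication_holds(context, {"a"}, {"x"}))  # every object with a also has x
print(implication_holds(context, {"b"}, {"x"}))  # o3 has b but not x
```

A valid implication such as {a} => {x} can then be read as a completion rule: an object known to have a but not yet recorded with x becomes a candidate for the missing relation.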
    Path Planning Fusion Algorithm for Indoor Robot Based on Feature Map
    LIU Peng, REN Gongchang
    Journal of Frontiers of Computer Science and Technology    2023, 17 (11): 2755-2766.   DOI: 10.3778/j.issn.1673-9418.2207001
    To exploit the computational efficiency of feature maps and to solve the problem that the traditional dynamic window approach is sensitive to global parameters, a path planning fusion algorithm based on feature map is proposed. A feature map expression applicable to path planning is given, and obstacles in the feature map are detected by improving the calculation of the distance between the robot and the obstacles. Combining the basic principle of the Bug algorithm with the properties of line segment features, the searching and optimization algorithm first searches for a globally feasible path and then obtains the key nodes of the global optimal path by node optimization. Solutions are also proposed for the problems of search direction selection at internal and external corner points and of bypassing obstacle endpoints. To address the high sensitivity of the traditional dynamic window approach to global parameters, the influence of the parameters in the objective function on the planned path when the robot is at different positions is analyzed, and the original objective function is improved using a dynamic parameter approach. When the algorithms are fused, the calculation of the direction function in the objective function is improved to solve the problem that the robot may slow down at the intermediate nodes of the path. Simulation experiments verify that the searching and optimization algorithm is effective, that the improved dynamic window approach reduces the sensitivity to parameters, and that the fusion algorithm has a clear advantage in computational efficiency, with a maximum reduction of 79.27% and a minimum reduction of 43.16% in computation time, while the robot moves more smoothly.
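The parameter sensitivity mentioned above can be illustrated with the objective function of the traditional dynamic window approach, G(v, ω) = α·heading + β·clearance + γ·velocity. This is a minimal sketch: the candidate scores are made-up numbers assumed to be normalized to [0, 1], the weights and function names are hypothetical, and the paper's dynamic parameter scheme and improved direction function are not reproduced.

```python
def dwa_score(heading, clearance, velocity, alpha, beta, gamma):
    """Weighted sum used by the traditional dynamic window approach."""
    return alpha * heading + beta * clearance + gamma * velocity


def best_command(candidates, alpha, beta, gamma):
    """Pick the (v, w) pair whose trajectory maximizes the objective.

    Each candidate is (v, w, heading, clearance, velocity_score).
    """
    best = max(
        candidates,
        key=lambda c: dwa_score(c[2], c[3], c[4], alpha, beta, gamma),
    )
    return best[0], best[1]


candidates = [
    (0.5, 0.0, 0.9, 0.2, 0.5),  # points at the goal but passes close to an obstacle
    (1.0, 0.1, 0.4, 0.9, 1.0),  # keeps its distance and moves fast, worse heading
]

print(best_command(candidates, alpha=0.8, beta=0.1, gamma=0.1))  # heading-dominated
print(best_command(candidates, alpha=0.1, beta=0.8, gamma=0.1))  # clearance-dominated
```

With fixed global weights the chosen command flips between the two candidates, which is the kind of sensitivity that position-dependent parameters are meant to reduce.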
    Improved Ramp-Based Twin Support Vector Clustering
    CHEN Sugen, LIU Yufei
    Journal of Frontiers of Computer Science and Technology    2023, 17 (11): 2767-2776.   DOI: 10.3778/j.issn.1673-9418.2206039
    Twin support vector clustering based on Hinge loss and twin support vector clustering based on Ramp loss are two new twin support vector clustering algorithms, which provide a new research idea for solving the clustering problem and have gradually become a research hotspot in pattern recognition and other fields. However, they often perform poorly on clustering problems with noisy data. To solve this problem, this paper constructs an asymmetric Ramp loss function and proposes an improved Ramp-based twin support vector clustering algorithm. The asymmetric Ramp loss function not only inherits the advantages of the Ramp loss function, but also uses asymmetric bounded functions to measure the within-cluster and between-cluster scatters, which makes the algorithm more robust to data points far from the clustering center plane. The introduction of parameter t makes the asymmetric Ramp loss function more flexible. In particular, when t is equal to 1, the asymmetric Ramp loss function degenerates into the Ramp loss function, so that Ramp-based twin support vector clustering becomes a special case of the proposed algorithm. In addition, a nonlinear formulation is proposed via the kernel trick. The non-convex optimization problems in the linear and nonlinear models are solved effectively through an alternating iterative algorithm. Experiments are carried out on several benchmark UCI datasets and artificial datasets, and the experimental results verify the effectiveness of the proposed algorithm.
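The robustness argument rests on the Ramp loss being bounded while the Hinge loss is not. A minimal sketch of the two generic margin-based losses is given below, using the common form ramp(z) = min(1 - s, max(0, 1 - z)) with a truncation parameter s < 1. The clustering-specific losses of the paper and its asymmetric variant with parameter t are not defined in the abstract and are not reproduced here; the function names and the default s are assumptions.

```python
def hinge_loss(z):
    """Unbounded hinge loss on a margin value z."""
    return max(0.0, 1.0 - z)


def ramp_loss(z, s=-1.0):
    """Hinge loss truncated at 1 - s, so far-away points contribute a constant."""
    if s >= 1.0:
        raise ValueError("s must be smaller than 1")
    return min(1.0 - s, hinge_loss(z))


for margin in (2.0, 0.5, -5.0, -50.0):
    print(margin, hinge_loss(margin), ramp_loss(margin))
```

For the two strongly violating margins the hinge loss keeps growing, while the ramp loss stays at its cap of 1 - s, so a single outlier cannot dominate the objective.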
    Dynamic POI Group Recommendation Based on Multi-dimensional User Preference Model
    SUN Mingyang, MA Yuliang, YUAN Ye, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2478-2487.   DOI: 10.3778/j.issn.1673-9418.2207107
    With the explosive growth of networked data and the development of geo-social networks (GSNs), group activities have become prevalent in people's lives. The targets of recommendation systems have extended from individuals to user groups, and point-of-interest (POI) group recommendation is gradually becoming a hot research topic. However, traditional methods are not suitable for group recommendation in geo-social networks, because user preferences in GSNs are influenced by many factors and the group decision-making process is complex. To reveal user preferences and the effect of the group decision process on group recommendation, this paper proposes a neural network-based model for dynamic POI group recommendation that leverages multi-dimensional user preferences. Firstly, the proposed model combines temporal and spatial factors to calculate user preferences from user activity records and builds a group-POI perception graph with the group as the unit. Next, the influence of collaborative users is added to model group preferences, which takes full account of the characteristics of GSNs and helps ensure the accuracy of POI group recommendation. Finally, a neural network-based model is constructed to simulate group decision-making, which further helps ensure the accuracy of POI recommendations. Extensive experiments on real datasets compare the proposed method with existing group recommendation algorithms. Experimental results show that the proposed method is significantly better than the existing algorithms in terms of POI hit rate, which demonstrates its effectiveness.
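The abstract reports results as a POI hit rate without defining it. A common definition, assumed here rather than taken from the paper, counts a test case as a hit when the POI the group actually visited appears among the top k recommendations. The function name and arguments are hypothetical.

```python
def hit_rate_at_k(ranked_lists, ground_truth, k):
    """Fraction of test cases whose true POI appears in the top-k recommendations.

    ranked_lists: one ranked list of POI ids per group (best first).
    ground_truth: the POI id each group actually visited, in the same order.
    """
    if not ground_truth:
        return 0.0
    hits = 0
    for ranked, target in zip(ranked_lists, ground_truth):
        if target in ranked[:k]:
            hits += 1
    return hits / len(ground_truth)
```

For example, with recommendations `[["a", "b", "c"], ["d", "e", "f"]]` and visited POIs `["b", "f"]`, only the first group is a hit at k = 2, while both are hits at k = 3.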
    Aspect-Level Sentiment Analysis Combining Part-of-Speech and External Knowledge
    GU Yuying, GAO Meifeng
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2488-2498.   DOI: 10.3778/j.issn.1673-9418.2207077
    The goal of aspect-level sentiment analysis is to identify the sentiment polarity of specific aspect words in a given sentence. At present, most research that combines graph convolutional networks with syntactic dependency trees focuses on learning the relationship between context and aspect words from the sentence dependency tree, but pays little attention to how the dependency tree is constructed. As a result, it cannot make full use of the information in the dependency tree and introduces noise. To solve these problems, this paper proposes a graph convolutional network model based on a multi-fusion adjacency matrix algorithm. Firstly, external knowledge is used to enhance the role of sentiment words in sentences, and part-of-speech information is used to filter out redundant dependencies and obtain pruned syntactic dependency trees. The two are combined by the multi-fusion adjacency matrix algorithm to obtain syntactic information. The syntactic information and the semantic information extracted by the BiLSTM layer are then fed into a simplified graph convolutional network for feature fusion. Experimental results on five datasets show that the proposed method is effective and significantly improves model performance.
    SMViT: Lightweight Siamese Masked Vision Transformer Model for Diagnosis of COVID-19
    MA Ziping, TAN Lidao, MA Jinlin, CHEN Yong
    Journal of Frontiers of Computer Science and Technology    2023, 17 (10): 2499-2510.   DOI: 10.3778/j.issn.1673-9418.2210070
    To address the low accuracy, poor generalization ability and large number of parameters of deep learning-based COVID-19 diagnosis models, a lightweight siamese network for COVID-19 diagnosis, SMViT (siamese masked vision transformer), is proposed based on ViT (vision transformer) and the siamese network. Firstly, a lightweight strategy based on a cyclic substructure is proposed, which builds the diagnosis network from multiple subnets with the same structure, thereby reducing the number of network parameters. Secondly, a masked self-supervised pre-training model based on ViT is proposed to enhance the latent feature representation ability of the model. Then, to improve the diagnostic accuracy of the model and its poor generalization on small samples, the siamese network SMViT is constructed. Finally, ablation experiments are used to verify and determine the structure of the model, and comparative experiments verify its diagnostic performance and lightweight capacity. Experimental results show that, compared with the most competitive ViT-based diagnostic model, the Accuracy, Specificity, Sensitivity and F1 scores of this model increase by 1.42%, 4.62%, 0.40% and 2.80% respectively on the X-ray dataset, and by 2.16%, 2.17%, 2.05% and 2.06% respectively on the CT image dataset. The SMViT model generalizes well to small datasets. Compared with ViT, the SMViT model has fewer parameters and higher diagnostic performance.
    Reserved Hierarchy-Based Knowledge Graph Embedding for Link Prediction
    QIAN Fulan, WANG Wenxue, ZHENG Wenjie, CHEN Jie, ZHAO Shu
    Journal of Frontiers of Computer Science and Technology    2023, 17 (9): 2174-2183.   DOI: 10.3778/j.issn.1673-9418.2207090
    Knowledge graph embedding (KGE) is an important tool for predicting the missing links of knowledge graphs (KGs). It embeds the entities and relations of KGs into a continuous low-dimensional space while preserving as much of the latent information in the original data as possible. Recently, some KGE methods have modeled the common semantic hierarchies of KGs with a polar coordinate system and improved performance on the link prediction task. However, they use a simple transform function and focus only on the hierarchical differences between entities when modeling relations, which limits the performance of the model. To address this issue, this paper proposes reserved hierarchy-based knowledge graph embedding (RHKE), which considers the hierarchy of the entity itself when modeling the relation. Specifically, RHKE proposes a mixed transform function that contains a proportion item and a bias item. The transform function is mainly affected by the proportion item or the bias item depending on whether the hierarchy of the entity is high or low. In addition, since the model loses the hierarchy of the entity itself after the mixed transform, RHKE uses a hierarchy correction item, which supplies additional information for the relation by combining the original hierarchies of the head and tail entities in different proportions. Experiments on several public datasets show that RHKE outperforms existing semantic hierarchical models on the link prediction task.
    Self-training Semi-supervised Learning Algorithm for New Class Detection
    HE Yulin, CHEN Jiaqi, HUANG Qihang, Philippe Fournier-Viger, HUANG Zhexue
    Journal of Frontiers of Computer Science and Technology    2023, 17 (9): 2184-2197.   DOI: 10.3778/j.issn.1673-9418.2206059
    Limited application scenarios and unsatisfactory generalization capability are two main defects of traditional semi-supervised learning (SSL) algorithms. In particular, their prediction capability degrades severely when the training dataset includes samples with new labels. Having domain experts label the unlabeled samples is usually time-consuming and expensive, and wrongly labeled samples are unavoidable when background knowledge is insufficient. SSL algorithms that can correctly label unlabeled samples with unseen labels are therefore urgently needed in practical applications. After analyzing the SSL algorithm in detail, an effective new class detection SSL (NCD-SSL) algorithm is proposed. Firstly, a universal incremental extreme learning machine is designed to deal with both class-incremental and sample-incremental classification problems. Secondly, the self-training model is improved by using the samples with high-confidence labels and setting up a buffer pool to store the samples with low-confidence labels. Thirdly, the samples in the buffer pool are further processed with clustering and distribution consistency judgement techniques so that new classes can be detected. Finally, a series of experiments on synthetic and real datasets validates the rationality and effectiveness of the NCD-SSL algorithm. Experimental results show that, compared with six other popular SSL algorithms, the testing accuracy of the NCD-SSL algorithm increases by more than 30, 20 and 10 percentage points in the 3-class, 2-class and 1-class missing cases respectively, which demonstrates the superior SSL performance of the NCD-SSL algorithm.
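The second step, in which high-confidence samples are used for self-training and low-confidence ones are held in a buffer pool, can be illustrated roughly as follows. This is only a sketch: the abstract does not say how confidence is measured, so the maximum predicted class probability and the `threshold` value of 0.9 are assumptions, and the function name is hypothetical.

```python
def split_by_confidence(probabilities, threshold=0.9):
    """Separate unlabeled samples into pseudo-labeled ones and a buffer pool.

    probabilities: per-sample lists of predicted class probabilities.
    Returns (confident, buffer_pool), where confident holds
    (sample_index, pseudo_label) pairs and buffer_pool holds sample indices.
    """
    confident, buffer_pool = [], []
    for idx, probs in enumerate(probabilities):
        # Pseudo-label is the class with the highest predicted probability.
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            confident.append((idx, label))
        else:
            # Low confidence: defer the sample instead of forcing a known label.
            buffer_pool.append(idx)
    return confident, buffer_pool
```

Deferring low-confidence samples rather than discarding them is what leaves room for the later clustering step: samples from an unseen class tend to receive low confidence under every known label, so they accumulate in the pool.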
    Social Network Nodes Classification Method Based on Multi-information Fusion
    LIU Chao, LIANG Anting, LIU Xiaoyang, HUANG Xianying
    Journal of Frontiers of Computer Science and Technology    2023, 17 (9): 2198-2208.   DOI: 10.3778/j.issn.1673-9418.2206102
    To address the poor performance of node classification in social networks, a model that integrates multiple information graph convolutional networks (IMIGCN) is proposed. Firstly, using the feature vectors X and the adjacency matrix A, this paper constructs the homogeneous matrix FA, which contains the homogeneous information between nodes, and the co-citation matrix CoA, which contains the co-citation information. The triangular structures in the network are analyzed, and the triangular matrix TriA, which contains the triangular information between nodes, is constructed by a transformation formula. On this basis, the information of the node itself is integrated. Next, this paper improves the traditional graph convolutional network (GCN) model. The single kernel of GCN is replaced with adaptive multiple kernels, and the results of multi-kernel learning are adaptively fused into one embedding through an attention mechanism, so that a single convolution integrates multiple kinds of information at the same time. To learn more information, the embedding in the model is designed as multi-head, and the weights of the multi-head embeddings are learned adaptively through multi-head embedding attention. Experimental results show that, compared with the better-performing existing node classification models, the classification accuracy of the proposed IMIGCN on social networks improves by 0.0098 to 0.0532 and the F1 score improves by 0.0127 to 0.0536, which indicates that the proposed IMIGCN is reasonable and effective.
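The abstract does not give the formulas behind CoA and TriA. As a hedged illustration, the sketch below uses the standard graph-theoretic constructions: co-citation counts from the product of the transposed adjacency matrix with itself, and per-edge triangle counts from the squared adjacency matrix masked by the adjacency matrix. Whether IMIGCN uses exactly these forms, or normalizes them, is an assumption; all function names are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def co_citation(adj):
    """Entry (j, k) counts nodes that link to both j and k (adj[i][j] = 1 if i links to j)."""
    c = matmul(transpose(adj), adj)
    for i in range(len(c)):
        c[i][i] = 0  # drop self co-citation (a node's in-degree)
    return c

def triangle_matrix(adj):
    """Entry (i, j) counts triangles containing edge (i, j) in an undirected, loop-free graph."""
    a2 = matmul(adj, adj)  # a2[i][j] = number of common neighbors of i and j
    n = len(adj)
    return [[a2[i][j] * adj[i][j] for j in range(n)] for i in range(n)]
```

For a triangle on nodes 0, 1, 2 with an extra node 3 attached to node 2, `triangle_matrix` gives 1 for edge (0, 1) and 0 for edge (2, 3), since the pendant edge closes no triangle.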