Table of Contents

    2025-04-01, Volume 19 Issue 4
    Frontiers·Surveys
    Review of Neural Network Lightweight
    DUAN Yuchen, FANG Zhenyu, ZHENG Jiangbin
    2025, 19(4):  835-853.  DOI: 10.3778/j.issn.1673-9418.2403071
    With the continuous progress of deep learning, artificial neural network models have shown unprecedented performance in fields such as image recognition, natural language processing, and autonomous driving. These models often have millions or even billions of parameters and learn complex feature representations from large amounts of training data. However, in resource-constrained environments such as mobile devices, embedded systems, and other edge computing scenarios, the power consumption, memory usage, and computing cost of these models limit the application of large-scale neural networks. To solve this problem, researchers have proposed a variety of model compression techniques, such as pruning, distillation, neural architecture search (NAS), quantization, and low-rank decomposition, which aim to reduce the number of parameters, computational complexity, and storage requirements of a model while maintaining its accuracy as far as possible. This paper systematically introduces the development of these model compression methods, focusing on the main principles and key technologies of each. It covers the different pruning strategies, such as structured and unstructured pruning; how knowledge is defined in knowledge distillation; the search space, search algorithm, and network performance evaluation in NAS; post-training quantization and in-training quantization; and singular value decomposition and tensor decomposition in low-rank decomposition. Finally, future development directions of model compression technology are discussed.
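    Two of the techniques named in this abstract, unstructured pruning and post-training quantization, can be illustrated with a minimal sketch. The function names, the magnitude-based pruning criterion, and the symmetric 8-bit scheme below are illustrative assumptions, not methods taken from the survey itself.

```python
def magnitude_prune(weights, sparsity):
    """Unstructured pruning: zero the given fraction of smallest-magnitude weights."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices ordered from smallest to largest magnitude
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uniform(weights, num_bits=8):
    """Post-training symmetric uniform quantization to signed integers."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer level and clamp to the valid range
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate floating-point weights."""
    return [v * scale for v in q]


# Hypothetical weight vector, used only for illustration
weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2]
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_uniform(pruned, num_bits=8)
restored = dequantize(q, scale)
```

    In this sketch the rounding error of each restored weight is bounded by half the quantization step (`scale / 2`), which is the usual trade-off between bit width and accuracy.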
    Review of Smart Contract Vulnerability Detection and Repair Research
    LIU Zhexu, LI Leixiao, LIU Dongjiang, DU Jinze, LIN Hao, SHI Jianping
    2025, 19(4):  854-876.  DOI: 10.3778/j.issn.1673-9418.2405019
    The smart contract is a fundamental technology of blockchain, as it operates without the need for third-party authorities and can directly provide trusted customized services for users. It represents an important advancement in blockchain technology. As the application range of smart contracts continues to expand, ensuring their safe and reliable operation has become a pressing issue in the field of blockchain security. A research framework for smart contract vulnerability detection and repair is proposed, analyzing and summarizing the current research progress in four key aspects: vulnerability datasets, machine learning methods, vulnerability repair techniques, and patch deployment strategies. Firstly, this paper investigates machine learning-based smart contract vulnerability detection methods, comparing and summarizing 8 types of smart contract vulnerabilities, the current state of 15 open-source datasets, and the advantages and disadvantages of existing models, including traditional machine learning methods, deep learning approaches, and large models. Furthermore, a strategy for constructing high-quality smart contract vulnerability datasets is proposed, combining 5 types of vulnerability detection tools and confidence learning. The 5 types of vulnerability detection tools are symbolic execution, fuzz testing, taint analysis, formal verification, and integrated frameworks. Secondly, 3 categories of smart contract vulnerability repair solutions are systematically introduced: automated repair techniques, machine learning-based repair methods, and Ethereum enhancement technologies. A comprehensive comparison of different solutions is conducted, highlighting their respective advantages and limitations, along with an overview of relevant technologies that can be applied to smart contract vulnerability repair in the future. Finally, this paper analyzes existing security challenges in smart contracts and provides insights into future research directions.
    Advances in Node Importance Ranking Based on Graph Neural Networks
    CAO Lu, DING Cangfeng, MA Lerong, YAN Zhaoyao, YOU Hao, HONG Anqi
    2025, 19(4):  877-900.  DOI: 10.3778/j.issn.1673-9418.2405056
    Node importance ranking is a critical task in graph analysis, as it plays a crucial role in identifying and prioritizing important nodes within a graph. Graph neural networks (GNNs) serve as an effective framework that leverages deep learning to directly comprehend the structural data of graphs, enabling comprehensive understanding of the internal patterns and deeper semantic features associated with nodes and edges. In the context of node importance ranking, GNNs can effectively harness graph structure information and node features to assess the significance of individual nodes. Compared with traditional node ranking methods, GNNs are better equipped to handle the diverse and intricate nature of graph structural data, capturing complex associations and semantic information between nodes while autonomously learning representations for node features. This reduces reliance on manual feature engineering, thereby enhancing accuracy in node importance ranking tasks. Consequently, approaches based on graph neural networks have emerged as the predominant direction for research into node importance. On this basis, this paper provides a classification of recent advancements in node ranking methods utilizing graph neural networks. This paper begins by revisiting core concepts related to node ranking, graph neural networks, and classical metrics for assessing node importance. It then summarizes recent developments in methods for evaluating node importance using graph neural networks. These techniques are categorized into four groups based on fundamental graph neural networks and their variants: basic GNNs, graph convolutional neural networks (GCNs), graph attention networks (GATs), and graph autoencoders (GAEs). Additionally, this paper analyzes the performance of these methods across various application domains, such as social networks, traffic networks, and knowledge graphs. Finally, it offers a comprehensive overview of existing research by analyzing time complexity along with advantages, limitations, and performance characteristics of current methodologies. Furthermore, it discusses future research directions based on identified shortcomings.
    Review of PCB Defect Detection Algorithm Based on Machine Vision
    YANG Sinian, CAO Lijia, YANG Yang, GUO Chuandong
    2025, 19(4):  901-915.  DOI: 10.3778/j.issn.1673-9418.2409061
    As a core component of electronic products, the printed circuit board (PCB) directly affects product reliability through its quality. As electronic products become lighter, thinner, and more sophisticated, machine vision-based PCB defect detection faces challenges such as the difficulty of detecting tiny defects. To further the study of PCB defect detection technology, the algorithms of each stage are discussed in detail following their historical development. Firstly, the main challenges in the field are pointed out, and traditional PCB defect detection methods and their limitations are introduced. Then, from the perspectives of traditional machine learning and deep learning, this paper systematically reviews recent PCB defect detection methods together with their advantages and disadvantages. Next, it summarizes the commonly used evaluation metrics and mainstream datasets for PCB defect detection algorithms, compares the performance of research methods from the past three years on the PCB-Defect, DeeP-PCB, and HRIPCB datasets, and analyzes the reasons for the differences. Finally, based on the current situation and the open problems, future development trends are discussed.
    Survey of Multi-domain Machine Translation Methods for Fine-Tuning Large Models
    CHEN Zijian, WANG Siriguleng, SI Qintu
    2025, 19(4):  916-928.  DOI: 10.3778/j.issn.1673-9418.2410032
    With the rapid development of machine translation technology, machine translation methods based on pre-trained large models have occupied an important position in the field of natural language processing. However, due to the significant differences in language features, lexical styles and expressions between different domains, it is difficult for a single pre-trained model to achieve efficient and stable performance in multi-domain translation tasks. Therefore, this paper focuses on the key issues of large model fine-tuning technology in multi-domain machine translation tasks, systematically reviews the core principles, main methods and application effects of fine-tuning technology, and focuses on analyzing the performance and applicability scenarios of three types of strategies, namely full-parameter fine-tuning, parameter-efficient fine-tuning, and prompt-tuning. This paper discusses the advantages and limitations of different fine-tuning methods in depth, focusing on how to balance the domain generalization ability and task specificity through efficient fine-tuning strategies under resource-constrained conditions, and demonstrating the significant advantages of parameter-efficient fine-tuning and prompt-tuning in terms of resource utilization efficiency and domain adaptability. The practical effects of different fine-tuning strategies in terms of domain migration and resource utilization are further evaluated through comparative analysis and experimental validation, and their effectiveness is verified through case studies. Future research directions should focus on the efficient utilization of resources, the domain adaptive capability of models, and the improvement of translation quality and robustness, so as to promote the continuous development of multi-domain machine translation systems in terms of performance and adaptability.
    Theory·Algorithm
    Density Peak Clustering Algorithm Optimized by Weighted Shared Neighbors
    ZHANG Wenjie, XIE Juanying
    2025, 19(4):  929-944.  DOI: 10.3778/j.issn.1673-9418.2405064
    The local density definition of the DPC (clustering by fast search and find of density peaks) algorithm varies with the size of the dataset, the local density of a point is sensitive to the cutoff distance dc, and its single-step assignment strategy for the remaining points can cause a “domino effect”, so that DPC may fail to find the genuine clusters in a dataset. To address these limitations, this paper proposes a density peak clustering algorithm based on weighted shared neighbors (WSN-DPC). The algorithm uses a standard-deviation-weighted distance in place of the plain Euclidean distance, thereby highlighting the contributions of different features to the distances between points. Additionally, shared neighbor information is used to define the similarities between points, from which the local density and relative distance of a point are defined, so as to reflect the true distribution of points within a dataset as far as possible. Furthermore, distinct assignment strategies are applied in turn to outliers and non-outliers, so that each point is assigned to its most appropriate cluster. Extensive experiments on multiple datasets, together with statistical significance tests, demonstrate that the proposed WSN-DPC is superior to DPC and its variants while addressing the limitations of DPC.
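    For readers unfamiliar with the baseline, the two quantities that the original DPC algorithm computes for each point (local density under a cutoff distance dc, and the relative distance to the nearest point of higher density) can be sketched as follows. This illustrates the baseline DPC only, not WSN-DPC; the function name `dpc_scores` and the toy data are assumptions made for illustration.

```python
import math


def dpc_scores(points, dc):
    """Return (rho, delta) of the baseline DPC algorithm for a list of points."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Cutoff-kernel local density: number of other points closer than dc
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbor takes its largest distance to any point
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Two well-separated toy clusters, each with one central point
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
rho, delta = dpc_scores(points, dc=0.8)

# Candidate centers are the points with the largest rho * delta
ranking = sorted(range(len(points)), key=lambda i: rho[i] * delta[i], reverse=True)
centers = sorted(ranking[:2])
```

    Because `rho` is a raw neighbor count under a fixed `dc`, its values shift with both the dataset size and the choice of `dc`, which is the sensitivity the abstract describes.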
    Integration of Tangent Search and Competitive Mating in Zebra Optimization Algorithm and Its Application
    SU Chen, WANG Fangxiu, HUANG Zibo
    2025, 19(4):  945-963.  DOI: 10.3778/j.issn.1673-9418.2405072
    To address premature convergence and entrapment in local optima in the zebra optimization algorithm (ZOA), a tangent search-competitive mating zebra optimization algorithm (TZOA) is proposed. Firstly, the algorithm uses a tangent search strategy to increase population diversity and avoid becoming trapped in local optima, and uses a hyperbolic cosine factor as a regulatory parameter to avoid slowing convergence. Secondly, the grazing behavior of the wild horse optimizer (WHO) and the foraging behavior of the zebra optimization algorithm form a dual-group symbiotic strategy that improves global exploration and local convergence in the early stage of the algorithm. Then, a new competitive mating mechanism is added to further improve the diversity and local exploration scope of the population. Finally, experiments are conducted on 14 CEC2017 test functions in 10, 30, and 50 dimensions, comparing TZOA against the individual improvement strategies, strong algorithms from recent years, and improved ZOA variants proposed by other authors. Population diversity analysis, the Wilcoxon rank sum test, exploration and exploitation analysis, and runtime comparison graphs are used to verify the performance of the algorithm. The experimental results demonstrate that the proposed TZOA exhibits superior optimization performance and higher solution accuracy compared with several other intelligent optimization algorithms. TZOA is also applied to robot path planning problems, where it obtains the best results on both simple and complex map tests, further demonstrating its effectiveness.
    Structured Sparsity Graph Learning for Unsupervised Feature Extraction
    ZHU Yike, DING Jianhao, YIN Xuesong, WANG Yigang
    2025, 19(4):  964-975.  DOI: 10.3778/j.issn.1673-9418.2406069
    Unsupervised feature extraction has garnered increasing attention for alleviating the “curse of dimensionality” problem posed by high-dimensional data. However, existing methods typically construct low-rank graphs or nearest neighbor graphs to find the projection direction of high-dimensional data, overlooking the global structural correlation and sparsity of representation. To address these issues, a novel dimensionality reduction method called structured sparse graph learning-based unsupervised feature extraction (SSGL) is proposed. The SSGL method utilizes representation to construct nearest neighbor graphs between samples to preserve the local structure of the data and uses least squares regression to model the global structural correlation of the data. Consequently, the proposed SSGL can simultaneously preserve both the local and global structural correlations of the data. Moreover, SSGL employs sparse regularization to disconnect links between samples from different clusters in the affinity graph, thereby making the learned projection more discriminative. To validate the effectiveness of SSGL, extensive experiments are conducted on eight public image datasets. The results indicate that SSGL outperforms other advanced feature extraction methods in terms of clustering accuracy, significantly enhancing clustering results and classification performance.
    Graphics·Image
    Positional Enhancement TransUnet for Medical Image Segmentation
    ZHAO Liang, LIU Chen, WANG Chunyan
    2025, 19(4):  976-988.  DOI: 10.3778/j.issn.1673-9418.2406001
    Medical image segmentation can help doctors quickly and accurately identify organs and lesions in medical images, which is of great value in improving the efficiency of clinical diagnosis. U-Net combined with Transformer is the mainstream method in the field of medical image segmentation. However, Transformer is weak at extracting local information, and the U-Net structure loses detailed location information during upsampling and downsampling. To address these problems, this paper proposes PETransUnet, a TransUnet medical image segmentation network with enhanced position information. The network first uses the positional efficient attention block (PEA) to enhance the position information of features. Secondly, the dual attention bridge block (DAB) is used to bridge the semantic gap between the features in the encoding stage and the decoding stage. Finally, the cross-channel attention fusion block (CCAF) is used to reduce the position information lost during upsampling. The proposed method is validated on the publicly available Synapse dataset, achieving a Dice coefficient of 82.92% and an HD95 of 18.87%. On the ACDC dataset, a Dice coefficient of 90.73% is attained. On the LITS17 dataset, the Dice coefficients for liver and liver tumor segmentation are 94.85% and 74.47%, respectively. Comparative analysis with recent algorithms shows higher segmentation accuracy.
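    The Dice coefficient reported in this abstract measures the overlap between a predicted mask A and a reference mask B as 2|A ∩ B| / (|A| + |B|). A minimal sketch for binary masks follows; the flat 0/1 list representation and the function name are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat 0/1 lists."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks are treated as a perfect match
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical six-pixel masks
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)
```

    Here two of the three predicted foreground pixels overlap the three reference pixels, so the score is 4/6.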
    Cross-Modal Multi-level Feature Fusion for Semantic Segmentation of Remote Sensing Images
    LI Zhijie, CHENG Xin, LI Changhua, GAO Yuan, XUE Jingyu, JIE Jun
    2025, 19(4):  989-1000.  DOI: 10.3778/j.issn.1673-9418.2403082
    Multimodal semantic segmentation networks can leverage complementary information from different modalities to improve segmentation accuracy. Thus, they are highly promising for land cover classification. However, existing multimodal remote sensing image semantic segmentation models often overlook the geometric shape information of deep features and fail to fully utilize multi-layer features before fusion. This results in insufficient cross-modal feature extraction and suboptimal fusion effects. To address these issues, a remote sensing image semantic segmentation model based on multimodal feature extraction and multi-layer feature fusion is proposed. By constructing a dual-branch encoder, the model can separately extract spectral information from remote sensing images and elevation information from normalized digital surface model (nDSM), and deeply explore the geometric shape information of the nDSM. Furthermore, a cross-layer enrichment module is introduced to refine and enhance each layer’s features, making full use of multi-layer feature information from deep to shallow layers. The refined features are then processed through an attention feature fusion module for differential complementarity and cross-fusion, mitigating the differences between branch structures and fully exploiting the advantages of multimodal features, thereby improving the segmentation accuracy of remote sensing images. Experiments conducted on the ISPRS Vaihingen and Potsdam datasets demonstrate mF1 scores of 90.88% and 93.41%, respectively, and mean intersection over union (mIoU) scores of 83.49% and 87.85%, respectively. Compared with current mainstream algorithms, this model achieves more accurate semantic segmentation of remote sensing images.
    Edge-Segmentation Cross-Guided Camouflage Object Detection Network
    CHEN Peng, LI Xu, XIANG Dao’an, YU Xiaosheng
    2025, 19(4):  1001-1010.  DOI: 10.3778/j.issn.1673-9418.2403058
    Camouflage object detection based on edge-aware models is one of the mainstream approaches; its core idea is to output an edge prediction at an early stage so that camouflage objects can be better located and segmented. However, because camouflage objects are visually very similar to their background, the quality of early edge prediction is often low, and incorrect foreground predictions lead to incomplete segmentation or even missed objects, resulting in poor camouflage object segmentation. To address this issue, an edge-segmentation cross-guided camouflage object detection network (ECGNet) is proposed. Firstly, the ConvNeXt model is used as the backbone network, the feature channels are unified through 1×1 convolution, and global context information is extracted at multiple scales. Secondly, a segmentation-induced edge fusion module and an edge-perception guided integrity aggregation module are designed and cross-fused, focusing on the overall structure of the camouflage object and continuously refining the segmentation features and edge features. Finally, a guided residual channel attention module is used, whose connections and convolutions better extract structural details from low-level features. Experimental results on the CAMO, COD10K, and NC4K datasets show that ECGNet outperforms the other 22 representative models; compared with HitNet, the metrics S_α, E_φ, F^ω_β, and M are improved by 0.019, 0.019, 0.018, and 0.009 on average, respectively.
    Artificial Intelligence·Pattern Recognition
    Frequency Domain mixup Augmentation and logit Compensation for Self-Supervised Multi-label Imbalanced Electrocardiogram Classification
    CAO Siyuan, CHEN Songcan
    2025, 19(4):  1011-1020.  DOI: 10.3778/j.issn.1673-9418.2405065
    Self-supervised contrastive learning has proven effective at learning good feature representations by contrasting views produced through data augmentation, followed by fine-tuning for downstream (classification) tasks, and has therefore been widely applied. The electrocardiogram (ECG) is a non-invasive, low-risk, and low-cost signal source for cardiovascular diseases, and its classification aids the early prevention and precise treatment of conditions such as arrhythmia. However, most existing methods for ECG representation learning perform contrastive learning only through temporal perturbation augmentation of examples, overlooking the potential of frequency-domain information and leaving room for further improvement in representation quality. Therefore, a frequency domain mixup augmentation strategy is designed for ECG samples, which generates augmented samples for contrastive learning by exchanging frequency domain information between samples, thus addressing this shortcoming of existing ECG representation learning. In the downstream fine-tuning stage, considering that ECG classification inherently involves a multi-label class imbalance problem, this paper proposes mitigating the issue by incorporating label frequencies into the binary cross-entropy (BCE) loss as logit compensation. Finally, the model is evaluated on the CPSC2018 and Chapman datasets. Experimental results demonstrate that integrating the proposed method as an independent module into multiple baseline models improves performance in terms of the AUC and mAP metrics. In particular, significant improvements are observed for certain rare disease indicators, validating the effectiveness of the approach.
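    The abstract does not give the exact form of the logit compensation. One common way to fold label frequencies into a BCE loss is to shift each label's logit by the log-odds of its training-set frequency before applying the loss, and the sketch below assumes that form. The function name, the `tau` scaling factor, and the example frequencies are all hypothetical and are not taken from the paper.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def compensated_bce(logits, labels, priors, tau=1.0):
    """Mean BCE over labels after shifting each logit by tau * log(pi / (1 - pi)).

    logits: raw model outputs, one per label
    labels: 0/1 targets, one per label
    priors: assumed training-set frequency pi of each label, with 0 < pi < 1
    """
    total = 0.0
    for z, y, pi in zip(logits, labels, priors):
        z_adj = z + tau * math.log(pi / (1.0 - pi))
        # BCE with logits: softplus(z) - y * z
        total += softplus(z_adj) - y * z_adj
    return total / len(logits)


# A rare positive label (frequency 1%) is penalized more at the same raw logit
loss_balanced = compensated_bce([0.0], [1], [0.5])
loss_rare = compensated_bce([0.0], [1], [0.01])
```

    With this form, a rare label needs a larger raw logit to reach the same loss, which pushes the model to score rare positives more confidently; at inference time the raw, unshifted logits would be used.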
    Multimodal Rumor Detection Method Based on Multi-granularity Emotional Features of Image-Text
    LIU Xianbo, XIANG Ao, DU Yanhui
    2025, 19(4):  1021-1035.  DOI: 10.3778/j.issn.1673-9418.2406053
    Rumors involving public safety, disasters, and other mass incidents often contain rich emotional features in text or images, which easily mobilize netizens’ emotional responses, inducing them to like, comment, and share. However, existing multimodal rumor detection methods lack effective extraction techniques for the emotional features contained in multimodal data and fail to consider the interrelationship between modalities during feature fusion, resulting in redundant and less accurate feature representations. To explore the role of cross-modal emotional features in rumor detection, a multimodal rumor detection method that integrates multi-granularity emotional features of image-text is proposed. Without relying on social information such as comments and dissemination patterns, this method integrates multi-granularity emotional features into the multimodal rumor detection process. It employs a cross-modal multi-granularity emotional feature fusion method based on an interactive attention mechanism to fully integrate deep features of multimedia information. To evaluate the effectiveness of the proposed method, comparative and ablation experiments are conducted on two public datasets, Weibo and Twitter. The results indicate that the proposed method improves rumor detection accuracy to 0.912 on the Weibo dataset and 0.839 on the Twitter dataset, showing superior performance across multiple metrics such as F1 value, effectively enhancing rumor detection performance and the interpretability of the model. To some extent, it can assist public security agencies in handling rumors during mass incidents, providing technical support for grassroots police operations.
    Deep Behavior and Semantic Network for 12345 Hotline Event Dispatch
    CHEN Shun, YI Xiuwen, ZHANG Junbo, LI Tianrui, ZHENG Yu
    2025, 19(4):  1036-1047.  DOI: 10.3778/j.issn.1673-9418.2403017
    In China, citizens can seek help from the 12345 hotlines when they suffer from problems in daily life. After receiving requests from citizens, the hotline officer analyzes the demand of citizens and dispatches events to the corresponding government departments. Currently, the whole process mainly relies on manual work, which takes up a lot of human resources and leads to many incorrect dispatches. To improve the efficiency and accuracy of dispatching, in this paper, an efficient automatic data-driven event dispatch approach is proposed. Considering the historical dispatch records, event text and department responsibility, a deep behavior and semantic network (DBSN) for event dispatch is proposed. The network mainly consists of a history behavior encoding module, an event semantic learning module and a multi-dimension feature matching module. The history behavior encoding module builds a hierarchical bipartite graph network between different categories of events and departments, learning dispatch patterns through graph node embedding. The event semantic learning module uses the CNN and attention mechanism to learn the semantic information of event demand and department responsibility. The multi-dimension feature matching module matches events and departments from two dimensions including behavior and semantic features. Based on the 12345 Hotline data of a city, experimental results demonstrate the advantages of the proposed approach compared with baselines.
    Dual Graph Attention Model for Multivariate Time Series Anomaly Detection
    LI Hanzhang, YAN Xuanhui, LI Zhenli, YAN Yuwei, WANG Tingyin
    2025, 19(4):  1048-1064.  DOI: 10.3778/j.issn.1673-9418.2405053
    Time series anomaly detection is a well-established research area within sequential tasks, achieving significant results in both academia and industry. Addressing the multi-dimensional deep features and complex inherent dependencies in multivariate time series data, a novel anomaly detection model integrating spatiotemporal features is proposed. The model employs a graph attention network structure composed of a temporal graph attention network (T-GAT) and a spatial graph attention network (F-GAT). T-GAT constructs a unidirectional weighted graph where edges represent temporal dependencies, simulating prior information about the temporal graph structure and integrating it into the network to capture time-related relationships. F-GAT converts time series into frequency domain sequences represented by amplitudes and establishes a global bi-directional weighted graph to simulate associations between multivariate elements, using regularization to maintain sparsity among neighboring nodes and ensure accurate capture of spatial relationships. The model incorporates a multi-dimensional attention mechanism to effectively mine and utilize deep features across different characteristics. A gated recurrent unit further processes the spatiotemporal information, integrating it into comprehensive features, with anomalies identified by differences between predicted and observed values. Experimental results on four public datasets demonstrate that the model achieves advanced performance among twelve comparison models with superior F1 scores, and ablation studies confirm that the dual graph structure and attention mechanisms significantly enhance anomaly detection accuracy, effectively identifying anomalies in time series data.
    Incorporating Dynamic Convolution and Attention Mechanism in Multilayer Perceptron for Speech Emotion Recognition
    ZHANG Yumeng, ZHANG Xin, GAO Mou, ZHAO Hulin
    2025, 19(4):  1065-1075.  DOI: 10.3778/j.issn.1673-9418.2406008
    Speech emotion recognition technology infers the speaker’s emotions by analyzing vocal signals, enhancing the naturalness and intelligence of human-computer interaction. However, existing models often overlook the semantic information of time and frequency, which reduces recognition accuracy. To address this problem, a multi-layer perceptron model that integrates dynamic convolution and attention mechanisms is proposed, significantly improving the accuracy of emotion recognition and the efficiency of information utilization. Firstly, the input speech signals are transformed into a Mel-spectrogram to capture detailed signal variations and more accurately reflect human perception of sound, laying a foundation for subsequent feature extraction. The Mel-spectrogram is then tokenized to reduce data complexity. Next, dynamic convolution and split attention mechanisms are employed to extract key temporal-frequency features efficiently. Dynamic convolution adapts to scale changes across different time and frequency domains, thereby capturing features more efficiently. Meanwhile, the split attention mechanism enhances the ability of the model to focus on crucial information, effectively improving its feature expression capability. By combining the advantages of dynamic convolution and split attention mechanisms, the proposed model can fully extract crucial acoustic features, thereby achieving more efficient and accurate emotion recognition. Experiments conducted on the RAVDESS, EmoDB, and CASIA speech emotion databases show that the recognition accuracy of the proposed model significantly surpasses existing technologies, reaching 86.11%, 95.33%, and 82.92%, respectively. This verifies the effectiveness of the proposed model in complex emotion recognition tasks, as well as the efficacy of dynamic convolution and attention mechanisms.
    End-to-End Synthetic Speech Detection Based on Improved Deep Residual Shrinkage Networks
    ZENG Gaojun, LU Tianliang, REN Yingjie, LI Yujin, PENG Shufan
    2025, 19(4):  1076-1086.  DOI: 10.3778/j.issn.1673-9418.2404088
    The misuse of synthetic speech has led to numerous real-world problems. Researching corresponding anti-counterfeiting techniques is of great significance for protecting the personal and property safety of citizens and ensuring social and national security. Traditional synthetic speech detection often combines manually designed features with backend classifiers. The manual front-end features involve complex prior knowledge, and using a single manual feature model yields unsatisfactory detection results. However, fusing multiple features leads to a large number of model parameters. Moreover, most detection methods suffer from poor generalization across datasets. To address these issues, an end-to-end synthetic speech detection method based on an improved deep residual shrinkage network is proposed. Firstly, a channel attention mechanism is integrated to redesign the adaptive threshold learning module, improving the accuracy of threshold learning. Secondly, a frame attention mechanism module is designed and introduced to assign different attention levels to different frames, enhancing the model’s feature selection capability. Then, an improved wavelet threshold function with two hyperparameters is designed and introduced to enhance the ability of the thresholding module to suppress irrelevant features. Finally, an end-to-end synthetic speech detection network based on the improved deep residual shrinkage network is designed, which can determine whether the input raw speech is synthetic. Comparative experimental results on the ASVspoof2019 LA dataset show that the proposed method reduces the equal error rate and the minimum tandem detection cost function (min t-DCF) of the baseline model by 85% and 84%, respectively. Cross-database testing results based on the ASVspoof2015 LA dataset validate the generalization performance of the proposed method.
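    Deep residual shrinkage networks are built around soft thresholding, which sets features with magnitude below a threshold to zero and shrinks the rest toward zero. The sketch below shows only this standard function with a fixed threshold. The improved two-hyperparameter wavelet threshold function and the learned, attention-based thresholds described in the abstract are not reproduced here, since the abstract does not specify them.

```python
def soft_threshold(x, tau):
    """Standard soft thresholding: sign(x) * max(|x| - tau, 0)."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


# Hypothetical feature values and a fixed threshold, for illustration only
features = [-1.5, -0.2, 0.0, 0.3, 2.0]
shrunk = [soft_threshold(v, 0.5) for v in features]
```

    Small-magnitude features, which a shrinkage network treats as noise, are zeroed, while large ones keep their sign and lose `tau` in magnitude.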
    Network·Security
    Surname Password Guessing Method Based on GPT-2
    LIN Jiaxi, QIAN Qiuyan, ZENG Jianping, ZHANG Weidong
    2025, 19(4):  1087-1094.  DOI: 10.3778/j.issn.1673-9418.2407028
    Abstract ( 39 )   PDF (1085KB) ( 28 )  
    References | Related Articles | Metrics
    As authentication mechanisms diversify, passwords, as a traditional and widely adopted authentication method, face severe security challenges. Due to linguistic characteristics and cultural differences, Chinese users' password choices differ significantly from those of English-speaking users, providing new perspectives for guessability attacks. To address this issue, this paper proposes a Chinese surname-based password guessing method using the GPT-2 model, aiming to effectively enhance the guessing capability for Chinese passwords. The proposed method employs unsupervised fine-tuning to enable the pre-trained language model to generate passwords closely related to surnames. To compensate for GPT-2's lack of support for Chinese characters, this model leverages a news corpus as the pre-training dataset, converting Chinese text into Pinyin and training the model to recognize Pinyin, thereby helping the model more accurately understand Chinese users' password habits. Experimental results demonstrate that the proposed model exhibits superior performance in password guessing tasks, particularly in resource-constrained environments, achieving higher success rates compared with traditional guessing methods and deep learning-based password attack techniques. Additionally, this paper explores the impact of temperature parameters on the success rate of password guessing, identifying potential directions for further improving password security.
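    The temperature parameter discussed in the abstract is, in its usual form, a divisor applied to a language model's logits before the softmax. The sketch below shows that general mechanism only; it is not the authors' code, and the function names and toy logits are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
    for t in (0.5, 1.0, 2.0):
        print(t, [round(p, 3) for p in softmax_with_temperature(toy_logits, t)])
```

    A lower temperature concentrates probability on the highest-scoring tokens, while a higher temperature flattens the distribution and makes generated guesses more diverse.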
    S-Box Construction and Optimization Method Based on Composite Chaotic System
    WU Xiaonian, WU Ting, HUANG Zhaowen, ZHANG Runlian
    2025, 19(4):  1095-1104.  DOI: 10.3778/j.issn.1673-9418.2406085
    Abstract ( 55 )   PDF (2419KB) ( 43 )  
    References | Related Articles | Metrics
    The S-box is the only nonlinear component of a block cipher, and its quality determines the security strength of the cryptographic algorithm. To efficiently construct S-boxes with excellent and stable cryptographic properties, an 8-bit S-box construction and optimization method based on a composite chaotic system is proposed. Firstly, an extended tent mapping is obtained by extending the value domain of the tent chaotic mapping, and a composite chaotic system with excellent chaotic properties is constructed by combining the extended tent mapping with the extended logistic mapping. Subsequently, after 50 iterations to eliminate the transient effects of the chaotic system, the composite chaotic system is used to generate random sequences from which initial 8-bit S-boxes are constructed. Furthermore, for initial S-boxes with poor cryptographic properties, an optimization objective constraint function is designed to balance the differential uniformity and linearity of the S-boxes. An iterative optimization then searches, according to the differential and linear distributions of each S-box, for data that make these distributions more uniform, lowering the differential uniformity and linearity as much as possible and improving the resistance of the S-box to differential and linear analysis. The experimental results show that the method can optimize all the initial S-boxes with poor cryptographic properties, with the differential uniformity reaching 8 and the nonlinearity reaching 102. The proposed method also optimizes quickly, requiring as few as 33 iterations to complete the optimization.
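    The construction and evaluation steps described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the plain logistic map stands in for the composite chaotic system (whose definition is not given in the abstract), the rank-ordering step used to obtain a bijection is an assumed technique, the optimization stage is omitted, and the seed `x0` and parameter `r` are hypothetical. Only the 50 discarded iterations and the two reported metrics come from the abstract.

```python
def chaotic_sbox(x0=0.3141, r=3.99, burn_in=50, size=256):
    """Build an S-box by rank-ordering a chaotic sequence.

    The logistic map is a stand-in for the paper's composite system.
    Sorting indices by chaotic value always yields a permutation.
    """
    x = x0
    for _ in range(burn_in):  # discard transient iterations
        x = r * x * (1.0 - x)
    seq = []
    for _ in range(size):
        x = r * x * (1.0 - x)
        seq.append(x)
    return sorted(range(size), key=seq.__getitem__)


def differential_uniformity(sbox):
    """Largest count of S(x) ^ S(x ^ a) == b over all a != 0 and all b."""
    n = len(sbox)
    best = 0
    for a in range(1, n):
        counts = [0] * n
        for x in range(n):
            counts[sbox[x] ^ sbox[x ^ a]] += 1
        best = max(best, max(counts))
    return best


def nonlinearity(sbox):
    """Minimum nonlinearity over all nonzero output masks (fast Walsh transform)."""
    n = len(sbox)
    worst = 0
    for b in range(1, n):
        # +1/-1 truth table of the component function b . S(x)
        w = [1 - 2 * (bin(b & sbox[x]).count("1") & 1) for x in range(n)]
        h = 1
        while h < n:  # in-place Walsh-Hadamard butterflies
            for i in range(0, n, 2 * h):
                for j in range(i, i + h):
                    u, v = w[j], w[j + h]
                    w[j], w[j + h] = u + v, u - v
            h *= 2
        worst = max(worst, max(abs(t) for t in w))
    return n // 2 - worst // 2


if __name__ == "__main__":
    sbox = chaotic_sbox()
    print("differential uniformity:", differential_uniformity(sbox))
    print("nonlinearity:", nonlinearity(sbox))
```

    Lower differential uniformity and higher nonlinearity indicate better resistance to differential and linear analysis, so an optimization stage like the one in the abstract would try to reduce the first value and raise the second.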
    Big Data Technology
    Social Knowledge-Aware Network Recommendation Algorithm
    JIN Haibo, FENG Yujing
    2025, 19(4):  1105-1114.  DOI: 10.3778/j.issn.1673-9418.2403047
    Abstract ( 70 )   PDF (1691KB) ( 53 )  
    References | Related Articles | Metrics
    Recommendation algorithms can quickly discover items that users like and make effective recommendations, greatly reducing users’ search time. However, although existing recommendation algorithms can make recommendations based on characteristics such as user preferences or item similarity, they still suffer from problems such as user and item cold starts and data noise. To solve these problems, a social knowledge-aware network recommendation algorithm (SAGN) is proposed. The algorithm injects knowledge of the interdependence between items and users, together with knowledge of the correlation between users, into the feature calculation of users and items. On the user side, knowledge-aware networks process the browsing records of users and their friends to obtain multiple preference features, which are combined through an adaptive attention gating mechanism to generate user preference feature vectors. On the item side, the set of user friends associated with the item to be predicted is obtained, their browsing history is used as the initial entity set of the item, and the knowledge-aware network extracts item feature vectors based on the preferences of the user and the user's friends. To verify the effectiveness of the algorithm, comparative experiments are conducted on the real datasets Ciao and Epinions against algorithms such as SocialFD, GraphRec, SREPS, HGCL, and KR-GCN. Experimental results show that, compared with the best-performing baseline model, the RMSE and MAE of the SAGN algorithm on the Epinions dataset are improved by 2.14% and 1.74%, respectively, and the RMSE and MAE on the Ciao dataset are improved by 1.81% and 1.79%, respectively.
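    RMSE and MAE are error metrics for rating prediction, so lower values indicate better predictions. The sketch below shows how the two metrics and a relative percentage change are usually computed; the helper names and the toy ratings are hypothetical and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error * 100.0


if __name__ == "__main__":
    true_ratings = [4.0, 3.0, 5.0, 2.0]  # toy data, not from the paper
    baseline_pred = [3.0, 3.5, 4.0, 3.0]
    model_pred = [3.5, 3.0, 4.5, 2.5]
    base, ours = rmse(true_ratings, baseline_pred), rmse(true_ratings, model_pred)
    print(f"RMSE reduced by {relative_reduction(base, ours):.2f}%")
```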