Content of Big Data Technology in our journal

    Group Recommendation Model Based on User Common Intention and Social Interaction
    QIAN Zhongsheng, ZHANG Ding, LI Duanming, WANG Yahui, YAO Changsen, YU Qingyuan
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1368-1382.   DOI: 10.3778/j.issn.1673-9418.2304025
    Existing group recommendation models tend to rely on a single approach to learning user representations and exploit only simple social relationships between users. As a result, user representations are inaccurate, and most models ignore the impact of user common intention and social interaction on group preferences, so the recommended items do not match user needs. To address these issues, a group recommendation model based on user common intention and social interaction (GR-UCISI) is proposed. Firstly, a user intention separation model that combines user-item interaction history with social interaction is constructed. Graph neural networks are used to collect user-item interaction and social interaction information and to learn user intention and item representations. Secondly, users are grouped using a social network random walk algorithm and the K-means clustering algorithm. The user groups, user intentions and a group intention aggregation process are combined to obtain the group common intention representation. Finally, the group common intention representation and the item representations are used to compute the list of recommended items for the group. This method fully considers the impact of both user individuality and the commonality among group members on group preferences. It also uses social relationships to alleviate data sparsity and improve model performance. Experimental results show that, compared with the best-performing of the nine compared models, GR-UCISI improves Precision and NDCG by 3.01% and 5.26% respectively on the Gowalla dataset, and by 2.96% and 1.12% respectively on the Yelp-2018 dataset.
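    The grouping step can be illustrated with a minimal sketch. It assumes each user already has an embedding vector (for example one derived from random walks over the social network; that step is not shown) and clusters those vectors with plain Lloyd-style K-means. The function name `kmeans`, the deterministic initialisation and the toy vectors are assumptions made for illustration, not details taken from the paper.

```python
def kmeans(points, k, iters=20):
    """Cluster points (sequences of floats) into k groups with Lloyd's algorithm.

    Assumption: the first k points are used as initial centres so the
    result is deterministic; real systems usually use random or k-means++ init.
    """
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centre (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical 2-D user embeddings: two users near the origin, two far away.
user_vectors = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(user_vectors, k=2)
print(labels)
```

    Users that share a label would form one candidate group, whose members' intentions are then aggregated as the abstract describes.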
    Efficient Data Cleaning Framework for K-Nearest Neighbor Learning Models
    WANG Jingyi, CHEN Yinjia, YUAN Ye, CHEN Chen, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2023, 17 (9): 2241-2251.   DOI: 10.3778/j.issn.1673-9418.2207105
    Real-world datasets often contain missing data, and they need to be cleaned before effective machine learning models can be built on them. Ensuring the quality of the cleaned datasets often requires human involvement, which incurs considerable cost. Prioritizing which incomplete data to clean helps minimize the scale of cleaning and save labor costs. Calculating the priority requires determining the contribution of the incomplete data to the performance of the model. The Shapley value is a popular method for evaluating contribution, so it can be used to calculate the cleaning priority. Because existing work does not address the Shapley value of incomplete data, a representation of the Shapley value of incomplete data is first defined based on the possible worlds of the datasets. Then, based on the K-nearest neighbor utility, an approximation algorithm is proposed that calculates the Shapley value of incomplete data in the K-nearest neighbor classification model in polynomial time. Finally, ShapClean, a heuristic data cleaning algorithm based on the Shapley value, is proposed. Experiments show that the algorithm often significantly exceeds existing automatic cleaning algorithms in accuracy. Compared with data cleaning algorithms that also require human involvement, ShapClean saves more labor costs while ensuring the desired model accuracy.
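    As background for the K-nearest neighbor utility, the sketch below computes exact Shapley values for complete training points with respect to a single test point, using the well-known sorted-neighbor recursion for the unweighted KNN utility. It is not the paper's algorithm for incomplete data or possible worlds; the function name `knn_shapley`, the one-dimensional features and the toy data are assumptions for illustration.

```python
def knn_shapley(x_train, y_train, x_test, y_test, k):
    """Exact Shapley value of each training point for one test point.

    Utility of a subset: fraction (out of k) of its k nearest points to
    x_test whose label equals y_test. Features are plain floats here
    (an assumption that keeps the distance computation trivial).
    """
    n = len(x_train)
    # Sort training indices from nearest to farthest from the test point.
    order = sorted(range(n), key=lambda i: abs(x_train[i] - x_test))
    match = [1.0 if y_train[i] == y_test else 0.0 for i in order]

    sorted_values = [0.0] * n
    # Farthest point: it only matters when it is the sole candidate.
    sorted_values[n - 1] = match[n - 1] / n
    # Walk inward from the farthest to the nearest point.
    for pos in range(n - 2, -1, -1):
        rank = pos + 1  # 1-based rank of this point by distance
        sorted_values[pos] = sorted_values[pos + 1] + (
            (match[pos] - match[pos + 1]) / k * min(k, rank) / rank
        )

    # Map values back to the original training order.
    values = [0.0] * n
    for pos, idx in enumerate(order):
        values[idx] = sorted_values[pos]
    return values


# Toy data: three 1-D points, test point close to the first one.
values = knn_shapley([0.0, 1.0, 2.0], [1, 0, 1], x_test=0.1, y_test=1, k=1)
print(values)
```

    A cleaning heuristic could then, for instance, rank rows by such values; how ShapClean actually orders incomplete rows is not specified in the abstract.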
    MaSS: Model Pricing Marketplace Based on Unit Data Contribution
    ZHANG Xiaowei, JIANG Dong, YUAN Ye, AN Lixia, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2023, 17 (9): 2252-2264.   DOI: 10.3778/j.issn.1673-9418.2207106
    Data-driven machine learning models have become ubiquitous. However, there is still very little research on how to promote the development of a market for machine learning models. Existing research falls mainly into two areas: the interaction between data owners and brokers, that is, the compensation of the data owners, and the interaction between model buyers and brokers, that is, the expense of the model buyers. For the model market, however, these issues are indivisible. Therefore, this paper takes a formal data marketplace perspective and proposes a novel model marketplace based on a three-stage hierarchical Stackelberg game and the Shapley value (MaSS). MaSS adopts the Shapley value as a new utility evaluation index. This paper then proposes a model trading framework built on a three-stage Stackelberg game based on the Shapley value, with three parties: model buyers, brokers and data owners. The data owners provide the brokers with private data. The brokers process the data into the models needed by the model buyers and sell the models to the buyers for profit. The parties interact with each other to form a Stackelberg game that maximizes the profits of all those involved in the transaction. The existence and uniqueness of the equilibrium pricing strategy are proven theoretically. Finally, the remarkable performance of MaSS is demonstrated by extensive simulations on real data.
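    To make the Shapley value concrete, the following sketch computes it exactly by enumerating coalitions, which is only feasible for a handful of participants. The utility function here is a made-up toy (a model is worth 1 only if owner "A" and at least one other owner contribute data); the function name `shapley_values` and the owner names are hypothetical and do not come from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley value of each player for a coalition utility function.

    utility takes a frozenset of players and returns a number.
    Cost grows exponentially with len(players), so this is for tiny examples.
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding p.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def toy_utility(coalition):
    # Hypothetical utility: useful only with owner "A" plus one more owner.
    return 1.0 if "A" in coalition and len(coalition) >= 2 else 0.0


print(shapley_values(["A", "B", "C"], toy_utility))
```

    The values sum to the utility of the full coalition, which is the property that makes the Shapley value a natural basis for splitting revenue among data owners.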
    TD-H2H: Shortest Path Query on Time-Dependent Graphs
    LI Xinling, WANG Yishu, YUAN Ye, GU Xiang, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2023, 17 (5): 1210-1224.   DOI: 10.3778/j.issn.1673-9418.2111064
    The shortest path query on road networks is a fundamental problem that has been studied widely. Existing studies usually model a road network as a static graph and query the path with the shortest distance between given vertices. However, road networks are time-dependent, so modeling a road network as a time-dependent graph is more realistic. Compared with a static graph, a time-dependent graph is larger and more complex, which makes the time-dependent shortest path query more difficult. The time-dependent shortest path is the path with the shortest travel time between a source and a destination in a time-dependent graph for a given departure time. The result therefore depends on the given departure time, which brings a new challenge for the query. These difficulties make traditional shortest path algorithms inapplicable to the time-dependent shortest path query. This paper models the road network as a time-dependent graph and proposes the TD-H2H (time-dependent hierarchical 2-hop) index based on tree decomposition, which can be used to solve the time-dependent shortest path problem quickly and accurately. Firstly, a time-dependent tree decomposition algorithm based on traditional tree decomposition is presented to transform the time-dependent graph into a tree structure. Then, the index structure is derived from the tree decomposition, and an efficient index construction algorithm, denoted TD-H2H, is proposed. Finally, based on the TD-H2H index, an efficient time-dependent shortest path query algorithm, named TD-OAI, is designed. The proposed algorithms are compared with existing algorithms on 4 real-world datasets. Experimental results show that the query speed of the proposed algorithm is 1 to 2 orders of magnitude better than that of existing algorithms, demonstrating the effectiveness and efficiency of the proposed algorithms.
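    The problem definition above can be illustrated with a baseline that uses no index at all: a Dijkstra-style search in which each edge carries a travel-time function of the entry time. This is a sketch of the query semantics, not of TD-H2H or TD-OAI. It assumes the FIFO property (leaving later never means arriving earlier); the graph layout, the function name `td_earliest_arrival` and the toy travel-time functions are illustrative assumptions.

```python
import heapq


def td_earliest_arrival(graph, source, target, depart):
    """Earliest arrival time at target when leaving source at time depart.

    graph maps a vertex to a list of (neighbor, travel_time_fn) pairs, where
    travel_time_fn(t) is the non-negative time to cross the edge when
    entering it at time t. Correct under the FIFO assumption.
    """
    best = {source: depart}
    heap = [(depart, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, travel_time in graph.get(u, []):
            arrive = t + travel_time(t)
            if arrive < best.get(v, float("inf")):
                best[v] = arrive
                heapq.heappush(heap, (arrive, v))
    return None  # target unreachable


# Toy network: the detour via "c" is fast only before time 5 (e.g. rush hour after).
toy_graph = {
    "a": [("b", lambda t: 10), ("c", lambda t: 2)],
    "c": [("b", lambda t: 3 if t < 5 else 20)],
}
print(td_earliest_arrival(toy_graph, "a", "b", depart=0))
print(td_earliest_arrival(toy_graph, "a", "b", depart=4))
```

    The two calls choose different routes for the same source and destination, which is exactly why the departure time has to be part of the query.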
    Integrating Time Context and Feature-Level Information for Recommendation Algorithm
    SHEN Yifeng, JIN Chenxi, WANG Yao, ZHANG Jiaxiang, LU Xianling
    Journal of Frontiers of Computer Science and Technology    2023, 17 (2): 489-498.   DOI: 10.3778/j.issn.1673-9418.2105008
    Sequential recommendation models based on the self-attention mechanism ignore many kinds of auxiliary information and therefore cannot use it to capture multi-level sequence relationship patterns. To address this problem, a recommendation algorithm integrating time context and feature-level information (ITFR) is proposed. Firstly, the item representation is concatenated with each of its attribute representations and fed into an attention network. After attention weighting, an attribute-based item representation is obtained. Then, ITFR applies a time-interval-aware self-attention block and an item-attribute self-attention block to capture, respectively, the relationship pattern between items and the time intervals of the interaction sequence, and the implicit relationship between items and attributes. Finally, the output representations of the two self-attention blocks are concatenated, and the joint output representation is fed into a fully connected layer to recommend the next item. Experiments are conducted on two public datasets, with hit rate (HR) and normalized discounted cumulative gain (NDCG) used as evaluation metrics. On the Beauty dataset, compared with the best baseline method, HR@10 and NDCG@10 are increased by 4.6% and 5.1%, respectively. On the Movie-1M dataset, HR@10 and NDCG@10 are increased by 1.7% and 1.5%, respectively. Experimental results show that incorporating auxiliary information to enhance sequence representation can improve recommendation performance.
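    The two reported metrics can be sketched as follows, assuming the common next-item protocol in which each test case has exactly one held-out target item and the model returns a ranked candidate list. The abstract does not state the evaluation protocol, so this setup, the function names and the toy rankings are assumptions.

```python
import math


def hit_rate_at_k(target, ranked_items, k):
    """1.0 if the target appears among the top-k ranked items, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(target, ranked_items, k):
    """NDCG@k with a single relevant item: 1 / log2(rank + 1), or 0 if missed."""
    top = ranked_items[:k]
    if target not in top:
        return 0.0
    rank = top.index(target) + 1  # 1-based position in the ranking
    return 1.0 / math.log2(rank + 1)


def evaluate(cases, k=10):
    """Average HR@k and NDCG@k over (target, ranked_items) pairs."""
    hr = sum(hit_rate_at_k(t, r, k) for t, r in cases) / len(cases)
    ndcg = sum(ndcg_at_k(t, r, k) for t, r in cases) / len(cases)
    return hr, ndcg


# Hypothetical rankings for two users; item ids are arbitrary strings.
cases = [
    ("i3", ["i3", "i7", "i1"]),  # target ranked first
    ("i9", ["i2", "i4", "i5"]),  # target missing from the list
]
print(evaluate(cases, k=10))
```

    HR@k only records whether the target was retrieved, while NDCG@k also rewards placing it near the top, so the two numbers are usually reported together.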
    Attention-aware Next Event Recommendation Strategy for Groups
    LIAO Guoqiong, YANG Lechuan, WAN Changxuan, LIU Dexi, LIU Xiping
    Journal of Frontiers of Computer Science and Technology    2023, 17 (2): 499-510.   DOI: 10.3778/j.issn.1673-9418.2107034
    In recent years, event-based social networks (EBSN) have gradually become an effective way for people to choose social events, and how to accurately recommend events to users or groups in need has become an important topic in this field. Next item recommendation can capture users’ dynamic preferences and has been well developed in e-commerce and other fields. However, there is little research on next event recommendation for groups in EBSN. This paper studies a group-oriented next event recommendation strategy. The dynamic change of group preferences, the short life cycle of events and the cold start of new events make it more difficult to recommend the next event for groups. Firstly, because group preferences change dynamically over time, the historical interaction records of a group are divided into periods. The period division makes member data sparse, which is unfavorable to group preference modeling. To address this, an engagement-based ranking strategy is proposed to extract the preferences of core members in the current period, and an attention mechanism is used to fuse them into the group static preference. Then, the group dynamic preference is obtained by combining the static preference of each period with an attention-based sequential model. Finally, multi-label classification is introduced into event recommendation: contexts are treated as labels of the event, and the model predicts the probability distribution over contexts to match events, so as to alleviate the new event cold start problem. Experimental results verify that the proposed strategy performs well.
    Recommendation Algorithm Combining Social Relationship and Knowledge Graph
    GAO Yang, LIU Yuan
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 238-250.   DOI: 10.3778/j.issn.1673-9418.2112088
    Recommendation systems help users find useful information quickly and improve retrieval efficiency. However, recommendation systems suffer from problems such as data sparsity and cold start. Most existing recommendation algorithms that integrate social relations ignore the sparsity of social relation data, and few algorithms integrate social relations and item attribute data at the same time. To solve these problems, this paper proposes a multi-task feature learning approach for social relationship and knowledge graph enhanced recommendation (MSAKR). Firstly, the algorithm extracts the user’s social relations through a graph convolutional neural network to obtain the user’s feature vector, then selects neighbors by graph centrality and generates virtual neighbors with the word2vec model, so as to alleviate the sparsity of the social data. An attention mechanism is used to aggregate the neighbors. Secondly, multi-task learning and a semantic-based matching model are used to extract information from the attribute knowledge graph and obtain the feature vector of the item. Finally, recommendations are made to the user based on the obtained user and item feature vectors. To assess the performance of the recommendation algorithm, experiments are carried out on the real datasets Douban and Yelp. Click-through rate prediction and Top-K recommendation are each used to evaluate the performance of the model. Experimental results show that the proposed model is superior to other benchmark models.
    Probabilistic Recommendation Model Integrating Personality Features
    SHEN Tiesunlong, FU Xiaodong, YUE Kun, LIU Li, LIU Lijun
    Journal of Frontiers of Computer Science and Technology    2023, 17 (1): 251-262.   DOI: 10.3778/j.issn.1673-9418.2106111
    Model-based collaborative filtering algorithms need to analyze and extract the basic “user-item” feature matrix for user recommendation. Different user characteristics directly lead to different user first foundations. However, model-based collaborative filtering algorithms only consider the analysis and extraction of the key factors that affect item characteristics, and not the important factors that affect user characteristics. This type of traditional model often initializes the user’s latent feature vector randomly and assigns it an assumed normal distribution, so no data in these models can directly influence the modeling of the user’s latent features. In addition, user-based recommendation system models directly use the user’s comments and ratings as user characteristics. The way traditional recommendation systems use these data, and the data themselves, are not enough to capture the essential characteristics of users, and approximations of these features cannot meet the needs of personalized recommendation. Existing recommendation algorithms do not analyze and extract the essential features of users, extract the essential features of items insufficiently, and produce recommendation results that hardly reflect the user’s personality. To address these problems, a recommendation model that integrates personality features is proposed. Firstly, the unstructured review text of users on the recommendation platform is transformed, personality characteristics are used as the direct influencing factors of user characteristics, a neural network model is designed to calculate the Big Five personality scores of the reviewing user, and the personality scores are vectorized as the user characteristics. Then, the item characteristics are obtained from the item review text. This paper designs a collaborative learning framework for personality perception and defines a loss function to obtain the feature vectors of users and items. Finally, recommendations are made to target users based on the resulting user and item representations. A comprehensive experimental verification is carried out on 3 datasets. The results show that the algorithm outperforms the comparison algorithms in terms of prediction accuracy, F1 value, AUC, etc. Through personality modeling, it can recommend items that are more in line with users’ preferences.