Content of Big Data Technology in our journal

    Group Recommendation Model Based on User Common Intention and Social Interaction
    QIAN Zhongsheng, ZHANG Ding, LI Duanming, WANG Yahui, YAO Changsen, YU Qingyuan
    Journal of Frontiers of Computer Science and Technology    2024, 18 (5): 1368-1382.   DOI: 10.3778/j.issn.1673-9418.2304025
    Existing group recommendation models often rely on a single approach to learning user representations and exploit only simple social relationships between users. This makes user representations inaccurate, and most models ignore the impact of user common intention and social interaction on group preferences. As a result, recommended items are poorly aligned with user needs. To address these issues, a new group recommendation model based on user common intention and social interaction (GR-UCISI) is proposed. First, a user intention separation model that combines user-item interaction history with social interaction is constructed. Graph neural networks are used to gather user-item interaction and social interaction information and to learn user intention and item representations. Second, users are grouped using a random walk algorithm on the social network together with the K-means clustering algorithm. The user groups, the user intentions, and a group intention aggregation process are then combined to obtain a representation of the group's common intention. Finally, the group common intention representation and the item representations are used to compute the list of recommended items for the group. This method takes into account both user individuality and the commonality among group members in shaping group preferences. It also uses social relationships to alleviate data sparsity and improve model performance. Experimental results show that, compared with the best-performing of nine compared models, GR-UCISI improves Precision and NDCG by 3.01% and 5.26% respectively on the Gowalla dataset, and by 2.96% and 1.12% respectively on the Yelp-2018 dataset.
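The final step of the abstract above (aggregating member intentions into a group representation, then scoring items against it) can be illustrated with a minimal sketch. It is not the GR-UCISI implementation: mean pooling as the aggregator, inner-product scoring, and the names `group_representation`, `recommend`, `user_intentions` and `item_vectors` are all assumptions made for illustration, and the toy vectors stand in for representations that the paper learns with graph neural networks.

```python
from statistics import fmean

def group_representation(member_vectors):
    """Mean-pool member intention vectors into one group vector (an assumed aggregator)."""
    return [fmean(dim) for dim in zip(*member_vectors)]

def recommend(group_vector, item_vectors, top_n=2):
    """Rank items by inner product with the group vector and return the top_n item ids."""
    scores = {
        item_id: sum(g * v for g, v in zip(group_vector, vec))
        for item_id, vec in item_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy data (hypothetical): two-dimensional intention vectors for one group of three users.
user_intentions = {"u1": [1.0, 0.0], "u2": [0.8, 0.2], "u3": [0.6, 0.4]}
item_vectors = {"cafe": [1.0, 0.0], "gym": [0.0, 1.0], "park": [0.5, 0.5]}

group_vec = group_representation(list(user_intentions.values()))
top_items = recommend(group_vec, item_vectors)
print(top_items)
```

In this sketch the group membership is given up front; in the paper it would come from the random walk and K-means grouping step.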
    Efficient Data Cleaning Framework for K-Nearest Neighbor Learning Models
    WANG Jingyi, CHEN Yinjia, YUAN Ye, CHEN Chen, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2023, 17 (9): 2241-2251.   DOI: 10.3778/j.issn.1673-9418.2207105
    Real-world datasets often contain missing data, and building effective machine learning models on incomplete datasets requires cleaning them first. Ensuring the quality of the cleaned datasets often requires human involvement, which incurs considerable costs. Prioritizing which incomplete data to clean helps minimize the scale of cleaning and saves labor costs. Calculating this priority requires determining how much each piece of incomplete data contributes to the performance of the model. The Shapley value is a popular method for evaluating such contributions, so it can be used to calculate the cleaning priority. Because existing work does not address the Shapley value of incomplete data, a representation of the Shapley value of incomplete data is first defined based on the possible worlds of the datasets. Then, based on the K-nearest neighbor utility, a polynomial-time approximation algorithm is proposed for calculating the Shapley value of incomplete data in the K-nearest neighbor classification model. Finally, ShapClean, a heuristic data cleaning algorithm based on the Shapley value, is proposed. Experiments show that the algorithm often significantly outperforms existing automatic cleaning algorithms in terms of accuracy. Compared with data cleaning algorithms that also require human involvement, ShapClean saves more labor costs while ensuring the desired model accuracy.
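To make the idea of a Shapley value under a K-nearest neighbor utility concrete, here is a minimal sketch for complete data. It is not ShapClean and not the paper's polynomial-time approximation for incomplete data: it enumerates every ordering of the training points, which is only feasible for tiny inputs. The one-dimensional features, the utility definition in `knn_utility`, and all names are assumptions for illustration.

```python
from itertools import permutations

def knn_utility(subset, train, test_point, k):
    """Fraction of the k nearest points in `subset` whose label matches the test label."""
    x_test, y_test = test_point
    nearest = sorted(subset, key=lambda i: abs(train[i][0] - x_test))[:k]
    return sum(train[i][1] == y_test for i in nearest) / k

def shapley_values(train, test_point, k):
    """Exact Shapley values: average each point's marginal gain over all orderings."""
    n = len(train)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        chosen = []
        previous = 0.0  # utility of the empty set
        for i in order:
            chosen.append(i)
            current = knn_utility(chosen, train, test_point, k)
            totals[i] += current - previous
            previous = current
    return [t / len(orderings) for t in totals]

# Toy data (hypothetical): (feature, label) pairs and one labeled test point.
train = [(0.1, 1), (0.2, 0), (0.9, 1), (1.5, 0)]
test_point = (0.0, 1)
values = shapley_values(train, test_point, k=2)
print(values)
```

The values sum to the utility of the full training set, and a nearby point with the wrong label receives a negative value, which is the kind of signal a cleaning priority could be built on.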
    MaSS: Model Pricing Marketplace Based on Unit Data Contribution
    ZHANG Xiaowei, JIANG Dong, YUAN Ye, AN Lixia, WANG Guoren
    Journal of Frontiers of Computer Science and Technology    2023, 17 (9): 2252-2264.   DOI: 10.3778/j.issn.1673-9418.2207106
    Data-driven machine learning models have become ubiquitous. However, there is still very little research on how to promote the development of a market for machine learning models. Existing research falls mainly into two areas: the interaction between data owners and brokers, that is, the compensation of the data owners, and the interaction between model buyers and brokers, that is, the expense of the model buyers. For the model market, however, these issues are inseparable. Therefore, this paper takes a formal data marketplace perspective and proposes a novel model marketplace based on a three-stage hierarchical Stackelberg game and the Shapley value (MaSS). MaSS adopts the Shapley value as a new utility evaluation index. On this basis, the paper proposes a model trading framework built on a three-stage Stackelberg game with three types of participants: model buyers, brokers and data owners. The data owners provide the brokers with private data. The brokers process the data into the models needed by the model buyers and sell the models to the buyers for profit. The participants interact with each other to form a Stackelberg game that maximizes the profits of all parties involved in the transaction. The existence and uniqueness of the equilibrium pricing strategy are proven theoretically. Finally, the remarkable performance of MaSS is demonstrated by extensive simulations on real data.
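The leader-follower reasoning behind a Stackelberg game can be sketched with a deliberately simplified two-stage example. It is not the three-stage MaSS game: the quadratic buyer utility, the linear broker cost, the parameters `a` and `cost`, and all function names are assumptions chosen so that the equilibrium has a known closed form, price = (a + cost) / 2.

```python
def buyer_best_response(price, a=10.0):
    """Follower: demand q maximizing a*q - q**2/2 - price*q, i.e. q = max(a - price, 0)."""
    return max(a - price, 0.0)

def broker_profit(price, cost=2.0, a=10.0):
    """Leader profit when the broker anticipates the buyer's best response."""
    q = buyer_best_response(price, a)
    return (price - cost) * q

def solve_leader_price(cost=2.0, a=10.0, steps=1000):
    """Grid search over prices in [0, a] for the profit-maximizing price."""
    candidates = [a * i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda p: broker_profit(p, cost, a))

# Backward induction: the follower's response is solved first, then the leader optimizes.
best_price = solve_leader_price()
best_demand = buyer_best_response(best_price)
print(best_price, best_demand)
```

With the default parameters the grid search lands on the closed-form price (10 + 2) / 2 = 6. A three-stage version would add the data owners as a further layer of followers responding to the broker's compensation offer.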