Content of System Software and Software Engineering in our journal

    Code Search Combining Graph Embedding and Attention Mechanism
    HUANG Siyuan, ZHAO Yuhai, LIANG Yiming
    Journal of Frontiers of Computer Science and Technology    2022, 16 (4): 844-854.   DOI: 10.3778/j.issn.1673-9418.2010087

    The source code retrieval task uses a natural-language query to search a code base for relevant code fragments. Most code retrieval algorithms consider only the text sequence of a code snippet and ignore its structure, so they cannot fully capture the semantic and syntactic information the snippet contains. To improve the understanding of programming languages, a code retrieval algorithm (GraphCS) that combines an attention mechanism with graph embedding is proposed. In the feature extraction part, LSTM extracts the text feature vectors and Graph2Vec extracts the graph feature vectors. An attention mechanism is introduced in the feature fusion part to assign an appropriate weight to each feature, thereby improving the understanding of the program. Because source code and natural language are heterogeneous data, the code fragment features and the natural language features are mapped into the same vector space, and a ranking loss ensures that semantically similar points lie closer together in that space. To verify the effectiveness of the algorithm, it is compared with the best existing algorithm, CODEnn. Experimental results show that GraphCS improves on CODEnn in Precision@1/5/10, SuccessRate@1/5/10 and MRR.
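
    The ranking loss and two of the evaluation metrics named in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the hinge form of the loss, the use of cosine similarity, the margin value of 0.05, and all function names below are assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def ranking_loss(query, pos_code, neg_code, margin=0.05):
    """Hinge-style ranking loss (assumed form): zero once the matching snippet
    beats the non-matching one by at least `margin` in cosine similarity."""
    return max(0.0, margin - cosine(query, pos_code) + cosine(query, neg_code))

def success_rate_at_k(ranks, k):
    """Fraction of queries whose first correct result is ranked within the top k.
    `ranks` holds 1-based ranks; None means no correct result was returned."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)

def mrr(ranks):
    """Mean reciprocal rank; queries without a correct result contribute 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)

# Toy ranks of the first correct snippet for four queries (made-up values).
ranks = [1, 3, None, 2]
print(success_rate_at_k(ranks, 1), success_rate_at_k(ranks, 5), mrr(ranks))
```

    With these definitions the loss vanishes when the query embedding is already much closer to the matching snippet than to the non-matching one, which is the behavior the abstract describes for semantically similar points.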

    UCTB: Spatiotemporal Crowd Flow Prediction Toolbox
    CHEN Liyue, CHAI Di, WANG Leye
    Journal of Frontiers of Computer Science and Technology    2022, 16 (4): 835-843.   DOI: 10.3778/j.issn.1673-9418.2012072

    Spatiotemporal crowd flow prediction is one of the key technologies in smart cities. Two major shortcomings plague researchers and developers. First, crowd flow is affected by complex factors, and previous studies have summarized a variety of spatiotemporal prior knowledge; however, follow-up work can rarely incorporate this prior knowledge comprehensively because its application scenarios are diverse. Second, as deep learning technology develops, implementing state-of-the-art models is cumbersome and becomes more and more complicated. To fill these gaps, this paper designs time series sampling interfaces and graph construction interfaces. The time series sampling interfaces generate different types of time series based on diverse prior knowledge, and the graph construction interfaces build different types of spatial graphs. Users can also extend both interfaces to utilize new spatiotemporal prior knowledge. Based on the TensorFlow framework, this paper implements a variety of advanced spatiotemporal graph models and encapsulates the frequently used spatiotemporal modeling units as layers. Users can leverage state-of-the-art spatiotemporal models and perform customized development based on these layers. In summary, the spatiotemporal crowd flow prediction toolbox UCTB integrates diverse spatiotemporal prior knowledge and a variety of advanced models, which may promote the development of spatiotemporal crowd flow prediction applications. The code and detailed documentation are open source. The URL of UCTB is https://github.com/uctb/UCTB.
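
    To make the idea of a time series sampling interface concrete, here is a minimal Python sketch. It is not UCTB's actual API: the function name, its parameters, and the three views (recent steps, same slot on earlier days, same slot on earlier weeks) are assumptions chosen as one common example of temporal prior knowledge.

```python
def sample_history(series, t, closeness=3, period=2, trend=1, steps_per_day=24):
    """Collect three views of the history before index t (hypothetical helper).

    closeness: the most recent `closeness` steps
    period:    the same time slot on each of the previous `period` days
    trend:     the same time slot on each of the previous `trend` weeks
    """
    steps_per_week = 7 * steps_per_day
    needed = max(closeness, period * steps_per_day, trend * steps_per_week)
    if t - needed < 0:
        raise ValueError("not enough history before index t")
    return {
        "closeness": [series[t - i] for i in range(closeness, 0, -1)],
        "period": [series[t - i * steps_per_day] for i in range(period, 0, -1)],
        "trend": [series[t - i * steps_per_week] for i in range(trend, 0, -1)],
    }

# Eight days of hourly toy counts; each value equals its own index.
series = list(range(24 * 8))
views = sample_history(series, t=180)
print(views)
```

    Each view is returned oldest first, so a model can consume the three lists as separate input sequences.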

    Constructing Cache Algorithm for Flash by Leveraging Page Reconstruction and Data Temperature Recognition
    ZENG Xiangwei, DENG Yuhui
    Journal of Frontiers of Computer Science and Technology    2021, 15 (1): 84-95.   DOI: 10.3778/j.issn.1673-9418.2003014

    NAND flash based solid state disks (SSDs) perform better than magnetic disks and are gradually replacing hard disks in desktop systems. However, even though DRAM is embedded in an SSD as a buffer, the SSD may still show unstable write performance under continuous writing, because non-overwrite write and garbage collection (GC) operations are frequently triggered when physical pages are written. To address this problem, a new flash buffer management strategy called PRLRU (least recently used algorithm by page reconstruction) is proposed, which manages the buffer through a page reconstruction mechanism and a data temperature recognition mechanism. The page reconstruction mechanism combines pages whose valid data is less than one page in size with other pages and then writes them back to the flash memory, reducing the actual write operations by minimizing the number of non-overwrite write operations. The data temperature recognition mechanism marks each cached page with a temperature level and then writes the pages back in a predetermined priority order. Tests with real workloads show that PRLRU can effectively improve SSD performance and prolong SSD lifetime. Compared with LRU, BPLRU and 2QW-Clock, PRLRU increases write performance by 34.5%, 22.8% and 28.8% on average, respectively, increases read performance by 12.5%, 10.6% and 8.3% on average, respectively, and decreases the amount of garbage collection by 10.5%, 8.7% and 6.3% on average, respectively.
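
    The page reconstruction idea, merging partially valid cached pages so that fewer physical pages are written, can be sketched in Python as follows. The abstract does not describe how PRLRU chooses which pages to combine, so the first-fit decreasing packing policy, the sector granularity, and the function name are assumptions for illustration only.

```python
def reconstruct_pages(partial_pages, sectors_per_page=8):
    """Pack partially valid cached pages into as few flash pages as possible.

    partial_pages: dict mapping logical page id -> number of valid sectors.
    Returns a list of groups; each group would be written as one physical page.
    """
    groups = []  # each entry is [free_sectors, [page ids]]
    # First-fit decreasing (assumed policy): place larger fragments first.
    ordered = sorted(partial_pages.items(), key=lambda kv: (-kv[1], kv[0]))
    for page_id, valid in ordered:
        if not 0 < valid <= sectors_per_page:
            raise ValueError(f"page {page_id}: invalid sector count {valid}")
        for group in groups:
            if group[0] >= valid:
                group[0] -= valid
                group[1].append(page_id)
                break
        else:
            # No existing group has room, so open a new physical page.
            groups.append([sectors_per_page - valid, [page_id]])
    return [ids for _, ids in groups]

# Five dirty cached pages with made-up valid-sector counts.
dirty = {10: 3, 11: 5, 12: 2, 13: 6, 14: 8}
writes = reconstruct_pages(dirty)
print(writes)  # three physical page writes instead of five
```

    In this toy case five cached pages collapse into three page writes, which illustrates how combining fragments can reduce the number of write operations reaching the flash memory.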

    Distributed Storage Method for Equipment Data Based on Pre-partitioning Strategy
    GAO Jian, WEI Jun, XU Lijie, WANG Baolong, YANG Fuxue, HUANG Xiaofei
    Journal of Frontiers of Computer Science and Technology    2021, 15 (1): 96-108.   DOI: 10.3778/j.issn.1673-9418.2002019

    With the development of sensor technology and computer technology, equipment can produce a large amount of data during development and production. These data are massive, multi-source, and heterogeneous. Enterprises need to consider how to manage large amounts of equipment data effectively and how to use the processed data to enhance manufacturing capabilities. This paper studies the data of typical equipment, such as satellites and airplanes, and proposes a distributed data storage method based on a pre-partitioning strategy. By studying the pre-partitioning mechanism of HBase and the characteristics of equipment data, this paper identifies the factors that affect the rapid storage of equipment data and proposes a fast storage algorithm that stores large amounts of data into HBase quickly and in a balanced manner. Finally, this paper evaluates the data storage performance, the load balance, and the applicability to various types of equipment. The experimental results show that the method can be applied to many types of equipment and performs well in data storage efficiency.
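
    A minimal Python sketch of one common way to use pre-partitioning is given below: the table is split in advance at fixed key prefixes, and each row key starts with a hash bucket so that writes spread over all regions. The abstract does not specify the paper's key design, so the two-digit bucket prefix, the MD5-based bucketing, the key layout, and the function names are assumptions; no HBase client calls are made.

```python
import hashlib

def split_keys(num_regions):
    """Split points for a table pre-split into `num_regions` regions,
    assuming row keys begin with a two-digit bucket prefix (at most 100 regions)."""
    return [f"{i:02d}" for i in range(1, num_regions)]

def row_key(device_id, timestamp_ms, num_regions):
    """Prefix the key with a stable hash bucket so that writes from many
    devices are spread over the pre-created regions (hypothetical layout)."""
    digest = hashlib.md5(device_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % num_regions
    # Zero-padded timestamp keeps rows of one device in time order.
    return f"{bucket:02d}|{device_id}|{timestamp_ms:013d}"

keys = split_keys(4)
key = row_key("satellite-07", 1600000000000, 4)
print(keys, key)
```

    Because the bucket depends only on the device identifier, all records of one device land in the same region and remain sorted by time, while different devices are distributed across regions.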
