Using HLS to Develop FPGA Heterogeneous Acceleration System: Problems, Optimization Methods and Opportunities

doi:10.3778/j.issn.1673-9418.2210102

Abstract

Field programmable gate arrays (FPGAs) are favored by both academia and industry for their programmability and excellent energy efficiency. However, the traditional FPGA development flow based on hardware description languages (HDLs) is hard to program. HDLs differ from the high-level languages that software developers normally use, which hinders those developers from exploiting FPGAs. High-level synthesis (HLS) lets developers design FPGA hardware directly in high-level languages such as C/C++; it is the preferred solution to this problem and has attracted wide attention. In recent years, many academic works on HLS have addressed the problems that arise when applying HLS and have improved the performance of systems developed with it. This paper takes the perspective of a heterogeneous system developer and lists feasible optimization directions for developing FPGA heterogeneous systems with HLS. At the compilation optimization level, HLS tools can automatically generate high-performance RTL designs by inserting compiler directives and using efficient design space exploration algorithms. At the memory access optimization level, HLS tools can set up buffers and split and replicate data to improve overall system bandwidth. At the parallel optimization level, HLS tools can implement statement-level, task-level, and board-level parallelism. Some techniques, such as domain-specific languages (DSLs), do not directly improve the performance of heterogeneous acceleration systems but do further improve the usability of HLS tools. Finally, this paper summarizes the challenges that HLS currently faces and outlines future research directions.

Key words: field programmable gate array (FPGA), high-level synthesis (HLS), heterogeneous system, high-level language, compilation optimization
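As a rough illustration of the three optimization levels named in the abstract, the sketch below shows how compiler directives, on-chip buffering with array partitioning, and statement-level parallelism might look in HLS-style C++. It is not taken from the paper: the kernel `scale_add`, the size `N`, and the factor 4 are hypothetical, and the directive spellings follow the Vivado HLS style, which can differ between tool versions. A standard C++ compiler ignores the `#pragma HLS` lines (possibly with a warning), so the function also runs as plain software.

```cpp
constexpr int N = 64;  // hypothetical problem size, chosen only for this sketch

// Hypothetical kernel: out[i] = a[i] * b[i] + c
void scale_add(const int a[N], const int b[N], int c, int out[N]) {
    // Memory access level: stage the inputs in local (on-chip) buffers and
    // split each buffer into 4 cyclic banks so that 4 elements can be read
    // in the same cycle.
    int buf_a[N];
    int buf_b[N];
#pragma HLS ARRAY_PARTITION variable=buf_a cyclic factor=4 dim=1
#pragma HLS ARRAY_PARTITION variable=buf_b cyclic factor=4 dim=1

    // Compilation level: a directive asks the tool to pipeline the load
    // loop with a target initiation interval of 1.
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        buf_a[i] = a[i];
        buf_b[i] = b[i];
    }

    // Parallel level: unrolling by 4 exposes statement-level parallelism
    // that matches the 4 memory banks created above.
    for (int i = 0; i < N; ++i) {
#pragma HLS UNROLL factor=4
        out[i] = buf_a[i] * buf_b[i] + c;
    }
}
```

The unroll factor is kept equal to the partition factor here on the assumption that each unrolled statement should get its own memory bank; picking such factors automatically is the kind of decision that the design space exploration algorithms mentioned in the abstract aim to make.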

XU Cheng, GUO Jinyang, LI Chao, WANG Jing, WANG Taolei, ZHAO Jieru. Using HLS to Develop FPGA Heterogeneous Acceleration System: Problems, Optimization Methods and Opportunities[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(8): 1729-1748.

