[1] LIN J, YU W, ZHANG N, et al. A survey on Internet of things: architecture, enabling technologies, security and privacy, and applications[J]. IEEE Internet of Things Journal, 2017, 4(5): 1125-1142.
[2] 王金永, 黄志球, 杨德艳, 等. 面向无人驾驶时空同步约束制导的安全强化学习[J]. 计算机研究与发展, 2021, 58(12): 2585-2603.
WANG J Y, HUANG Z Q, YANG D Y, et al. Spatio-clock synchronous constraint guided safe reinforcement learning for autonomous driving[J]. Journal of Computer Research and Development, 2021, 58(12): 2585-2603.
[3] GAON M, BRAFMAN R. Reinforcement learning with non-Markovian rewards[C]//Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI, 2020: 3980-3987.
[4] RONG J, LUAN N. Safe reinforcement learning with policy-guided planning for autonomous driving[C]//Proceedings of the 2020 IEEE International Conference on Mechatronics and Automation. Piscataway: IEEE, 2020: 320-326.
[5] TAN Y, VURAN M C, GODDARD S. Spatio-temporal event model for cyber-physical systems[C]//Proceedings of the 2009 29th IEEE International Conference on Distributed Computing Systems Workshops. Piscataway: IEEE, 2009: 44-50.
[6] CHOUHAN A P, BANDA G. Formal verification of heuristic autonomous intersection management using statistical model checking[J]. Sensors, 2020, 20(16): 4506.
[7] 祝义, 黄志球, 张广泉, 等. 一种支持实时软件时间建模的形式化方法[J]. 解放军理工大学学报(自然科学版), 2010(3): 274-278.
ZHU Y, HUANG Z Q, ZHANG G Q, et al. A formal method for time modeling of real-time software[J]. Journal of PLA University of Science and Technology (Natural Science Edition), 2010(3): 274-278.
[8] 陈小颖, 祝义, 赵宇, 等. 面向CPS时空约束的资源建模及其安全性验证方法[J]. 软件学报, 2022, 33(8): 2815-2838.
CHEN X Y, ZHU Y, ZHAO Y, et al. Modeling and safety verification method for CPS time and topology constrained resources[J]. Journal of Software, 2022, 33(8): 2815-2838.
[9] GARCIA J, FERNÁNDEZ F. A comprehensive survey on safe reinforcement learning[J]. Journal of Machine Learning Research, 2015, 16: 1437-1480.
[10] KADOTA Y, KURANO M, YASUDA M. Discounted Markov decision processes with utility constraints[J]. Computers & Mathematics with Applications, 2006, 51(2): 279-284.
[11] GEIBEL P. Reinforcement learning for MDPs with constraints[C]//Proceedings of the 17th European Conference on Machine Learning. Berlin, Heidelberg: Springer, 2006: 646-653.
[12] MOLDOVAN T M, ABBEEL P. Safe exploration in Markov decision processes[C]//Proceedings of the 29th International Conference on Machine Learning. Madison: Omnipress, 2012: 1451-1458.
[13] HEGER M. Consideration of risk in reinforcement learning[C]//Proceedings of the 11th International Conference on Machine Learning. San Mateo: Morgan Kaufmann, 1994: 105-111.
[14] TAMAR A, MANNOR S, XU H. Scaling up robust MDPs using function approximation[C]//Proceedings of the 31st International Conference on Machine Learning. Cambridge: MIT Press, 2014: 181-189.
[15] NILIM A, EL GHAOUI L. Robust control of Markov decision processes with uncertain transition matrices[J]. Operations Research, 2005, 53(5): 780-798.
[16] GAO Q, HAJINEZHAD D, ZHANG Y, et al. Reduced variance deep reinforcement learning with temporal logic specifications[C]//Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems. New York: ACM, 2019: 237-248.
[17] WANG C, LI Y, SMITH S L, et al. Continuous motion planning with temporal logic specifications using deep neural networks[EB/OL]. [2023-11-20]. https://arxiv.org/abs/2004.02610.
[18] WANG J, ZHANG Q, ZHAO D, et al. Lane change decision-making through deep reinforcement learning with rule-based constraints[C]//Proceedings of the 2019 International Joint Conference on Neural Networks. Piscataway: IEEE, 2019: 1-6.
[19] VAN HASSELT H, GUEZ A, SILVER D, et al. Deep reinforcement learning with double Q-learning[C]//Proceedings of the 30th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI, 2016: 2094-2100.
[20] KRASOWSKI H, WANG X, ALTHOFF M. Safe reinforcement learning for autonomous lane changing using set-based prediction[C]//Proceedings of the 2020 IEEE 23rd International Conference on Intelligent Transportation Systems. Piscataway: IEEE, 2020: 1-7.
[21] WACHI A, SUI Y. Safe reinforcement learning in constrained Markov decision processes[C]//Proceedings of the 37th International Conference on Machine Learning. New York: ACM, 2020: 9797-9806.
[22] LI T, LIU J, KANG J, et al. STSL: a novel spatio-temporal specification language for cyber-physical systems[C]//Proceedings of the 2020 IEEE 20th International Conference on Software Quality, Reliability and Security. Piscataway: IEEE, 2020: 309-319.
[23] SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. Cambridge: MIT Press, 1998.
[24] HOARE C A R. Communicating sequential processes[J]. Communications of the ACM, 1978, 21(8): 666-677.
[25] REED G M, ROSCOE A W. A timed model for communicating sequential processes[J]. Theoretical Computer Science, 1988, 58(1/2/3): 249-261.
[26] DAVIES J, SCHNEIDER S. A brief history of timed CSP[J]. Theoretical Computer Science, 1995, 138(2): 243-271.
[27] RANDELL D A, CUI Z, COHN A G, et al. A spatial logic based on regions and connection[C]//Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning. San Mateo: Morgan Kaufmann, 1992: 165-176.
[28] 陈小颖, 祝义, 赵宇, 等. 面向CPS时空性质验证的混成AADL建模与模型转换方法[J]. 软件学报, 2021, 32(6): 1779-1798.
CHEN X Y, ZHU Y, ZHAO Y, et al. Hybrid AADL modeling and model transformation for CPS time and space properties verification[J]. Journal of Software, 2021, 32(6): 1779-1798.
[29] SCHNEIDER S. An operational semantics for timed CSP[J]. Information and Computation, 1995, 116(2): 193-213.
[30] 祝义, 黄志球, 曹子宁, 等. 一种基于形式化规约生成软件体系结构模型的方法[J]. 软件学报, 2010, 21(11): 2738-2751.
ZHU Y, HUANG Z Q, CAO Z N, et al. Method for generating software architecture models from formal specifications[J]. Journal of Software, 2010, 21(11): 2738-2751.