Computer Science and Information Systems 2024 Volume 21, Issue 4, Pages: 1335-1357
https://doi.org/10.2298/CSIS231129035C
Full text (495 KB)
A method for solving reconfiguration blueprints based on multi-agent reinforcement learning
Cheng Jing (School of Computer Science and Engineering, Xi’an Technological University, Xi’an, China), chengjing@xatu.edu.cn
Tan Wen (School of Computer Science and Engineering, Xi’an Technological University, Xi’an, China), tanwen@st.xatu.edu.cn
Lv Guangzhe (Xi’an Institute of Aeronautical Computing Technology (XICT), Xi’an, China), guangzhe lv@163.com
Li Guodong (Aviation Industry First Aircraft Design and Research Institute, Xi’an, China), guodonglee@.com
Zhang Wentao (School of Software, Northwestern Polytechnical University, Xi’an, China), wentaoz@gmail.com
Liu Zihao (Leihua Electronic Technology Research Institute of Aviation Industry Corporation of China (AVIC), Wuxi, China), 2020264414@mail.nwpu.edu.cn
Integrated modular avionics (IMA) systems achieve fault tolerance primarily by reconfiguring the system according to configuration blueprints. When blueprints are designed manually, their quality is affected by various unstable factors, introducing a degree of uncertainty. The effectiveness of a reconfiguration blueprint depends on several factors, including load balancing, the impact of reconfiguration, and the time the process requires, so generating high-quality reconfiguration blueprints can be regarded as a multi-objective optimization problem, for which traditional algorithms have limitations. Multi-Agent Reinforcement Learning (MARL) is an important branch of machine learning; through interaction and decision-making, it enables the accomplishment of complex tasks in dynamic real-world scenarios. Combining MARL algorithms with reconfiguration techniques and using MARL methods to generate blueprints can improve blueprint quality along multiple dimensions. In this paper, an improved Value-Decomposition Networks (VDN) algorithm based on the average sequential cumulative reward is proposed. By abstracting the characteristics of the IMA system, mathematical models are developed for both the system and the reconfiguration blueprint. The improved VDN algorithm demonstrates superior convergence and optimization performance compared with traditional reinforcement learning algorithms such as Q-learning, Deep Q-Network (DQN), and VDN, as confirmed through experiments involving single and continuous faults.
Keywords: Integrated modular avionics system, Multi-Agent Reinforcement Learning, reconfiguration blueprint, multi-objective optimization problem
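The mechanism at the core of the proposed approach, VDN-style value decomposition, can be illustrated on a toy cooperative task. The sketch below is a minimal, hypothetical illustration and not the paper's algorithm: two agents learn tabular utilities whose sum approximates the team reward (the additive mixing that defines VDN), and a single shared TD error trains both tables. The task, constants, and variable names are all assumptions for demonstration only.

```python
import random

# Toy cooperative game (hypothetical): two agents each pick action 0 or 1;
# the team earns reward 1 only when BOTH pick action 1, otherwise 0.
# VDN's core idea: the joint value is the SUM of per-agent utilities,
#     Q_tot(a1, a2) = Q1(a1) + Q2(a2),
# and one shared TD error updates every agent's table.

N_AGENTS, N_ACTIONS = 2, 2
ALPHA, EPSILON, EPISODES = 0.1, 0.3, 5000
random.seed(0)

# One small utility table per agent (stateless task, so indexed by action only).
q = [[0.0] * N_ACTIONS for _ in range(N_AGENTS)]

def team_reward(a1, a2):
    return 1.0 if a1 == 1 and a2 == 1 else 0.0

for _ in range(EPISODES):
    # Epsilon-greedy action selection per agent (decentralised execution).
    acts = [random.randrange(N_ACTIONS) if random.random() < EPSILON
            else max(range(N_ACTIONS), key=lambda a: q[i][a])
            for i in range(N_AGENTS)]
    r = team_reward(*acts)
    q_tot = sum(q[i][acts[i]] for i in range(N_AGENTS))  # additive mixing
    td = r - q_tot                     # one-step task, so no bootstrap term
    for i in range(N_AGENTS):          # the same TD error trains all agents
        q[i][acts[i]] += ALPHA * td

# Greedy policy after training; both agents should have learned action 1.
greedy = [max(range(N_ACTIONS), key=lambda a: q[i][a]) for i in range(N_AGENTS)]
print(greedy)
```

Because the gradient of the summed value with respect to each per-agent utility is 1, every agent receives the same learning signal while still acting on its own table; this is what lets VDN train cooperatively yet execute in a decentralised way.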