Int J Performability Eng, 2020, Vol. 16, Issue 11: 1753-1761.

### A Reliability Management System for Network Systems using Deep Learning and Model Driven Approaches

Min Tao^a,*, Jiasheng Hao^a and Xin Jin^b

a. School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China;
b. School of Computer Science, University of Electronic Science and Technology of China, Chengdu, 611731, China
• Contact: *E-mail address: taomin@uestc.edu.cn

Abstract: Abnormal observations occur frequently in a network system and can seriously degrade its reliability. They are far more complicated than those in a traditional system because of the size and variety of the network system: an abnormal observation may be caused by any program, software, or application running in the system's resource pool. The causes of reliability loss are therefore highly diverse, and effectively guaranteeing the reliability of the network system becomes a critical challenge. In this paper, we present a reliability management system (RMS) that combines deep learning (DL) and model-driven approaches. The RMS gives the network system the ability to classify abnormal observations and to quantify and guarantee system reliability. It first uses DL to derive a classification model that selects a suitable repair action for an abnormal observation that has occurred. It also builds a reliability model to evaluate the reliability metric, and uses this model to update the metric after the repair action is applied. If the derived repair action proves unsuitable, the corresponding classification may be in error, and the RMS feeds an error back to the DL component to adjust the classification model. The proposed RMS is thus capable of AI-based anomaly diagnosis and model-driven reliability guarantee.
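The control loop outlined in the abstract (classify, repair, re-evaluate reliability, feed errors back) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the DL classifier is replaced by a simple per-symptom score table, the reliability model by a fixed lookup, and every name here (`ACTIONS`, `ToyClassifier`, `reliability_after`, `handle`, the symptom labels, and the 0.95 target) is a hypothetical chosen for illustration.

```python
from collections import defaultdict

# Hypothetical repair actions; the paper's abstract does not list any.
ACTIONS = ("restart_service", "migrate_vm", "reallocate_memory")


class ToyClassifier:
    """Stand-in for the DL classification model: a score table per symptom."""

    def __init__(self):
        self.scores = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

    def predict(self, symptom):
        # Pick the highest-scoring action; ties resolve in ACTIONS order.
        table = self.scores[symptom]
        return max(ACTIONS, key=lambda a: table[a])

    def feedback(self, symptom, action, error):
        # Penalize an action that did not restore reliability.
        self.scores[symptom][action] -= error


def reliability_after(symptom, action):
    """Assumed reliability model: metric in [0, 1] after applying a repair."""
    fixes = {"memory_leak": "reallocate_memory", "host_overload": "migrate_vm"}
    return 0.99 if fixes.get(symptom) == action else 0.80


def handle(classifier, symptom, target=0.95, max_rounds=5):
    """Classify, repair, re-evaluate; feed the shortfall back on failure."""
    reliability = 0.0
    for _ in range(max_rounds):
        action = classifier.predict(symptom)
        reliability = reliability_after(symptom, action)
        if reliability >= target:
            return action, reliability
        classifier.feedback(symptom, action, error=target - reliability)
    return None, reliability  # no suitable repair found within the budget


if __name__ == "__main__":
    clf = ToyClassifier()
    print(handle(clf, "memory_leak"))
    print(clf.predict("memory_leak"))
```

In this sketch the reliability model acts as the check on the classifier: a repair that leaves the metric below the target is treated as evidence of a misclassification, and the shortfall is used as the error signal. A real system would replace the score table with a trained network and the lookup with an analytical or simulation-based reliability model.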